scikit-adaptation / skada

Domain adaptation toolbox compatible with scikit-learn and pytorch
https://scikit-adaptation.github.io/
BSD 3-Clause "New" or "Revised" License

[MRG] Implementation of 1NN reweighting and reweighting example implementation #108

Closed BuenoRuben closed 4 months ago

BuenoRuben commented 5 months ago

Based on this paper: https://arxiv.org/pdf/2102.02291.pdf
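
For readers unfamiliar with the idea, here is a minimal sketch of one common form of 1NN reweighting: each source sample is weighted by the number of target samples whose nearest source neighbor it is. This is an illustration only, not the code merged in this PR. The function name `nn_reweight` and the `laplace_smoothing` flag are hypothetical, and whether the PR or the paper uses exactly this counting rule is an assumption.

```python
import numpy as np


def nn_reweight(X_source, X_target, laplace_smoothing=True):
    """Hypothetical helper: 1NN-based importance weights for source samples.

    Each source sample receives a weight equal to the number of target
    samples that have it as their nearest source neighbor (Euclidean).
    """
    X_source = np.asarray(X_source, dtype=float)
    X_target = np.asarray(X_target, dtype=float)

    # Pairwise squared distances, shape (n_target, n_source).
    diff = X_target[:, None, :] - X_source[None, :, :]
    sq_dists = np.sum(diff ** 2, axis=-1)

    # Index of the nearest source sample for every target sample.
    nearest = np.argmin(sq_dists, axis=1)

    # Count how often each source sample is selected.
    counts = np.bincount(nearest, minlength=X_source.shape[0]).astype(float)

    # Assumption: add-one smoothing so that no source sample gets weight 0.
    if laplace_smoothing:
        counts += 1.0
    return counts


X_source = [[0.0], [1.0], [5.0]]
X_target = [[0.9], [1.1], [1.2], [4.0]]
print(nn_reweight(X_source, X_target))  # -> [1. 4. 2.]
```

The resulting weights could then be passed as `sample_weight` to any scikit-learn estimator whose `fit` accepts it.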

codecov[bot] commented 5 months ago

Codecov Report

Merging #108 (2a68953) into main (cd7647e) will decrease coverage by 0.02%. The diff coverage is 96.92%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #108      +/-   ##
==========================================
- Coverage   97.49%   97.47%   -0.02%
==========================================
  Files          47       47
  Lines        4117     4165      +48
==========================================
+ Hits         4014     4060      +46
- Misses        103      105       +2
```

BuenoRuben commented 5 months ago

@antoinedemathelin I added KMM to the reweighting methods' example and noticed the following fit times:

- n_training, n_source = 20, 20: fitted almost instantly
- n_training, n_source = 30, 30: takes some time
- n_training, n_source = 50, 50: takes about 2 minutes, while the other methods take less than 5 seconds

I don't know whether your method inherently has high complexity, so I thought I should let you know.

antoinedemathelin commented 5 months ago

Hi @BuenoRuben, yes, that's right: the QP solver used in KMM is pretty slow when n_source increases (note that n_source = 50 corresponds to 400 samples, I think). This can be sped up by using cvxopt, which we will add as an optional solver. I also plan to implement the Frank-Wolfe algorithm for KMM, which really speeds up the optimization.

As a first workaround, maybe we can reduce the default max_iter from 1000 to 100. The algorithm will not fully converge, but it will be faster...
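
To illustrate the Frank-Wolfe idea and the `max_iter` trade-off mentioned above, here is a rough sketch under simplifying assumptions. It minimizes the usual KMM quadratic objective `0.5 * w @ K @ w - kappa @ w` over the box `0 <= w <= B` only; the constraint on the sum of the weights that KMM normally includes is deliberately dropped to keep the linear step trivial. The names `rbf_kernel`, `kmm_frank_wolfe`, `B`, `gamma` and `tol` are hypothetical and do not reflect skada's API or the planned implementation.

```python
import numpy as np


def rbf_kernel(A, B_mat, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B_mat."""
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B_mat ** 2, axis=1)[None, :]
        - 2.0 * A @ B_mat.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kmm_frank_wolfe(X_source, X_target, B=10.0, gamma=1.0, max_iter=100, tol=1e-6):
    """Sketch: box-constrained KMM weights via Frank-Wolfe (assumes B >= 1)."""
    Xs = np.asarray(X_source, dtype=float)
    Xt = np.asarray(X_target, dtype=float)
    n_s, n_t = Xs.shape[0], Xt.shape[0]

    K = rbf_kernel(Xs, Xs, gamma)
    kappa = (n_s / n_t) * rbf_kernel(Xs, Xt, gamma).sum(axis=1)

    w = np.ones(n_s)  # feasible starting point inside the box
    for _ in range(max_iter):
        grad = K @ w - kappa
        # Linear minimization over the box: pick the vertex minimizing grad @ s.
        s = np.where(grad < 0, B, 0.0)
        d = s - w
        gap = -grad @ d  # Frank-Wolfe duality gap, always >= 0
        if gap <= tol:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1].
        curvature = d @ K @ d
        step = 1.0 if curvature <= 0 else min(1.0, gap / curvature)
        w = w + step * d
    return w


rng = np.random.default_rng(0)
X_source = rng.normal(0.0, 1.0, size=(50, 2))
X_target = rng.normal(0.5, 1.0, size=(40, 2))
weights = kmm_frank_wolfe(X_source, X_target, max_iter=100)
```

Each iteration costs one matrix-vector product with `K`, so a smaller `max_iter` trades accuracy for speed in a predictable way: the loop stops early with weights that are feasible but not fully converged.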

rflamary commented 4 months ago

Hello @BuenoRuben, sorry for this infinite PR ;) Could you also rename KMMAdapter to KMMReweightAdapter, and do the same for KLIEP?

After that it seems everything will be OK

BuenoRuben commented 4 months ago

Should I then rename KMM to KMMReweight?