mmatena / m251


First paper experiment plans #14

Open mmatena opened 3 years ago

mmatena commented 3 years ago

The idea is that I'll add a comment with a motivation and description for each experiment I plan to do. I'll keep adding details to the comment as the experiment gets more fleshed out. When I actually run an experiment, I'll add details and results of what I actually did to issue #13.

mmatena commented 3 years ago

Impact of fine-tuning steps on merging performance

Considerations

Summary

mmatena commented 3 years ago

Impact of number of examples used to compute the fine-tuned Fisher

Considerations

mmatena commented 3 years ago

"Catastrophic remembering"

More details to be added, but the idea is this: say we are given model 1, which is trained on task A. Then we train model 1 on task B to get model 2, which catastrophically forgets task A. We merge model 1 with model 2 to see whether the merged model can do well on both tasks. Traditional EWC would be a baseline. A rough sketch of the merging operation is below.
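For concreteness, here is a minimal sketch of the merging operation these experiments rely on: a diagonal Fisher-weighted average of two checkpoints. Function and argument names are illustrative, not the repo's actual API.

```python
import numpy as np

def fisher_merge(params_1, fisher_1, params_2, fisher_2, weight=0.5, eps=1e-8):
    """Merge two models by diagonal Fisher-weighted averaging.

    params_*: dict mapping parameter name -> np.ndarray of parameter values.
    fisher_*: dict mapping parameter name -> np.ndarray of diagonal Fisher
        estimates with the same shapes as the parameters.
    weight: scalar importance placed on model 1 (1 - weight goes to model 2).
    """
    merged = {}
    for name in params_1:
        f1 = weight * fisher_1[name]
        f2 = (1.0 - weight) * fisher_2[name]
        # Per-parameter weighted average; eps guards against zero Fisher values.
        merged[name] = (f1 * params_1[name] + f2 * params_2[name]) / (f1 + f2 + eps)
    return merged
```

EWC, by contrast, would keep training model 1 on task B while penalizing movement away from model 1's parameters weighted by model 1's Fisher, so both approaches use the same Fisher information but at different points in the pipeline.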

mmatena commented 3 years ago

Impact of number of examples and y_samples used to compute the pretrained (e.g. MLM) Fisher

Considerations
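For reference, here is a rough sketch of estimating a diagonal Fisher from unlabeled examples with multiple sampled labels per example. It assumes PyTorch and a generic classifier returning logits; the n_y_samples argument stands in for the y_samples mentioned above and is not the repo's API.

```python
import torch

def estimate_diagonal_fisher(model, examples, n_y_samples=1):
    """Estimate a diagonal Fisher by sampling labels from the model's own predictions.

    For each example, n_y_samples labels are drawn from the model's predictive
    distribution and the squared gradients of the log-likelihood are accumulated.
    """
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_total = 0
    for x in examples:
        log_probs = torch.log_softmax(model(x.unsqueeze(0)), dim=-1)
        probs = log_probs.exp().detach()
        for _ in range(n_y_samples):
            y = torch.multinomial(probs[0], num_samples=1).item()
            model.zero_grad()
            (-log_probs[0, y]).backward(retain_graph=True)
            for n, p in model.named_parameters():
                if p.grad is not None:
                    fisher[n] += p.grad.detach() ** 2
            n_total += 1
    return {n: f / max(n_total, 1) for n, f in fisher.items()}
```

The experiment would then vary both the number of examples and n_y_samples and measure how merging performance changes.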

mmatena commented 3 years ago

Automatic task weighting selection

TODO

mmatena commented 3 years ago

Compare robust Fisher computation from Task2Vec to direct computation

TODO

mmatena commented 3 years ago

Asynchronous distributed learning

More details to be added, but the idea is that we partition the training set into N disjoint subsets. Then we train a model on each partition separately and merge the resulting models to cheaply combine their work. There could be multiple rounds of training followed by merging. Using the empirical Fisher and computing it online during training would make this more efficient; a rough sketch is included below.

This federated learning paper might be relevant.
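As a sketch of computing the empirical Fisher online during training, maintaining a running diagonal Fisher inside an ordinary training loop could look like the following (PyTorch, with an exponential moving average; all names and the decay value are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def train_with_online_empirical_fisher(model, data_loader, optimizer, decay=0.99):
    """Train normally while maintaining a running diagonal empirical Fisher.

    The empirical Fisher reuses the gradients of the training loss on the observed
    labels, so no extra backward passes are required beyond normal training.
    """
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        # Accumulate squared gradients before the optimizer consumes them.
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] = decay * fisher[n] + (1.0 - decay) * p.grad.detach() ** 2
        optimizer.step()
    return fisher
```

Each of the N workers would train on its own partition this way and return its parameters together with its Fisher; the merge step would then Fisher-weighted-average across all N models before optionally starting another round of training.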

mmatena commented 3 years ago

Layer-specific merging proportions

TODO