OpenBioML / protein-lm-scaling

Other
55 stars 14 forks source link

Support MuParametrization and MuTransfer #64

Open NZ99 opened 9 months ago

NZ99 commented 9 months ago

This PR adds initial modeling-related changes to support MuP. It is incomplete, as the interaction between some modeling-related techniques is not yet clear to me (e.g. do we have to include any modifications as to make MuP compatible with AliBi?) but is shared for now as to facilitate discussion and collaboration, as agreed during community project meetings.

I'm uploading my code as is after leaving for NeurIPS -- I'll be checking whether anything important is missing or needs fixing in the code so far over the holidays. Apologies in case there are indeed issues with the code as pushed.

Important TODOs:

NZ99 commented 9 months ago

Hey @othertea, thank you very much for these comments, they are really appreciated! Also sorry for the late reply, being sick with COVID over the holidays I have only got

Also, thank you very much for the detailed inline comments! I will take a look soon.

NZ99 commented 8 months ago

Made a mistake and pushed from an older account. Anyway, I tried ezmup, but without much success. It runs into errors because it doesn't have support for embedding biases and transformers Conv1Ds out of the box, but even when fixing that it still errors out. Pity because it would have been an easy way to add support for MuP. I will keep testing it. In the meantime, I pushed some updates based on your extremely helpful review. I've kept organization as is for now, but happy to refactor depending on your (and other folks') preference. Re: MuReadout, there is also a MuSharedReadout (see also the dedicated comment mentioned above). I also want to add more documentation w.r.t. configuration options. Next up is the coordinate checking, looking into that today.

NZ99 commented 8 months ago

Added the mup refactor and the coordinate checking integration discussed previously on Discord. You can find a colab showcasing mup coordinate checking results on the model here.

NZ99 commented 8 months ago

Given that MuP now passes coordinate checking in my tests, maybe we can review (cc @othertea? would love to have a second pair of eyes given that I really don't trust myself much) and consider merging, with follow up work going to a to-be-opened second PR for optuna.

Let me know what you think :)