OATML-Markslab / ProteinGym

Official repository for the ProteinGym benchmarks
MIT License

Update huggingface dataset #46

Open rmaguire31 opened 2 weeks ago

rmaguire31 commented 2 weeks ago

Hi

Really excellent work collating this benchmark (& all your excellent contributions to mutation effect prediction). My team and I have found the data really useful, and we're super excited to see the benchmark continue to grow. The data processing and collation of a wide range of assays is particularly useful.

Are you planning to add ProteinGym v1.0 as a new version on huggingface/datasets, in the same way as ProteinGym v0.1 (https://huggingface.co/datasets/OATML-Markslab/ProteinGym)? That is how we currently download ProteinGym, and we would prefer to keep using the huggingface datasets interface if possible.

Kind regards, Russell

pascalnotin commented 1 day ago

Hi @rmaguire31 -- thank you for the kind words!

We had pushed an updated version to HF, but it seems to have remained private for a while. I just made it public here. Please let me know if it works for you or if you have any suggestions for improvement.

Kind regards, Pascal