songlab-cal / tape

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.
https://www.biorxiv.org/content/10.1101/676825v1
BSD 3-Clause "New" or "Revised" License

Data licenses #56

Open tdiethe opened 4 years ago

tdiethe commented 4 years ago

Would you be able to add licenses for the datasets? In particular, the pre-trained models may themselves require licenses, depending on the dataset(s) they were trained on.

rmrao commented 4 years ago

The models are trained on Pfam, which is released under the LGPL. I'm not sure about the licenses for the other datasets. Academically, the appropriate thing to do is to cite the corresponding paper for each dataset.