BorgwardtLab / proteinshake

Protein structure datasets for machine learning.
https://proteinshake.ai
BSD 3-Clause "New" or "Revised" License

Pfam labels #254

Open timkucera opened 1 year ago

timkucera commented 1 year ago

For Pfam, some labels in the validation/test sets are not present in the training set.
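
One way to check for this is to compare the label sets of the splits directly. This is a minimal sketch: the helper `unseen_labels` and the toy splits are hypothetical and do not use the proteinshake API. It assumes each protein's label is either a single Pfam accession string or a list of them.

```python
def unseen_labels(train_labels, eval_labels):
    """Return the labels that occur in eval_labels but never in train_labels."""
    def flatten(labels):
        # Accept either one accession string or a list of accessions per protein.
        seen = set()
        for entry in labels:
            if isinstance(entry, str):
                seen.add(entry)
            else:
                seen.update(entry)
        return seen

    return sorted(flatten(eval_labels) - flatten(train_labels))


# Hypothetical toy splits, not taken from the actual dataset.
train = [["PF00001"], ["PF02324", "PF19127"]]
test = [["PF00001"], ["PF99999"]]

missing = unseen_labels(train, test)
print(missing)
```

A non-empty result means that a classifier trained on the training split cannot predict those labels.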

mahdip72 commented 10 months ago

I have the same problem. @timkucera, I have a question: the paper says that family classification is a multi-class task, but some proteins have multiple Pfam annotations. For example, I have the following labels for one protein:

['PF02324', 'PF19127']

Could you explain which annotation I should consider? Or is it actually a multi-label task?
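
To make the difference concrete, here is a rough sketch of how the two readings would encode the same protein. The helpers `build_vocab` and `multi_hot` and the toy annotations are hypothetical, and taking the first accession for the multi-class case is an arbitrary choice for illustration, not something the paper or the library prescribes.

```python
def build_vocab(annotations):
    """Map every Pfam accession seen in the annotations to a column index."""
    labels = sorted({label for entry in annotations for label in entry})
    return {label: index for index, label in enumerate(labels)}


def multi_hot(labels, vocab):
    """Multi-label reading: one binary target per Pfam family."""
    vector = [0] * len(vocab)
    for label in labels:
        vector[vocab[label]] = 1
    return vector


# Hypothetical toy annotations; the first entry is the protein from my example.
annotations = [["PF02324", "PF19127"], ["PF00001"]]
vocab = build_vocab(annotations)

# Multi-label target: both families are marked as present.
multi_label_target = multi_hot(annotations[0], vocab)

# Multi-class target: exactly one class index, here arbitrarily the first accession.
multi_class_target = vocab[annotations[0][0]]
```

With the multi-class reading, the second annotation (`PF19127`) is simply dropped, which is why I am asking which one is intended.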