Open ahariri13 opened 4 months ago
Dear @ahariri13, thank you for contacting us! Glad to hear you are using the tool.
I will be very busy for the next week and haven't had a chance to look closely at your code, but I know @claying has run some experiments on this dataset and would have some suggestions.
Best, Carlos
Hello! I'm still new to learning on proteins, and I was wondering how to train on the Structure Similarity Task (at least in an efficient manner) when using the graph format for PyTorch Geometric.
For loading the data, I am using the following lines:
My understanding is that we need to take two graph (protein) samples, embed them, and predict a regression value for their similarity. The PyG DataLoader batches all the dictionaries together, which is why I decided to have only the protein ID batched, and so I removed the ['protein']['ID'] part from the task's target function in structure_similarity.py. As a result, my model looks as follows:
And this is the evaluation function, where I have to use a for loop to append the ground-truth labels for the similarity values:
Of course, the training is taking too long, and I would appreciate any tips on how to use the ProteinShake package more efficiently for this task. Thanks a lot in advance!
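One possible way to cut the cost, sketched here under stated assumptions and not taken from the ProteinShake API: embed every protein in a batch once, then form all labelled pairs inside that batch by looking up their targets by ID. The `similarity` dictionary and the helper `pairs_in_batch` below are hypothetical names for illustration, and the IDs and values are toy data; how the real task stores its pairwise targets may differ.

```python
from itertools import combinations


def pairs_in_batch(batch_ids, similarity):
    """Collect index pairs and targets for all labelled pairs within one batch.

    batch_ids:  protein IDs in the order they appear in the batch.
    similarity: hypothetical dict mapping frozenset({id_a, id_b}) to a float
                target (an assumption, not the library's storage format).
    """
    idx_pairs, targets = [], []
    for i, j in combinations(range(len(batch_ids)), 2):
        key = frozenset((batch_ids[i], batch_ids[j]))
        if key in similarity:  # skip pairs that have no ground-truth value
            idx_pairs.append((i, j))
            targets.append(similarity[key])
    return idx_pairs, targets


# Toy data: IDs and similarity values are made up for illustration.
similarity = {
    frozenset(("P1", "P2")): 0.9,
    frozenset(("P1", "P3")): 0.2,
    frozenset(("P2", "P3")): 0.4,
}
batch_ids = ["P3", "P1", "P2"]

idx_pairs, targets = pairs_in_batch(batch_ids, similarity)
print(idx_pairs)  # [(0, 1), (0, 2), (1, 2)]
print(targets)    # [0.2, 0.4, 0.9]
```

With a batch of B graphs, the encoder then runs once per graph while the regression head can see up to B(B-1)/2 pairs: if `z` is the [B, d] matrix of graph embeddings, the rows selected by the first and second entries of `idx_pairs` can be fed to the head together with `targets`. The same lookup could replace the per-sample for loop in the evaluation function.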