MoreiraLAB / synpred

Full Machine Learning Pipeline for the Synpred prediction and Standalone deployment
GNU General Public License v3.0

Challenges in reproducing results #1

Open kieron15 opened 10 months ago

kieron15 commented 10 months ago

I downloaded the pretrained models from http://www.moreiralab.com/resources/synpred and tried to generate the pipeline outputs using the command: python standalone_deploy_model.py standalone_example.csv. I encountered the following challenges:

  1. The file names of the model checkpoints used in the script differ from those in the downloaded models, especially for the DL models.
  2. No model checkpoints for CSS scores are provided in the directory downloaded from http://www.moreiralab.com/resources/synpred
  3. After checking the commit history and correcting the model checkpoint file names in the code (standalone_variables.py), I got the code to work for all synergy scores except CSS. However, the results for the same example file obtained from the webserver at http://www.moreiralab.com/resources/synpred/ are vastly different from those produced by the pipeline in this git repo.
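For anyone hitting the mismatches in items 1 and 2, one way to see which checkpoint files are absent is to compare the names the script expects against the downloaded directory. This is only a minimal sketch: the function name `missing_checkpoints`, the directory name and the file names in the usage comment are hypothetical placeholders, not the actual names used in standalone_variables.py.

```python
from pathlib import Path


def missing_checkpoints(resources_dir, expected_names):
    """Return the expected file names that are absent from resources_dir.

    The search is recursive, so checkpoints stored in subfolders are found too.
    """
    present = {p.name for p in Path(resources_dir).rglob("*") if p.is_file()}
    return sorted(name for name in expected_names if name not in present)


# Hypothetical usage: replace the directory and names with the ones
# your copy of standalone_variables.py actually refers to.
# print(missing_checkpoints("resources", ["model_a.h5", "model_b.pkl"]))
```

An empty list means every expected file was found somewhere under the directory; it says nothing about whether the files are the right version.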

Can you please update the code in this git repo to the final working version and provide the pretrained model checkpoints with the best results for all synergy scores?

AJPreto commented 10 months ago

Hello! Thank you for helping us improve our software and bring it up to higher reproducibility standards! While addressing your concerns, we found that we had forgotten to update the zip file between versions. This accounts for two of the issues you mentioned: CSS scores were not present before, and the other models had undergone further changes, which would modify the results. We have now updated the file with the final models, which can be downloaded from the same location as before. We hope this change allows you to proceed, and thank you again for your interest. Best regards, António

kieron15 commented 10 months ago

Hi @AJPreto, thank you for looking into this issue. I downloaded the updated final models from the provided link. With this resources directory, the code in git works without any changes and also provides CSS scores along with the other scores. However, the outputs still differ from those obtained using the webserver for all the synergy scores. I tested using the same example file, given as standalone_example.csv in the git repo.

Can you please check whether the code or the model checkpoints need to be modified to reproduce the results from the webserver, assuming that the webserver results are the final, reliable ones?

AJPreto commented 7 months ago

Hello @kieron15, apologies for the delay. What are the main differences you are seeing? Are there significant changes in the results per sample?
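To make the per-sample question concrete, the standalone output and the webserver output could be compared row by row. The following is a minimal sketch under stated assumptions: both outputs are saved as CSV files, and the column names used in the usage comment (Drug1, Drug2, Cell, ZIP, CSS) and the file names are hypothetical and should be replaced with whatever the real files contain. The helper names `read_rows` and `compare_predictions` are illustrative only.

```python
import csv


def read_rows(path):
    """Load a CSV file as a list of dicts keyed by the header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def compare_predictions(local_rows, web_rows, key_columns, score_columns, tol=1e-6):
    """Return (key, column, local, web, abs_diff) for each score differing by more than tol.

    Rows are matched on key_columns; samples absent from web_rows are skipped.
    """
    web_by_key = {tuple(r[k] for k in key_columns): r for r in web_rows}
    diffs = []
    for row in local_rows:
        key = tuple(row[k] for k in key_columns)
        other = web_by_key.get(key)
        if other is None:
            continue  # sample not present in the webserver output
        for col in score_columns:
            local_value, web_value = float(row[col]), float(other[col])
            if abs(local_value - web_value) > tol:
                diffs.append((key, col, local_value, web_value, abs(local_value - web_value)))
    return diffs


# Hypothetical usage (file and column names are assumptions):
# diffs = compare_predictions(
#     read_rows("standalone_output.csv"),
#     read_rows("webserver_output.csv"),
#     key_columns=["Drug1", "Drug2", "Cell"],
#     score_columns=["ZIP", "CSS"],
# )
```

Sorting the returned tuples by their last element would show which samples and scores disagree the most.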