Difference between two prediction results (ESP_prediction provided on GitHub and online server)

AlexanderKroll / ESP

Issue #13. Status: closed (Liujing19830425 closed 4 months ago)

Liujing19830425 commented 9 months ago

Dear Alex: I found that the two prediction methods (the ESP_prediction code provided on GitHub and the online server) give different results, and some of the differences are quite significant. Is this because they use different models? Which one is more reliable in this situation? As for ESP_prediction, could you provide the latest trained model?

Looking forward to your reply! Jing

maxall41 commented 5 months ago

Not the author, but I think the website uses the production model, which includes the test set in its training data. The training notebook contains a production model and a few other models for testing different descriptors. Which one are you using? Also, the model weights for the production model can be found here: https://github.com/AlexanderKroll/ESP/blob/main/data/model_weights/xgboost_model_production_mode.dat

Liujing19830425 commented 5 months ago

Dear Alex: First of all, thank you for your reply! The ESP_prediction program uses "xgboost_model_production_mode_gnn_esm1b_ts.dat", which the paper also mentions as improving performance. The results I get with this model differ from the website results, so I wonder what causes the difference.
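To make a discrepancy like this concrete, the two sets of scores can be compared pair by pair. The following is only a minimal sketch: the function name `compare_predictions`, the dictionary layout, the example scores, and the 0.5 decision threshold are all assumptions for illustration, not part of ESP_prediction or the web server.

```python
def compare_predictions(local, server, threshold=0.5):
    """Compare two {pair_id: score} dicts on the pairs present in both.

    `threshold` is an assumed cutoff for calling a pair a predicted
    substrate; adjust it to whatever cutoff you actually use.
    """
    shared = sorted(set(local) & set(server))
    if not shared:
        raise ValueError("no shared enzyme-substrate pairs to compare")

    # Absolute score difference for every shared pair.
    diffs = {pair: abs(local[pair] - server[pair]) for pair in shared}

    # Pairs whose binary call changes between the two sources.
    flipped = [
        pair for pair in shared
        if (local[pair] >= threshold) != (server[pair] >= threshold)
    ]

    return {
        "n_shared": len(shared),
        "mean_abs_diff": sum(diffs.values()) / len(shared),
        "max_abs_diff": max(diffs.values()),
        "flipped_pairs": flipped,
    }


# Hypothetical scores keyed by (enzyme ID, substrate ID); made-up values.
local_scores = {("enzyme_1", "substrate_A"): 0.91, ("enzyme_2", "substrate_B"): 0.42}
server_scores = {("enzyme_1", "substrate_A"): 0.85, ("enzyme_2", "substrate_B"): 0.63}

summary = compare_predictions(local_scores, server_scores)
for key, value in summary.items():
    print(key, value)
```

Listing the flipped pairs shows how many differences actually change the predicted label, rather than only shifting the score.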

Looking forward to your reply! Jing

maxall41 commented 5 months ago

I am not the author of the paper. It is interesting that xgboost_model_production_mode_gnn_esm1b_ts.dat provides different results than the web server. Maybe @AlexanderKroll knows why this is happening?

Thanks, Max

AlexanderKroll commented 4 months ago

Hi, sorry for my late response. As you can read on the homepage of the webserver, the predictions on the webserver are made with our new ProSmith model that achieves superior prediction results.

Liujing19830425 commented 4 months ago

Dear Alex: First of all, thank you for your reply!

Would it be possible to share this new ProSmith model with us?

Thanks, Jing

AlexanderKroll commented 4 months ago

Dear Jing,

you can find the new ProSmith model for the enzyme-substrate prediction task here: https://github.com/AlexanderKroll/ESP_ProSmith

Liujing19830425 commented 4 months ago

Thanks