Open wjs20 opened 2 years ago
@wjs20, did you get any further, by any chance?
Unfortunately not.. did you have a stab at it?
@wjs20, a full broadsword thrust... I had to correct probably every module, e.g. the boolean arguments in Arguments.py are wrongly coded (very fun to ask for True and get False). I also realised the pipeline was not only hard-coded but failed outright on new training data, and after a long while and many edits for a memory leak (which I never fully fixed, but it sits in the training loop), I trained the model on the DIPS-Plus dataset and got above 0.8 AUROC. If you are after finding the PPI surface, give PeSTo a go (https://pesto.epfl.ch/). The website takes you to the paper, and the paper to the GitHub. That repo takes some work to figure out, but nowhere near as much as dMaSIF.
Yeah, I wasted a lot of time on this just trying to fix all the broken code. Thanks for the tip, I will try PeSTo.
@rubenalv Hi, may I ask where exactly you are saying the problem is? I can run dmasif_search, but the mean AUROC of the prediction results is only about 0.55. Have you found out what is wrong? Thank you very much.
Hi, I do not have access to my computer right now, but I can say there are tons of problems. For one, check your Arguments.py, because you may not even be running search: the argument is defined as a boolean, which is wrong (check the docs for the argparse library), so the parsed value is not always what you expect. You can insert a print(params) in the main script to confirm that search == True, and check the other booleans the same way. But I only recommend using dMaSIF if you have time to troubleshoot it; otherwise the MaSIF software is better documented and more mainstream.
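To illustrate the pitfall (this is not the repo's exact code, just standard argparse behaviour when a flag is declared with `type=bool`):

```python
import argparse

# Buggy pattern: with type=bool, any non-empty string parses as True,
# so "--search False" still gives True.
parser = argparse.ArgumentParser()
parser.add_argument("--search", type=bool, default=False)
print(parser.parse_args(["--search", "False"]).search)  # prints True (!)

# A common fix is an explicit string-to-bool converter:
def str2bool(v):
    if isinstance(v, bool):
        return v
    if v.lower() in ("true", "t", "1", "yes"):
        return True
    if v.lower() in ("false", "f", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean, got {v!r}")

fixed = argparse.ArgumentParser()
fixed.add_argument("--search", type=str2bool, default=False)
print(fixed.parse_args(["--search", "False"]).search)   # prints False
```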
@rubenalv Thank you very much for your reply. I checked the Arguments file, re-ran main_training and the inference .py file, printed the args, and it shows that search is True. But I found an error during the run: in the inference step, after computing the predictions, the code saves the results as prediction = P['iface_preds'], but the key 'iface_preds' is not generated when dMaSIF runs inference with search set to True, so I think there may be a problem there. I am still looking into it. I would be honored if you have time to discuss; my email is chenzhiyi22@mails.ucas.ac.cn. All the best!
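For reference, a hypothetical guard that would avoid the missing-key issue I describe (the dictionary key names here are assumptions, not necessarily those used by the repo):

```python
import numpy as np

def save_predictions(P, out_path):
    # Sketch only: read 'iface_preds' when it exists (site mode); in search mode
    # only embeddings are produced, so save those instead ("embedding_1" is an assumed key).
    if "iface_preds" in P:
        arr = P["iface_preds"].detach().cpu().numpy()
    else:
        arr = P["embedding_1"].detach().cpu().numpy()
    np.save(out_path, arr)
```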
@wjs20 Hi, have you managed to get the ROC-AUC higher? Thank you.
Hi orange2350. I haven't touched this repo in over a year. The code is full of bugs and very poorly written; I wouldn't trust any results that come out of it. The author seems to have abandoned it. I would give up on it if I were you. As @rubenalv said, MaSIF looks better maintained to me, so that is your best bet, I think.
@wjs20 Hi wjs20, thank you for your reply :)
@orange2350, on top of that, the output of dmasif --search is embeddings, not a binding probability like with --site, so you still need to figure out how to use those embeddings. If you have the time and skill, and really want dMaSIF over MaSIF or newer tools, expect two weeks to two months to sort out all the problems, and then you will probably want to retrain the network on a better dataset (https://www.nature.com/articles/s41597-023-02409-3). Don't take this as discouragement, not at all, but I think it is a realistic assessment of the state of the GitHub repo.
@rubenalv Hi, my work may be better carried out with dMaSIF, so I am prepared to spend a long time on it. Thank you, and thanks for the dataset you recommend! You may have forgotten, but you have helped me before with a dataset recommendation. Anyway, your reply makes me feel very warm on the difficult road of scientific research and exploration. Have a nice day!! :)
@orange2350, have you made any progress that you could share, by any chance? I am stuck at the point where I have the surface embeddings for protein pairs, but do not know how to implement the MLP (or something more up to date) that the paper suggests they use for assessing the interaction...
@rubenalv Hello, I previously reproduced dMaSIF's work and trained and evaluated it on my own dataset, but I have other work going on now, so I haven't touched this for a while. For the protein embeddings you mention, you take the dot product of the two embeddings, and the result of the dot product is the prediction of their interaction interface (the closer to 1, the better), but I have forgotten some of the specific details.
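A minimal sketch of that scoring step, assuming the per-point surface embeddings of the two proteins are already available as PyTorch tensors (the tensor names are illustrative, not from the repo):

```python
import torch

emb1 = torch.randn(1200, 16)   # surface-point embeddings of protein 1, shape [N1, D]
emb2 = torch.randn(950, 16)    # surface-point embeddings of protein 2, shape [N2, D]

# Pairwise dot products: high scores indicate point pairs predicted to be in contact.
scores = emb1 @ emb2.T                      # [N1, N2]

# A per-point interaction score for protein 1 can be taken as its best match on protein 2,
# which can then be compared against ground-truth interface labels (e.g. with roc_auc_score).
site_scores_1 = scores.max(dim=1).values    # [N1]
```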
Hi
I ran the command below (I had to fetch the pretrained model from the dMaSIF_colab repo) to test the site prediction accuracy, and the ROC-AUC averaged 0.62, not the 0.87 you state in the paper. How did you get that figure?
I also ran the same command but for search prediction, and I got an average ROC-AUC of ~0.5 (so essentially random).
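(For anyone trying to reproduce this: a rough sketch of the kind of per-protein evaluation I mean is below; the file names and array layout are placeholders for however you dump the per-point scores and labels to disk.)

```python
import numpy as np
from sklearn.metrics import roc_auc_score

aucs = []
# Placeholder list of (predictions, labels) file pairs, one per test protein.
for preds_file, labels_file in [("preds_1ABC.npy", "labels_1ABC.npy")]:
    scores = np.load(preds_file)    # per-surface-point predicted interface score
    labels = np.load(labels_file)   # per-surface-point ground-truth interface label (0/1)
    if labels.min() != labels.max():          # roc_auc_score needs both classes present
        aucs.append(roc_auc_score(labels, scores))

print(f"mean ROC-AUC over {len(aucs)} proteins: {np.mean(aucs):.3f}")
```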
It would be really helpful if you could explain how to use this tool to reproduce the results you report in the paper.
Thanks