Open GemmaTuron opened 1 year ago
Hello @GemmaTuron. I have forked and cloned the repository. Here's are the few steps i have done in refactoring the model.
Issues#1 I'm encountering while trying to run predict.py using shell script run.sh
According to Ersilia docs, run.sh file should accept three and only three arguments, namely FRAMEWORK_DIR, DATA_FILE and OUTPUT_FILE. So, i'm passing these 3 arguments while running run.sh i.e.
bash run.sh $FRAMEWORK_PATH $FRAMEWORK_PATH/test.csv $FRAMEWORK_PATH/output.csv
But i encountered error
Traceback (most recent call last):
File "/home/zakia/ersilia/eos6ao8/model/framework//predict.py", line 12, in <module>
checkpoints_dir = sys.argv[3]
IndexError: list index out of range
Because the predict.py also required checkpoints path dir as an argument, for checking purpose i directly pass checkpoints_dir a path in predict.py code and it works.
But after that i stuck at another error i.e.
Traceback (most recent call last):
File "/home/zakia/ersilia/eos6ao8/model/framework//predict.py", line 30, in <module>
mols += [Chem.MolToInchi(Chem.MolFromSmiles(r[0]))]
File "/home/zakia/miniconda3/envs/ersilia/lib/python3.7/site-packages/rdkit/Chem/inchi.py", line 179, in MolToInchi
treatWarningAsError=treatWarningAsError)
File "/home/zakia/miniconda3/envs/ersilia/lib/python3.7/site-packages/rdkit/Chem/inchi.py", line 108, in MolToInchiAndAuxInfo
inchi, retcode, message, logs, aux = rdinchi.MolToInchi(mol, options)
Boost.Python.ArgumentError: Python argument types in
rdkit.Chem.rdinchi.MolToInchi(NoneType, str)
did not match C++ signature:
MolToInchi(RDKit::ROMol mol, std::string options='')
I have gone through different forums and what i get is this error is caused while parsing SMILES string. Someone said "When you have unreasonable valences in molecules, the RDKit generates an error and returns None for the molecule:"
Still now i didn't able to resolve that issue.
Able to resolve issue
did not match C++ signature:
MolToInchi(RDKit::ROMol mol, std::string options='')
Zakia,
Was the model working before you did any modification? Please test it before running see if the Rdkit parsing error persists
For the refactoring of the code:
IF the rdkit error persists, we need to add a check so that molecules for which rdkit cannot be parsed are skipped
Hello @GemmaTuron, I have Successfully run model eos6ao8 both locally and inside ersilia environment. The parsing error is caused by an empty entry in input file which i removed. Do i now open the pull request for it?
@GemmaTuron i opened a Pull request. Please review it.
Hi @ZakiaYahya If you check the actions of the repo you will see how the model test on push and the dockerfile upload subsequently, failed. Please have a look and work on it
thanks!
Hello @GemmaTuron, I looked into the github actions and it seems like that packages are not getting installed due to dependency conflicts from conda. But i have checked the dockerfile and i didn't change anything in dockerfile , i kept it same as in original repo and it is working absolutely fine on my side. But in dockerfile, dependancies are supposed to install using 'conda'. What you suggest @GemmaTuron, would i try it using 'pip' instead of 'conda'?? may be it resolves the issue.
Hi @ZakiaYahya
Yes, you should be able to reproduce the errors with the --repo_path flag and take it from there
@GemmaTuron, I'll check it again but before PR, i did check it with repo_path too and it works fine. Let me check it again with pip too. Thanks
Hi @GemmaTuron, i did check it again with ersilia -v fetch eos6ao8 --repo_path /home/zakia/ersilia/eos6ao8 > output.log 2>&1
and it fetch successfully at my side, here's the fetch log file
output.log
and it also predicts successfully, here's the output results
out.csv
I'll try to do it with pip.
@GemmaTuron I did two things: (1). I did all installations in a new conda environment and it works too both locally and in ersilia environment. (2). I have also tried installation using pip instead of conda but pip versions are little bit different than conda and that causes clashes between the dependancies.
Have you had a look at the actions to understand what is failing within Ersilia ?
Yes, i check the github actions and from that it seems like the dependancies mentioned in dockerfile are not installing properly, because push-test gave error "No module named torch". But @emmakodes test this model and it works well both on Colab and CLI. Can you please suggests something @GemmaTuron ?
Hi @ZakiaYahya
I've re run the action and it seems to work fine, so I don't know it probably had to do with updating the workflow files. I'll close this issue as completed and wait for confirmation of the model testing!
I am opening this again because the dockerhub build failed, probably similar to what @emmakodes had in his model. Please both of you can you work in this issue ?
@GemmaTuron yeh sure, we will try to work together on the issue.
Hi @ZakiaYahya
The Docker File points to an error in the torch install, probably due to the different builds. @emmakodes had a similar issue in model eos74bo
. Check out which torch version would work on ARM64
Hello @GemmaTuron Yeh, i checked it in build fail on ersilia that in fails at torch. After your message i checked it on different forums and it shows that pytorch works on ARM64 too. We (me, @emmakodes and @samuelmaina) have come up with solution if one build fails the other works, Do i open the PR for model eos6ao8 again with changes in dockerfile? What you say?
Have you investigated which version of pytorch will work in ARM64? You say it does work for other people, so it would be good to have both versions
Yes @GemmaTuron, pytorch version 1.8 is compatible with ARM64, here's the link " https://github.com/pytorch/pytorch/issues/53357"
Currently for this specific model, i've installed pytorch v1.4.0 in a conda environment. What you suggests, should i installed v1.8 for that purpose? I don't know whether this version works with other dependancies too.
Hi @ZakiaYahya
Have you tried? I don't know if v1.8 will work with the rest of the dependencies, please try it out and let us know!
Hello @GemmaTuron, okay i'll try that with version v1.9, because last time @emmakodes said his model is failing at ARM64 with torch v.18, so i'll try torch v1.9 which is also compatible with ARM64. If it will work other dependancies i'll push the changes and open PR for it.
Hello @GemmaTuron
(1). Firstly, I have installed torch v1.9 and it is working with other dependancies too, fortunately. The model is working with torch v1.9 both locally and inside ersilia with --repo_path. Should i open the PR again on this model? So that you can check it whether ARM64 build is successful or not with torch v1.9.
(2) Secondly, to make PR on this model again, should i change the upload to docker.yaml
file as i changed it "for
separate builds for AMD64 and ARM64". As, now i have made changes directly in eos-template
and
@miquelduranfrigola is testing, what you suggest @GemmaTuron.
Thanks.
Hi @ZakiaYahya ! For the moment, until we decide the new dockerfile upload file is complete, just use the old one, open a PR and we'll see if it works!
Hi @GemmaTuron I have change the docker workflow to old one and made PR on it, kindly check it. Thanks
Hello @GemmaTuron @DhanshreeA I just saw github actions and i'm got confused for two reasons
(1). I changes pytorch version to v1.9 in dockerfile but in Build and Push test
it is still showing v1.4 (defined in previous
PR) instead of v1.9.
(2) The ARM64 is not just failing at torch, all dependancies that are mentioned in dockerfile is not installing in case of ARM64, it shows following lines i just check it in detail
#9 612.8 PackagesNotFoundError: The following packages are not available from current channels:
#9 612.8 - rdkit=2019.09.3.0
#9 612.8 PackagesNotFoundError: The following packages are not available from current channels:
#9 612.8 - dgl=0.4.3post2
#9 612.8 PackagesNotFoundError: The following packages are not available from current channels:
#9 612.8 - dgllife=0.2.3
#9 612.8 PackagesNotFoundError: The following packages are not available from current channels:
#9 612.8 - pytorch=1.4.0
And in the end it is showing this
#9 810.3 import torch
#9 810.3 ModuleNotFoundError: No module named 'torch'
Although these packages are installed/added successfully in case of AMD64
Ok thanks @ZakiaYahya for the thorough checks! We'll discuss this in tomorrow's general meeting
Sure @GemmaTuron Thanks.
Hello @ZakiaYahya and @GemmaTuron
This model has been very useful for improving the workflows.
Let's see if this one worked: https://github.com/ersilia-os/eos6ao8/actions/runs/5323399829/jobs/9641179397
Hi @miquelduranfrigola Okay, i'm checking the actions.
Hello @miquelduranfrigola The build is successful
@GemmaTuron This model seems to be resolved. Please check
They are testing the latest changes on Docker and then the model will be finished
Hi! I have opened this task as an issue so we can add here the work done and discuss if there are issues!