Closed GemmaTuron closed 1 year ago
Hi @HellenNamulinda, this was on the Airtable side, sorry, nothing we could do about it. I think service is restored. I'll rerun the workflow.
@HellenNamulinda I have updated the workflows but the model is failing at fetch time, could you please check? Thanks!
Hi @GemmaTuron,
The error I was getting (`Model API eos43at:run did not produce an output`, `DGL does not detect a valid backend option. Which backend would you like to work with?`) was caused by the wrong version I had selected for the Deep Graph Library (dgl).
I had put `RUN pip install dgl==0.4.3` instead of `RUN pip install dgl==0.4.3.post2`. While this worked locally, it failed in Actions (not sure, but it could be because of root permissions).
According to the release notes for 0.4.3.post2, they rolled back the interactive backend selection because automation was crashing. (This reverts to the previous behavior of assuming PyTorch when a backend is not given.)
However, 0.4.3.post2 is too old to be installed on arch linux (`ERROR: Could not find a version that satisfies the requirement dgl==0.4.3.post2 (from versions: 1.0.1, 1.1.0, 1.1.1)`). Unfortunately, choosing from these versions doesn't work for the model code, as new errors are introduced (`ImportError: cannot import name 'bipartite' from 'dgl'`).
So, there won't be a docker image for arm64.
A PR has been made for this update.
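As a side note to the backend prompt described above, here is a minimal Python sketch of an alternative to relying on DGL's default: setting the `DGLBACKEND` environment variable before `dgl` is first imported. `ensure_dgl_backend` is a hypothetical helper name, not something from the model code, and I have not verified that this avoids the prompt on plain `dgl==0.4.3`, so treat it as an assumption rather than a confirmed fix.

```python
import os


def ensure_dgl_backend(default="pytorch"):
    """Set DGLBACKEND if it is not already set, and return the active value.

    DGL reads this environment variable when it is imported, so this
    helper (hypothetical, for illustration) must run before `import dgl`.
    """
    return os.environ.setdefault("DGLBACKEND", default)


if __name__ == "__main__":
    backend = ensure_dgl_backend()
    print(f"DGLBACKEND={backend}")
    # import dgl  # import only after the variable has been set
```

Using `setdefault` keeps any backend that the environment (for example a Dockerfile `ENV` line) has already chosen, and only fills in PyTorch when nothing is set.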
Thanks for the update Hellen, good job on the versioning issues. We'll go ahead without the ARM64 version then
Hello @GemmaTuron, this model returns the pIC50 value, which measures the cardiotoxicity of small molecules (IC50 in hERG blockade). The pIC50 predictions made by the MPNN model (`MPNNPredictor`) in this code are primarily determined by the learned parameters of the model and the input features, including global features computed using RDKit functions.
While the model was trained using rdkit 2019, the RDKit version and the number of descriptors it provides do not impact the predictions, since the descriptors are not used explicitly in the model. The RDKit functions used in this code are mainly for molecular data handling, preprocessing, and calculation of global features (net_utils.py) such as molecular weight (MolWt), topological polar surface area (CalcTPSA), the logarithm of the partition coefficient, LogP (MolLogP), and the number of hydrogen bond donors (NumHDonors). These values become part of the input features (represented as DGL graphs) of the MPNN model.
With a test file, test.csv, the predictions are the same for `rdkit 2019.09.3`, Out22.csv (also on Colab), and `rdkit 2022.9.5`, out19.csv.
Let me tag @pittmanriley and @febielin so we can be consistent with the MolGrad models.
Since rdkit 2019 is not installable using pip, I updated it to rdkit 2022. This, plus the other changes, works using run.sh and within ersilia (43at_cli_output2.csv).
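The check that predictions match across the two RDKit versions could be scripted roughly as follows. This is only a sketch: the column names `smiles` and `pic50` are assumptions (the real headers of the output files are not shown here), and the rows in the demo are made up, not real predictions.

```python
import csv
import io
import math


def read_predictions(handle, key_col, value_col):
    """Map each molecule identifier to its predicted value."""
    reader = csv.DictReader(handle)
    return {row[key_col]: float(row[value_col]) for row in reader}


def compare_predictions(a, b, abs_tol=1e-6):
    """Return keys whose predictions differ or that appear in only one file."""
    mismatched = []
    for key in sorted(set(a) | set(b)):
        if key not in a or key not in b:
            mismatched.append(key)
        elif not math.isclose(a[key], b[key], rel_tol=0.0, abs_tol=abs_tol):
            mismatched.append(key)
    return mismatched


if __name__ == "__main__":
    # Made-up rows standing in for the two output files (hypothetical headers).
    old = io.StringIO("smiles,pic50\nCCO,5.10\nc1ccccc1,4.75\n")
    new = io.StringIO("smiles,pic50\nCCO,5.10\nc1ccccc1,4.75\n")
    diff = compare_predictions(
        read_predictions(old, "smiles", "pic50"),
        read_predictions(new, "smiles", "pic50"),
    )
    print("mismatched rows:", diff)
```

To run it on real files, replace the `io.StringIO` objects with `open(path, newline="")` handles. The absolute tolerance is there because floating-point output can differ in the last digits between environments even when the model is unchanged.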
Other changes made
These changes are reflected in the PR created. However, the Model Test on PR failed with `Model API eos43at:run did not produce an output`, `DGL does not detect a valid backend option. Which backend would you like to work with?`, yet all packages install successfully (`Successfully installed torch-1.9.0`).
What I observed is that packages were installed twice, for instance rdkit at `Collecting rdkit==2022.9.5` and `Collecting rdkit-pypi`. It looks like some issue with ersilia is causing this.
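One way to confirm the double installation from inside the environment is to list the installed distributions and see whether both names are present. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes `rdkit` and `rdkit-pypi` are the two distribution names to look for, as seen in the log lines above.

```python
from importlib import metadata


def installed_names():
    """Return the lower-cased names of all installed distributions."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.add(name.lower())
    return names


def overlapping(installed, candidates=("rdkit", "rdkit-pypi")):
    """Return the candidates found, but only if more than one is installed."""
    found = sorted(name for name in candidates if name in installed)
    return found if len(found) > 1 else []


if __name__ == "__main__":
    clash = overlapping(installed_names())
    print("overlapping installs:", clash)
```

An empty list means at most one of the two distributions is installed; a non-empty list would support the observation that the same package is being pulled in under two names.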