Closed brettasmi closed 1 year ago
Thanks for the bug report! As far as I can tell, that's a typo -- line 88 should read "/UNICHEM/UC_XREF.srcfiltered.txt" instead of "DrugBank/UC_XREF.srcfiltered.txt". I'm currently trying to get Babel to run on a new cluster here at RENCI, and have had to make a number of minor changes to the files to get them to run -- you can see my changes in branch add-dockerfile or PR #37. For example, here's the diff of the changes I made to chemical.snakefile. Try making that change on your end and let me know if that fixes the issue!
Note that the changes in branch add-dockerfile have NOT been reviewed by Chris yet and so may be incorrect or introduce new bugs. This branch is also a work in progress, so it might also change in unexpected ways going forward. I think I have every target except for chemical working now, so it's pretty close to being done, but please let me know (or push changes to that branch yourself) if you notice that I've got something wrong. If everything works out, I should have this PR ready for review in a week or so.
I also have the outputs from all the compendia except for chemical. If you'd like me to send you the chemical outputs once I have them, please let me know!
Thanks @gaurav. That fix worked to the extent that the build continued a bit further, but quickly failed thereafter on a 404 when trying to download UNII. I switched over to your add-dockerfile branch and made a bit more progress after grabbing those UNII files manually as specified. Eventually, I failed on the following:
[Wed Mar 30 16:29:57 2022]
rule chemical_mesh_ids:
input: /Users/bsmith/isb/Babel/babel_downloads/MESH/mesh.nt
output: /Users/bsmith/isb/Babel/babel_downloads/chemicals/ids/MESH
jobid: 0
resources: mem_mb=4293, disk_mb=4293, tmpdir=/var/folders/07/pj89k_t935d11c0mbncb959w0000gp/T
loading mesh.nt
[Wed Mar 30 16:29:57 2022]
Error in rule chemical_mesh_ids:
jobid: 0
output: /Users/bsmith/isb/Babel/babel_downloads/chemicals/ids/MESH
RuleException:
AttributeError in line 16 of /Users/bsmith/isb/Babel/src/snakefiles/chemical.snakefile:
module 'pyoxigraph' has no attribute 'MemoryStore'
File "/Users/bsmith/isb/Babel/src/snakefiles/chemical.snakefile", line 16, in __rule_chemical_mesh_ids
File "/Users/bsmith/isb/Babel/src/createcompendia/chemicals.py", line 99, in write_mesh_ids
File "/Users/bsmith/isb/Babel/src/datahandlers/mesh.py", line 130, in write_ids
File "/Users/bsmith/isb/Babel/src/datahandlers/mesh.py", line 16, in __init__
File "/Users/bsmith/.pyenv/versions/3.7.7/lib/python3.7/concurrent/futures/thread.py", line 57, in run
I think that probably has to do with this: https://github.com/oxigraph/oxigraph/issues/57 so pyoxigraph may need to be pinned to a prior version in the requirements.txt here, or the code could be updated, of course. I'd be happy to submit a PR, but I'm not sure of the best way to do that given the quick iteration on your WIP branch.
Generally speaking, I suspect it's probably going to be best to wait til you've completed your work here instead of trying to get this to work as-is. Also, in exploring the chemical snakefile, I spied a comment stating that it requires a machine with >256G of memory, which I could do but certainly wasn't expecting at this point :)
Please let me know if I can help further in any way.
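One way the "code could be updated" option mentioned above might look is a small compatibility shim, sketched here under assumptions. It assumes that older pyoxigraph releases expose MemoryStore and that newer ones expose a Store class that is in-memory when constructed with no path. The helper name make_memory_store is hypothetical and is not part of Babel or pyoxigraph.

```python
def make_memory_store(rdf_module):
    """Return an in-memory RDF store from whichever API the module offers.

    Hypothetical helper: pass in the imported pyoxigraph module.
    Assumption: older releases provide MemoryStore; newer releases
    provide Store, which is in-memory when called without a path.
    """
    legacy_cls = getattr(rdf_module, "MemoryStore", None)
    if legacy_cls is not None:
        # Older API: a dedicated in-memory store class.
        return legacy_cls()
    # Newer API (assumed): a single Store class.
    return rdf_module.Store()


# Usage (requires pyoxigraph to be installed):
#   import pyoxigraph
#   store = make_memory_store(pyoxigraph)
```

The shim only covers constructing the store. Other calls in src/datahandlers/mesh.py may also differ between releases, so pinning the version may still be the simpler fix.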
Interesting! I was just able to finish my first ever chemical build, and I seem to be on pyoxigraph==0.2.5 from July last year, not the 3.0.0 release. Could you please try that and see if it works? Here's all my other dependencies according to pip freeze: https://github.com/TranslatorSRI/Babel/blob/5216fe95c56928e83064b4db688622b47a53bb52/requirements.lock
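To check whether a local environment matches the pyoxigraph==0.2.5 version mentioned above, something like the following sketch could be run before a build. The function names are hypothetical. It uses importlib.metadata, which needs Python 3.8 or later, so it would not run unmodified on the Python 3.7.7 interpreter shown in the traceback.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """True only if the package is installed at exactly the pinned version."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    # 0.2.5 is the version reported as working in this thread.
    print("pyoxigraph installed:", installed_version("pyoxigraph"))
    print("matches 0.2.5 pin:", matches_pin("pyoxigraph", "0.2.5"))
```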
> Generally speaking, I suspect it's probably going to be best to wait til you've completed your work here instead of trying to get this to work as-is.
That makes sense -- now that I have chemical working, I'll be working on polishing up this PR and submitting it to Chris for review sometime next week. So hopefully Babel should be fully functional once he's had a chance to make sure I didn't break anything :)
> Also, in exploring the chemical snakefile, I spied a comment stating that it requires a machine with >256G of memory, which I could do but certainly wasn't expecting at this point :)
I was unable to run chemical on a Kubernetes node with less than 500G, so that seems accurate :). I never saw the memory usage go above 48%, though, so I'm still unclear on exactly how much memory it needs -- I think that'll take a few months more to figure out for sure.
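One rough way to narrow down the real memory requirement is to record the peak resident set size at the end of a run, sketched below. This is not something Babel does today, as far as this thread shows. The helper names are hypothetical. The sketch measures only the current process, so work done in child processes would need resource.RUSAGE_CHILDREN instead. The resource module is Unix-only.

```python
import resource
import sys


def to_gib(raw_maxrss, platform):
    """Convert a ru_maxrss value to GiB.

    ru_maxrss is reported in bytes on macOS ("darwin") and in
    kilobytes on Linux, so the unit depends on the platform.
    """
    n_bytes = raw_maxrss if platform == "darwin" else raw_maxrss * 1024
    return n_bytes / 1024 ** 3


def peak_rss_gib():
    """Peak resident memory of the current process so far, in GiB."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return to_gib(usage.ru_maxrss, sys.platform)


if __name__ == "__main__":
    print(f"peak RSS: {peak_rss_gib():.2f} GiB")
```

Calling peak_rss_gib() at the end of a long-running step gives a lower bound on what that step needed, which is easier to compare across runs than a percentage of node memory.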
If you'd like me to make the chemical outputs available somewhere for you to download, please let me know! I plan to eventually upload them to https://stars.renci.org/var/babel_outputs/, but if you need them in a hurry, I can move that further up my to-do list.
@brettasmi The PR I linked to previously can now fully build the chemical compendium, and so we've merged it into the master branch. We haven't published the results yet, since we're investigating some odd changes between this version of the Babel outputs and the previous version, but I'm happy to send you a copy of the chemical outputs if you'd like!
@brettasmi I wanted to check back with you and ask if you were able to get the Chemical compendium working. We've just regenerated a new version of Babel, so you can also message me on Slack if you'd like me to send you a copy of those files!
@brettasmi I'm going to close this issue, but please do re-open it (or contact me on Slack) if you haven't been able to get the chemical compendium running or if you'd like a copy of our files to use.
I tried to build the chemical compendium using the documentation in the README, but it failed as follows:
I assume that this may simply be a case where the documentation is out of date, as per #32.
If there are updated instructions, even if it's just a few commands specific to the chemicals, that can be shared here ahead of an update to the README, I'd really appreciate seeing them.
Please let me know if there is anything I can clarify or contribute here. Thanks!