Open Mark-A-Taylor opened 2 years ago
Shouldn't be hard to move that tmp folder elsewhere, but I'm away from computer for the next 4 days. Can you just symlink your genome to current working directory and then run it from there?
Clever idea - but - unfortunatley the ICA (Illumina Connected Analytics) platform does not reveal that level of control to users. It is a GUI interface driving CWL configurations on the back end... I'll inquire with our ILMN tech support team to get their thoughts.
FWIW, it looks like the coding change on your end might be fairly isolated if you are open to it. Grepping inside the docker image (/venv/lib/python3.8/site-packages/funannotate/*) seems to show the only client of library.py:checkMasklowMem() is predict.py:main(). predict.py:main() has access to the output folder as 'args.out' which could be added as an input to library.py:checkMasklowMem() to allow it to redirect that tmp folder into a folder where we know the user has write perms. I would love to stay on your latest branch rather than maintain my bootleg hack.
Enjoy your break from the keyboard! Thanks again for sharing this great tool with the world.
JP: Any further thoughts on this topic? I do think your code base benefits from an enhancement to remove the assumption that the user has write permission on the folder containing his/her input genome file. I am new to github but willing to try to help execute the change and do any testing you recc if you are open to it.
Meanwhile, FWIW, I have found a way to use sed to make do the edits in a local Dockerfile that builds around your image (below). This gives me a docker image I can use in ICA without the hassle of 'docker run', file edits inside the container and then a 'docker commit':
FROM nextgenusfs/funannotate
USER root
RUN /bin/sed -i 's/checkMasklowMem(/checkMasklowMem( args.out, /g' /venv/lib/python3.8/site-packages/funannotate/predict.py RUN /bin/sed -i 's/def checkMasklowMem(/def checkMasklowMem( outputFolder, /g' /venv/lib/python3.8/site-packages/funannotate/library.py RUN /bin/sed -i 's/os.path.join(os.path.dirname(genome)/os.path.join(outputFolder/g' /venv/lib/python3.8/site-packages/funannotate/library.py
Okay, see if that works for you. There might be other errors that violate the "all writes must be in the output directory" -- so let me know if you find more.
The code changes look great but the predict command is now failing for me when building off of the new image dockerhub tag 'nextgenusfs/funannotate:latest'. I can illustrate the failure using the 'funannotate test -t all' command inside the container. STDERR barks 'ERROR: augustus --proteinprofile test failed, likely a compilation error...'. I have attached output files for:
Shout if i can do anything else to help. It is not a crisis for me - I am able to revert back to dockerhub tag 'nextgenusfs/funannotate:v1.8.9' and my workaround above. Thanks again for working on it.
I'm actively trying to fix another bug (that augustus compilation issue) on the latest docker image, it won't be stable for a little while, apologies.
JP: Thanks for the heads up. FAVOR to ask if you have time while you are tweaking the dockerhub build image.. could you add perl packages MCE::Mutex and Math::Utils to make it easier for us to layer in Genemark? e.g /venv/bin/cpanm MCE::Mutex and /venv/bin/cpanm Math::Utils Thanks again and shout if I am useful as a guinnea pig for testing.
do those have conda packages? those must be new dependencies I guess as I have others in here for GeneMark. I just haven't used it in a long time so have not kept up to date with the perl dependencies.
Is this MCE::Mutex? https://anaconda.org/bioconda/perl-mce
sorry but I am not sure. I am trying to get my head around genemark for the first time now. Those two packages (MCE::Mutex and Math::Utils) are checked as part of their 'check_install.bash' script. I'll dig into it more asap and pass on any 'insights' (if i have any).
Okay, well I just pushed those two perl modules in last commit, the docker image should have them when it finishes building in ~1 hour or so.
Thanks!! Looks better - dockerhub image 'nextgenusfs/funannotate:latest' is crunching through 'funannotate test -t all' now. I'll let you know if I hit any issues or see anything fishy. Thanks again!
heads up: it looks like the unit testing fails for BOTH the old and new dockerhub images: 'nextgenusfs/funannotate:v1.8.9' & 'nextgenusfs/funannotate:latest', resp, when grinding through 'funannotate test -t all' at the 'compare' test. Cut-n-paste from terminal window output is below.
Running funannotate compare
unit testing
Downloading: https://osf.io/7s9xh/download?version=1 Bytes: 1020999
CMD: funannotate compare -i Genome_one.gbk Genome_two.gbk Genome_three.gbk -o compare --cpus 8 --outgroup botrytis_cinerea.dikarya
[May 26 09:02 AM]: OS: Debian GNU/Linux 10, 16 cores, ~ 33 GB RAM. Python: 3.8.13
[May 26 09:02 AM]: Running 1.8.10
[May 26 09:02 AM]: Now parsing 3 genomes
[May 26 09:02 AM]: working on Genome one
[May 26 09:02 AM]: working on Genome two
[May 26 09:02 AM]: working on Genome three
[May 26 09:02 AM]: No secondary metabolite annotations found
[May 26 09:02 AM]: Summarizing PFAM domain results
[May 26 09:02 AM]: Summarizing InterProScan results
[May 26 09:02 AM]: Loading InterPro descriptions
[May 26 09:02 AM]: Summarizing MEROPS protease results
[May 26 09:02 AM]: found 4 MEROPS familes
Traceback (most recent call last):
File "/venv/bin/funannotate", line 8, in funannotate compare
test failed - check logfiles
Thanks, yes I know about that. That script is very very old and I should probably label it "unsupported", it needs a complete rewrite which I don't have time to do (support is all on my personal time).
more useless information for you.... i mispoke: the compare test still works in the 'nextgenusfs/funannotate:v1.8.9' which uses matplotlib 3.4.2 (asot 3.5.2). matplotlib chirps that deprecation is coming:
"/venv/lib/python3.8/site-packages/funannotate/library.py:7921: MatplotlibDeprecationWarning: Calling add_axes() without argument is deprecated since 3.3 and will be removed two minor releases later. You may want to use add_subplot() instead. cbar_ax = fig.add_axes(shrink=0.4)"
Folks,
Thank you VERY MUCH for this project. It has been incredibly valuable to us. I am using the tool in pipelines on a cloud platform (Illumina Connected Analytics) that is strict on file folder permissions. I only have write permissions on the '-o output' folder for a 'predict' command. However it looks like this tool is trying to create a folder in the directory it scrapes from the filepath of my input genome. This fails in my environment because I only have read permissions on the folder containing my input genome file. Conceptually I think it is 'non-intuitive' for this tool to assume that a user has write permission on the directory in the file path of his/her input genome file. I think it would be more logical to either use the '-o output' folder for workspace or accept another input argument specifying a temp folder.
Gory deets: I see predict.py:main() is calling library.py:checkMasklowMem() which creates a folder:
tmpdir = os.path.join(os.path.dirname(genome),'mask_'+str(uuid.uuid4()) os.makedirs(tmpdir)
The intention appears benign because this folder is later removed in the same function via a call to library.py:SaveRemove().
This is what I see in the dockerhub 'latest' tags as of today for both nextgenusfs/funannotate & nextgenusfs/funannotate-slim. I confirmed this was my issue by hacking the python code in my local instance of the docker container and rerunning the pipeline. Please consider enhancing this tool to work for my use case.
Thank you. Please shout if I can provide any additional details (or if I am misunderstanding something).