Open tuck82er opened 2 years ago
Jumping in on this thread because I am also running into visualization errors! In my case, the tax_vis.err output is as follows:
Traceback (most recent call last):
File "/work/hpc/users/nvp29/miniconda3/envs/eukulele/bin/EUKulele", line 4, in
I am running EUKulele in a conda environment on a HPC cluster using MAG protein-coding genes (.faa extension) with a pre-downloaded MMETSP database with the following command:
EUKulele -m mags -s MAG-SCG-faas --reference_dir /work/hpc/users/nvp29/databases/mmetsp --CPUs 20
The program seems to be looking for count data, which doesn't exist because the input is MAG .faa files.
Hi both, and thanks @nvpatin for bumping this thread, since I missed responding before! @nvpatin, could you give me the output of ‘EUKulele --version’ on your system? I dealt with a similar error recently. @tuck82er, what are the approximate sizes of your files? Thanks again!
Hey, I'm experiencing the same issue as @tuck82er: diamond finishes successfully, but no output is produced, and tax_vis.out contains only one line, reporting that "One of the files, final.contigs.fa, in the sample directory did not complete successfully."
Same outcome for phylodb and eukprot.
The (metatranscriptomic) assembly has 1,985,223 sequences with a total length of 1,208,756,176 nt (1.3 GB of non-interleaved FASTA on disk). FASTA headers are of the form '>k141_668320 flag=1 multi=4.0000 len=304'.
I have a total of 250 GB RAM on the server and run EUKulele with --CPUs 20 or 40.
I do manage to get the results for smaller subsets of the input file.
I use EUKulele v. 2.0.0 from PyPI (installation via conda proved to be difficult, if not impossible, with strict repo priority).
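For anyone who wants to reproduce the "smaller subsets" workaround mentioned above, here is a minimal sketch of how the assembly could be split into chunks of a fixed number of records. It is plain Python and not part of EUKulele; the function name `split_fasta`, the `.partNNNN` naming scheme, and the chunk size are all assumptions made for illustration.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir, records_per_chunk):
    """Split a FASTA file into chunks of at most `records_per_chunk` records.

    Hypothetical helper (not an EUKulele function). Returns the chunk paths
    in the order they were written.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    fasta_path = Path(fasta_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    handle = None
    n_records = 0
    try:
        with fasta_path.open() as src:
            for line in src:
                if line.startswith(">"):
                    # Open a new chunk file every `records_per_chunk` headers.
                    if n_records % records_per_chunk == 0:
                        if handle is not None:
                            handle.close()
                        name = (f"{fasta_path.stem}.part"
                                f"{len(chunk_paths) + 1:04d}{fasta_path.suffix}")
                        chunk_paths.append(out_dir / name)
                        handle = chunk_paths[-1].open("w")
                    n_records += 1
                # Lines before the first header are skipped.
                if handle is not None:
                    handle.write(line)
    finally:
        if handle is not None:
            handle.close()
    return chunk_paths
```

Calling `split_fasta("final.contigs.fa", "chunks", 100000)` would write `chunks/final.contigs.part0001.fa` and so on; each chunk keeps whole records, so the pieces can be placed in a sample directory and run separately to see at what size the failure appears.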
@alephreish I will get back to your concerns on memory usage very soon! But as far as installation: we have found that installation is a lot easier with mamba; have you used mamba before?
@akrinos Yes, I'm using mamba here as well. I noticed inconsistent behavior between different conda installations, so I suspect that it might be a problem on my side - will post an update if I find a solution.
Hi @alephreish, that's too bad, sorry for telling you something you already know! Looking forward to hearing more about what you find and digging more into the other performance issues soon. Thanks for trying the tool!
@akrinos The problem with the input size resolved itself after switching to eukulele v. 2.0.3 and diamond v. 0.9.24, not sure what it was.
EUKulele Fails on "Performing taxonomic visualization steps..."
Running EUKulele on the HPC results in a failed run, likely at the taxonomic estimation step. Here is the tail of the batch run output:
and the output for each sample taxest<sample #>.out gives:
and tax_vis.out gives:
The tax_vis.out output leads me to believe that the problem is in the estimation step: no sequences are annotated, which in turn causes visualization to fail.
Also, all taxest<sample #>.err files are empty.
My input parameters are as follows:
and the tax-cutoffs.yaml as follows:
I'm unsure why this run is failing, and I'm not sure how to diagnose the issue further, as I've run out of logs to search (I think). Any thoughts or suggestions would be welcome!
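In case it helps others double-check that no log has been overlooked, below is a rough sketch that lists every .out and .err file in a directory along with its size and last non-blank line. It is a generic helper, not an EUKulele feature; the function name `summarize_logs` and the directory passed to it are assumptions, so point it at wherever your run actually wrote files such as tax_vis.out and taxest<sample #>.err.

```python
from pathlib import Path


def summarize_logs(log_dir, patterns=("*.out", "*.err")):
    """Return {file name: (size in bytes, last non-blank line or None)}.

    Hypothetical helper: `log_dir` is whichever directory holds the run's
    .out/.err files; its location is an assumption, not taken from the tool.
    """
    summary = {}
    for pattern in patterns:
        for path in sorted(Path(log_dir).glob(pattern)):
            text = path.read_text(errors="replace")
            lines = [ln for ln in text.splitlines() if ln.strip()]
            last_line = lines[-1] if lines else None
            summary[path.name] = (path.stat().st_size, last_line)
    return summary


# Example usage (the path is a placeholder):
# for name, (size, last) in summarize_logs("path/to/logs").items():
#     print(f"{name}\t{size} bytes\t{last or '<empty>'}")
```

Empty .err files show up with a size of 0 and no last line, which makes it quick to confirm which steps wrote nothing and which .out file holds the final message before the failure.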