RasmussenLab / vamb

Variational autoencoder for metagenomic binning
MIT License

ValueError: Length of TNFs and length of RPKM does not match. Verify the inputs #352

Open shaman-narayanasamy opened 1 month ago

shaman-narayanasamy commented 1 month ago

Dear authors/maintainers,

First off, thank you for developing and maintaining this wonderful tool! Your efforts are much appreciated.

I would like to inquire about an issue I am facing. Specifically, the following command fails with the error shown at the bottom:

$ vamb --outdir U1/vamb --fasta /ibex/scratch/projects/c2188/soil_experiment/metagenomics/assembly/U1/megahit_assembly/final.contigs.fa --jgi U1/contig_depth.txt -m 1500

However, simply removing the -m 1500 parameter seems to work (I did not let the run complete):

$ vamb --outdir U1/vamb --fasta /ibex/scratch/projects/c2188/soil_experiment/metagenomics/assembly/U1/megahit_assembly/final.contigs.fa --jgi U1/contig_depth.txt

I looked for solutions in issues #48, #96, and #65, but did not find a clear one (apologies if I missed something). I also checked that the FASTA file and the JGI depth file do not contain any discrepancies:

$ grep -c "^>" /ibex/scratch/projects/c2188/soil_experiment/metagenomics/assembly/U1/megahit_assembly/final.contigs.fa
299429

$ wc -l U1/contig_depth.txt
299430 U1/contig_depth.txt
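
The two counts above differ by one, which would be consistent with a single header line in the depth file, but matching counts alone do not show that the contig names and lengths agree. A minimal sketch of a stricter check follows. It assumes the depth table is tab-separated with `contigName` and `contigLen` columns (as written by `jgi_summarize_bam_contig_depths`) and that the depth table uses the FASTA header up to the first whitespace as the contig name. The function names and the `>=` length cutoff are illustrative and are not taken from Vamb.

```python
import csv


def fasta_lengths(path):
    """Return {contig_name: length}; the name is the header up to the first whitespace."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def depth_lengths(path):
    """Return {contig_name: length} from a depth table (assumed column names)."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        # int(float(...)) tolerates lengths written as "1500" or "1500.0".
        return {row["contigName"]: int(float(row["contigLen"])) for row in reader}


def compare(fasta_path, depth_path, minlength=1500):
    """Summarise disagreements between the FASTA file and the depth table."""
    fasta = fasta_lengths(fasta_path)
    depth = depth_lengths(depth_path)
    shared = fasta.keys() & depth.keys()
    return {
        "only_in_fasta": sorted(set(fasta) - set(depth)),
        "only_in_depth": sorted(set(depth) - set(fasta)),
        "length_mismatch": sorted(n for n in shared if fasta[n] != depth[n]),
        # Assumed cutoff: contigs of at least `minlength` bases are kept.
        "fasta_kept": sum(1 for v in fasta.values() if v >= minlength),
        "depth_kept": sum(1 for v in depth.values() if v >= minlength),
    }
```

Calling `compare("final.contigs.fa", "U1/contig_depth.txt")` returns a dictionary; any non-empty list, or a difference between `fasta_kept` and `depth_kept`, would point at where the two inputs diverge once the length filter is applied.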

Could you please advise on how I could debug this issue? I would like to filter out the contigs shorter than 1500 bp, instead of running without any filtering.

Looking forward to hearing from you.

Best regards, Shaman

Loading TNF
    Minimum sequence length: 1500
    Loading data from FASTA file /ibex/scratch/projects/c2188/soil_experiment/metagenomics/assembly/U1/megahit_assembly/final.contigs.fa

    Kept 34948164 bases in 10135 sequences
    Processed TNF in 1.6 seconds

Loading RPKM
    Loading RPKM from JGI file U1/contig_depth.txt


- The full error message produced by Vamb, if any

$ vamb --outdir U1/vamb --fasta /ibex/scratch/projects/c2188/soil_experiment/metagenomics/assembly/U1/megahit_assembly/final.contigs.fa --jgi U1/contig_depth.txt -m 1500
Traceback (most recent call last):
  File "/ibex/user/naras0c/conda-environments/vamb_env/bin/vamb", line 11, in <module>
    sys.exit(main())
  File "/ibex/user/naras0c/conda-environments/vamb_env/lib/python3.7/site-packages/vamb/main.py", line 528, in main
    logfile=logfile)
  File "/ibex/user/naras0c/conda-environments/vamb_env/lib/python3.7/site-packages/vamb/main.py", line 247, in run
    len(tnfs), minalignscore, minid, subprocesses, logfile)
  File "/ibex/user/naras0c/conda-environments/vamb_env/lib/python3.7/site-packages/vamb/main.py", line 121, in calc_rpkm
    raise ValueError("Length of TNFs and length of RPKM does not match. Verify the inputs")
ValueError: Length of TNFs and length of RPKM does not match. Verify the inputs
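
The traceback shows that the error is raised when the number of TNF rows and the number of RPKM rows differ, and the log reports that 10135 sequences were kept after the length filter. One way to investigate is to write a copy of the depth table that contains only contigs passing the same cutoff and compare its row count with that number. The sketch below is hypothetical and untested against Vamb v3.0.2: `filter_depth_table` is an illustrative helper, not part of Vamb, and it assumes a tab-separated table with a `contigLen` column and a `>=` cutoff.

```python
def filter_depth_table(depth_in, depth_out, minlength=1500):
    """Copy the header plus rows with contigLen >= minlength; return rows kept."""
    kept = 0
    with open(depth_in) as src, open(depth_out, "w") as dst:
        header = src.readline()
        dst.write(header)
        # Locate the length column by name instead of assuming its position.
        len_col = header.rstrip("\n").split("\t").index("contigLen")
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            fields = line.rstrip("\n").split("\t")
            if int(float(fields[len_col])) >= minlength:
                dst.write(line)
                kept += 1
    return kept
```

If the returned count differs from the "Kept ... sequences" figure in the log, the FASTA file and the depth table disagree about which contigs pass the cutoff. If it matches, the filtered table could be tried as the `--jgi` input, although whether that avoids the error is not confirmed.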

jakobnissen commented 3 weeks ago

Dear @shaman-narayanasamy

Apologies for the slow reply - you caught us on holiday. The version of Vamb you're running is quite old. Would it be possible to run the latest version (v 4.1.3) instead? It's quite likely that the bug has been fixed in the latest version.

shaman-narayanasamy commented 1 week ago

Hi @jakobnissen ,

Thanks for the response. No worries!

I installed it using conda/mamba without specifying the version: mamba install -c bioconda vamb. My bad for assuming that the versions were up to date on conda/mamba:

$ mamba search -c bioconda vamb
Loading channels: done
# Name                       Version           Build  Channel             
<old versions removed for brevity>
vamb                           3.0.2  py37h8902056_2  bioconda            
vamb                           3.0.2  py37hf01694f_0  bioconda   

I then tried installing with pip inside a conda environment, but that did not work.

I presume this is because of this note in the README:

Note: An active Conda environment can hijack your system's linker, causing an error during installation. Either deactivate conda, or delete the ~/miniconda/compiler_compats directory before installing with pip.

I don't really want to install it in my base environment, nor do I want to delete the compiler_compats directory, since I fear that doing so may affect other environments that I have already built. Is there an alternative installation method?

Your support is highly appreciated.