linxingchen / cobra

A tool to raise the quality of viral genomes assembled from short-read metagenomes via resolving and joining of contigs fragmented during de novo assembly.
MIT License

KeyError #42

Open Syutenjyo opened 2 months ago

Syutenjyo commented 2 months ago

Hello, when I try to run cobra, I get an error message like this:

(cobra) wk@t:~/res/20240905_cobra$ cobra-meta -f /home/wk/res/test/SPAdes/contigs.fasta -q /home/wk/res/20240905_cobra/metaSpades_contigs_top5.tsv -a metaspades -mink 21 -maxk 121 -m /home/wk/res/test/mapping/test_sorted.bam -c /home/wk/res/20240905_cobra/metaSPAdes_CoverM.txt -t 12
Traceback (most recent call last):
  File "/home/wk/miniconda3/envs/cobra/bin/cobra-meta", line 10, in <module> 
    sys.exit(main())
  File "/home/wk/miniconda3/envs/cobra/lib/python3.8/site-packages/cobra.py", line 877, in main
    if header2len[line.reference_name] > 1000:
KeyError: 'NC_073204.1'

"NC_073204.1" is in my reference FASTA file and is also recorded in the contig coverage file.

I'm not sure what's wrong with it.

linxingchen commented 2 months ago

Hi,

Thank you for your interest in COBRA.

I am not sure how you prepared your input files, but please ensure that you provide:

(1) all the contigs or scaffolds (do not apply a length filter) for -f
(2) a subset of your contigs or scaffolds for -q
(3) a mapping file of all contigs or scaffolds in (1)
(4) depth/coverage info for all contigs or scaffolds in (1)

Based on the reported error, it is likely that the sequence of NC_073204.1 was not included in (1).
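For anyone hitting the same KeyError, a quick consistency check between the -f FASTA and the mapping file can confirm this before running COBRA. The sketch below is not part of COBRA; the function names are hypothetical, the toy data (including the contig names and lengths) is made up for illustration, and it assumes sequence names are the first whitespace-delimited token of each FASTA header.

```python
def fasta_names(fasta_text):
    """Return the set of sequence names (first token after '>') in FASTA text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def sam_reference_names(header_text):
    """Return reference names found in the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def missing_references(fasta_text, header_text):
    """List references present in the mapping header but absent from the FASTA."""
    return sorted(sam_reference_names(header_text) - fasta_names(fasta_text))


# Toy inputs (made up): the header contains a reference that is not in the FASTA.
fasta = (
    ">NODE_1_length_5000_cov_12.3\nACGT\n"
    ">NODE_2_length_3000_cov_8.1\nACGT\n"
)
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:NODE_1_length_5000_cov_12.3\tLN:5000\n"
    "@SQ\tSN:NC_073204.1\tLN:40000\n"
)
print(missing_references(fasta, header))  # ['NC_073204.1']
```

With real data, the header text can be obtained with `samtools view -H` on the BAM file and the FASTA text read from the file passed to -f. Any name reported as missing would trigger the same KeyError, which suggests the reads were mapped against a different set of sequences than the contigs given to COBRA.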

Also, I am not sure why you are using sequences from NCBI for COBRA analyses.

I hope this helps.

Best, LINXING

Syutenjyo commented 2 months ago

Thank you very much! This helps me a lot. I used the wrong reference seq for alignment, resulting in this error.