ksahlin / NGSpeciesID

Reference-free clustering and consensus forming of long-read amplicon sequencing
GNU General Public License v3.0

samtool error [E::sam_parse1] query name too long #6

Closed lixiaopi1985 closed 3 years ago

lixiaopi1985 commented 3 years ago

Hi,

Thank you for this pipeline. However, I ran into an issue during the medaka step. I don't know whether you can fix it on your side; I have also submitted a report to the medaka repo.

Installed NGSpeciesID with:
conda create -n NGSpeciesID python=3.6 pip
conda activate NGSpeciesID
pip install NGSpeciesID
conda install --yes -c conda-forge -c bioconda "parasail-python>=1.1.10" "edlib>=1.1.2" python-edlib "medaka>=1.0.2" spoa racon minimap2 mmseqs2

Command I ran: NGSpeciesID --ont --consensus --medaka --fastq NanoFilt_concat_barcode05.fastq --outfolder barcode5

Constructing minimap index.
[M::mm_idx_gen::0.002*1.90] collected minimizers
[M::mm_idx_gen::0.005*2.51] sorted minimizers
[M::main::0.007*2.03] loaded/built the index for 1 target sequence(s)
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.007*2.02] distinct minimizers: 326 (90.49% are singletons); average occurrences: 1.120; average spacing: 5.405
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -I 16G -x map-ont --MD -d /media/swaggyp1985/HDD6T/VT_Projects_2020/Mock_community/mock_community_rerun_3302021/Analysis/preprocess_with_consensus/test/barcode5/consensus_reference_0.fasta.mmi /media/swaggyp1985/HDD6T/VT_Projects_2020/Mock_community/mock_community_rerun_3302021/Analysis/preprocess_with_consensus/test/barcode5/consensus_reference_0.fasta
[M::main] Real time: 0.008 sec; CPU: 0.015 sec; Peak RSS: 0.003 GB
[M::main::0.004*1.42] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.004*1.40] mid_occ = 4
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.004*1.39] distinct minimizers: 326 (90.49% are singletons); average occurrences: 1.120; average spacing: 5.405
[E::sam_parse1] query name too long
[W::sam_read1] Parse error at line 3
samtools view: error reading file "-"

Thank you

ksahlin commented 3 years ago

Hi @lixiaopi1985 ,

The answer is on the third-to-last line of your output: [E::sam_parse1] query name too long. That is, you need to shorten your read names (accessions) so that samtools can parse the alignments. I'm not sure what the maximum allowed query name length is, though.

Best, Kristoffer

lixiaopi1985 commented 3 years ago

@ksahlin Indeed, shortening the name resolved the issue. Thank you.
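
For anyone hitting the same error, here is a minimal sketch of one way to shorten read names before running NGSpeciesID. It is not part of NGSpeciesID or medaka. It assumes plain four-line FASTQ records, takes the read name to be the first whitespace-delimited token of the header line, and uses the 254-character QNAME limit from the SAM specification as the default threshold. The function name `shorten_fastq_names` and the `read_<n>` naming scheme are hypothetical choices made for illustration.

```python
from typing import Iterable, Iterator

# Maximum QNAME length allowed by the SAM specification.
SAM_QNAME_MAX = 254


def shorten_fastq_names(
    lines: Iterable[str],
    max_len: int = SAM_QNAME_MAX,
    prefix: str = "read_",  # hypothetical naming scheme, not an NGSpeciesID convention
) -> Iterator[str]:
    """Yield FASTQ lines (without trailing newlines), renaming over-long reads.

    Assumes every record spans exactly four lines. A header whose name
    (first whitespace-delimited token after '@') is empty or longer than
    max_len is replaced by '@<prefix><record index>'. All other lines are
    passed through unchanged.
    """
    for i, raw in enumerate(lines):
        line = raw.rstrip("\n")
        if i % 4 == 0:  # header line of a record
            if not line.startswith("@"):
                raise ValueError(f"line {i + 1}: expected a FASTQ header starting with '@'")
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            if not name or len(name) > max_len:
                line = f"@{prefix}{i // 4}"
        yield line
```

To use it, open the input FASTQ for reading and an output file for writing, then write each yielded line followed by a newline. Renaming discards the original header, so keep the original file if the names carry information you need later.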