torognes / vsearch

Versatile open-source tool for microbiome analysis

maxseqlength is not supported by uchime_denovo command #544

Closed petaripenev closed 7 months ago

petaripenev commented 7 months ago

My version is v2.13.3_linux_x86_64. Sorry if this has already been patched with a newer version.

Minimal code to reproduce error:

printf '>seq\n%*s' 11 | tr ' ' 'A' | vsearch --uchime_denovo - --uchimeout test.txt --maxseqlength 300000

Results in the following error:

Fatal error: Invalid options to command uchime_denovo
Invalid option(s): --maxseqlength
The valid options for the uchime_denovo command are: --abskew --alignwidth --borderline --chimeras --dn --fasta_score --fasta_width --gapext --gapopen --hardmask --log --match --mindiffs --mindiv --minh --mismatch --no_progress --nonchimeras --notrunclabels --qmask --quiet --relabel --relabel_keep --relabel_md5 --relabel_sha1 --sizein --sizeout --threads --uchimealns --uchimeout --uchimeout5 --xee --xn --xsize

I am not sure whether there is a reason to prevent uchime_denovo from being run on longer sequences, but chimeras could result in large contigs, so it makes sense to allow that.

frederic-mahe commented 7 months ago

I can confirm that --uchime_denovo does not accept option --maxseqlength in vsearch's current version.

torognes commented 7 months ago

You're right. The maxseqlength option is missing for uchime_denovo and the other chimera detection commands.

I'll add it soon.

torognes commented 7 months ago

The current limit is 50 000 nucleotides.

I am not sure how well the uchime_denovo algorithm works on very long sequences. We are working on new algorithms for improved chimera detection in long (and accurate) sequences.
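To see how many sequences in an input file exceed that 50 000-nucleotide limit, something like the following sketch may help. It is not part of vsearch: `count_longer_than` is a hypothetical helper name, and it assumes a plain FASTA file in which a record's sequence may span several lines.

```shell
# Hypothetical helper (not part of vsearch): count FASTA records whose
# sequence is longer than a given threshold.
# $1 = FASTA file, $2 = length threshold in nucleotides
count_longer_than() {
    awk -v max="$2" '
        # Header line: close the previous record, then start a new one.
        /^>/ { if (seen && len > max) n++; len = 0; seen = 1; next }
        # Sequence line: add its length to the current record.
             { len += length($0) }
        # End of file: close the last record and print the count.
        END  { if (seen && len > max) n++; print n + 0 }
    ' "$1"
}
```

For example, `count_longer_than input.fasta 50000` (with `input.fasta` as a placeholder file name) prints the number of records longer than 50 000 nucleotides.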

torognes commented 7 months ago

Fixed in commit 46c87a2.

The maxseqlength and the minseqlength options are now enabled for all chimera detection commands.

torognes commented 7 months ago

Available in version 2.26.0, just released.
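For scripts that must also run on older installations, a rough guard like the one below could decide whether to pass --maxseqlength to the chimera detection commands. This is only a sketch: `supports_chimera_maxseqlength` is a hypothetical helper name, and it assumes a version string in the form quoted in the report above (for example `v2.13.3_linux_x86_64`). How that string is obtained from the installed binary is left to the caller.

```shell
# Hypothetical helper (not part of vsearch): succeed if the given version
# string is 2.26.0 or later, the release in which --maxseqlength and
# --minseqlength were enabled for the chimera detection commands.
# Assumes a string such as "v2.13.3_linux_x86_64".
# Note: plain POSIX sh, so the variables below are not function-local.
supports_chimera_maxseqlength() {
    ver=${1#v}         # drop the leading "v"
    ver=${ver%%_*}     # drop the platform suffix, e.g. "_linux_x86_64"
    major=${ver%%.*}   # text before the first dot
    rest=${ver#*.}     # text after the first dot
    minor=${rest%%.*}  # text between the first and second dots
    # The patch level is irrelevant because the threshold is x.26.0.
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 26 ]; }
}
```

With the version from the original report, `supports_chimera_maxseqlength v2.13.3_linux_x86_64` returns a non-zero status, so a calling script would omit the option.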

frederic-mahe commented 7 months ago

Added regression tests to our test suite (see https://github.com/frederic-mahe/vsearch-tests/commit/c86aedb6e1698c95d53fdb3031d25ec51228a5a6).