torognes / vsearch

Versatile open-source tool for microbiome analysis

vsearch tool detailed option in command line ? #516

Closed hafizmtalha closed 10 months ago

hafizmtalha commented 1 year ago

Hi, I am using vsearch commands such as 'vsearch --allpairs_global'. How can I list all the parameters of a command, with a short description of each parameter, in the terminal? It becomes tedious to go back to the manual every time.

frederic-mahe commented 1 year ago

related to issue #340

frederic-mahe commented 1 year ago

One possibility is to run:

vsearch -h

It returns a list of commands, and for each command (or family of commands) a list of available options with a brief description:

...

Shuffling and sorting
  --shuffle FILENAME          shuffle order of sequences in FASTA file randomly
  --sortbylength FILENAME     sort sequences by length in given FASTA file
  --sortbysize FILENAME       abundance sort sequences in given FASTA file
 Parameters
  --maxsize INT               maximum abundance for sortbysize
  --minsize INT               minimum abundance for sortbysize
  --randseed INT              seed for PRNG, zero to use random data source (0)
  --sizein                    propagate abundance annotation from input
 Output
  --output FILENAME           output to specified FASTA file
  --relabel STRING            relabel sequences with this prefix string
  --relabel_keep              keep the old label after the new when relabelling
  --relabel_md5               relabel with md5 digest of normalized sequence
  --relabel_self              relabel with the sequence itself as label
  --relabel_sha1              relabel with sha1 digest of normalized sequence
  --sizeout                   include abundance information when relabelling
  --topn INT                  output just first n sequences
  --xsize                     strip abundance information in output

...
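If only one option is of interest, the help text can be filtered instead of read in full. The following is a minimal sketch, assuming a POSIX shell with grep available; the function name `vsearch_option_help` is hypothetical and not part of vsearch. It reads help text on standard input and keeps only the lines that mention the requested option.

```shell
# Hypothetical helper (not part of vsearch): read help text on stdin and
# print only the lines that mention the given option name.
vsearch_option_help() {
    # -F: match the option name as a fixed string, not a regular expression
    # --: stop grep from parsing the leading dashes of "$1" as its own options
    grep -F -- "$1"
}
```

It could be used as `vsearch -h 2>&1 | vsearch_option_help '--minsize'`, where `2>&1` is only a precaution in case the installed version writes part of its help to standard error. Because the match is a plain substring, a name such as `--relabel` also returns `--relabel_keep`, `--relabel_md5`, and so on.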

hafizmtalha commented 1 year ago

This help doesn't give the options for individual commands. For example, if I am using "--makeudb_usearch", it won't show the options that are specific to --makeudb_usearch only.

frederic-mahe commented 1 year ago

vsearch -h

produces a long text, with the options specific to the --makeudb_usearch command (and sibling commands) at the end:

...
UDB files
  --makeudb_usearch FILENAME  make UDB file from given FASTA file
  --udb2fasta FILENAME        output FASTA file from given UDB file
  --udbinfo FILENAME          show information about UDB file
  --udbstats FILENAME         report statistics about indexed words in UDB file
 Parameters
  --dbmask none|dust|soft     mask db with dust, soft or no method (dust)
  --hardmask                  mask by replacing with N instead of lower case
  --wordlength INT            length of words for database index 3-15 (8)
 Output
  --output FILENAME           UDB or FASTA output file
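To avoid scrolling through the whole help text, a single section such as the one above can be cut out of it. This is a minimal sketch, assuming the layout shown in the excerpts here (a section heading at the start of a line, and a blank line after the section); the function name `vsearch_help_section` is hypothetical and not part of vsearch.

```shell
# Hypothetical helper (not part of vsearch): read help text on stdin and
# print the section whose heading line equals "$1", stopping at the next
# blank line.
vsearch_help_section() {
    awk -v heading="$1" '
        $0 == heading    { printing = 1 }  # exact heading line: start printing
        printing && /^$/ { exit }          # first blank line ends the section
        printing         { print }
    '
}
```

For example, `vsearch -h 2>&1 | vsearch_help_section "UDB files"` should print the block above, provided the heading text matches the installed version exactly. The heading must be given in full, since the comparison is an exact line match rather than a pattern.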

In addition to these options, some general options are also applicable:

frederic-mahe commented 10 months ago

also related to issue https://github.com/torognes/vsearch/issues/417