Closed Leonievb closed 2 years ago
I have restructured the --help output a bit. The --highlight input format now has a description. As far as I can tell, the only other description that is missing is the one for --filter-cellids. I’ve opened #16 for that.
We can surely expand the help texts a little bit further, but we should not turn --help into full-blown documentation. IMO, the actual explanation (if necessary) should be in the documentation. I would consider the --help text more as a reminder of which options exist.
Here is how it looks at the moment (the "Run on 10X data" help string still needs to be improved):
usage: trex run10x [-h] [--version] [--debug] [--genome-name NAME]
                   [--chromosome CHROMOSOME] [--start INT] [--end INT]
                   [--amplicon DIRECTORY [DIRECTORY ...]] [--samples SAMPLES] [--prefix]
                   [--min-length INT] [--max-hamming INT] [--jaccard-threshold VALUE]
                   [--filter-cellids CSV] [--keep-single-reads] [--visium]
                   [--output DIRECTORY] [--delete] [-l] [--umi-matrix] [--plot]
                   [--highlight FILE]
                   DIRECTORY [DIRECTORY ...]

Run on 10X data

positional arguments:
  DIRECTORY             Path to the input Cell Ranger directory. There must be an 'outs'
                        subdirectory in that directory.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --debug               Print some extra debugging messages

Input:
  --genome-name NAME    Name of the genome as indicated in 'cellranger count' run with
                        the flag --genome. Default: Auto-detected
  --chromosome CHROMOSOME, --chr CHROMOSOME
                        Name of chromosome on which clone ID is located. Default: Last
                        chromosome in BAM file
  --start INT, -s INT   Position of first clone ID nucleotide (1-based). Default: Auto-
                        detected
  --end INT, -e INT     Position of last clone ID nucleotide (1-based). Default: Auto-
                        detected
  --amplicon DIRECTORY [DIRECTORY ...], -a DIRECTORY [DIRECTORY ...]
                        Path to Cell Ranger result directory (a subdirectory 'outs' must
                        exist) containing sequencing of the clone ID amplicon library.
                        Provide these in the same order as transcriptome datasets
  --samples SAMPLES     Sample names separated by comma, in the same order as Cell
                        Ranger directories
  --prefix              Add sample name as prefix to cell IDs. Default: Add as suffix

Filter settings:
  --min-length INT, -m INT
                        Minimum number of nucleotides a clone ID must have. Default: 20
  --max-hamming INT     Maximum hamming distance allowed for two clone IDs to be called
                        similar. Default: 5
  --jaccard-threshold VALUE
                        If the Jaccard index between clone IDs of two cells is higher
                        than VALUE, they are considered similar. Default: 0
  --filter-cellids CSV, -f CSV
                        CSV file containing cell IDs to keep in the analysis. This flag
                        enables to remove cells e.g. doublets
  --keep-single-reads   Keep clone IDs supported by only a single read. Default: Discard
                        them
  --visium              Adjust filter settings for 10x Visium data: Filter out clone IDs
                        only based on one read, but keep those with only one UMI
  --output DIRECTORY, -o DIRECTORY, --name DIRECTORY, -n DIRECTORY
                        Name of the run directory to be created by the program. Default:
                        trex_run
  --delete              Delete the run directory if it already exists

Optional output files:
  Use these options to enable creation of additional files in the output directory

  -l, --loom            Create also a loom-file from Cell Ranger and clone data. File
                        will have the same name as the run. Default: do not create a
                        loom file
  --umi-matrix          Create a UMI count matrix 'umi_count_matrix.csv' with cells as
                        columns and clone IDs as rows
  --plot                Plot the clone graph
  --highlight FILE      Highlight cell IDs listed in FILE (text file with one cell ID
                        per line) in the clone graph
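
For readers wondering how a grouped help layout like the one above can be produced: the following is only a minimal sketch using Python's standard `argparse` module, not the actual trex source. The function name `build_parser` and the selection of options are assumptions made for illustration; the option names, metavars, defaults and help strings are copied from the output above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small parser whose --help output is split into titled groups.

    Hypothetical helper for illustration; it covers only a few of the
    options shown in the help output.
    """
    parser = argparse.ArgumentParser(
        prog="trex run10x",
        description="Run on 10X data",
    )
    parser.add_argument(
        "path",
        metavar="DIRECTORY",
        nargs="+",
        help="Path to the input Cell Ranger directory",
    )

    # Each argument group gets its own heading in the --help output.
    filter_group = parser.add_argument_group("Filter settings")
    filter_group.add_argument(
        "--min-length", "-m",
        metavar="INT",
        type=int,
        default=20,
        help="Minimum number of nucleotides a clone ID must have. "
             "Default: %(default)s",
    )
    filter_group.add_argument(
        "--max-hamming",
        metavar="INT",
        type=int,
        default=5,
        help="Maximum hamming distance allowed for two clone IDs to be "
             "called similar. Default: %(default)s",
    )

    # A group description is printed as one short line below the heading.
    output_group = parser.add_argument_group(
        "Optional output files",
        description="Use these options to enable creation of additional "
                    "files in the output directory",
    )
    output_group.add_argument(
        "--plot", action="store_true", help="Plot the clone graph"
    )
    output_group.add_argument(
        "--highlight",
        metavar="FILE",
        help="Highlight cell IDs listed in FILE (text file with one cell "
             "ID per line) in the clone graph",
    )
    return parser


if __name__ == "__main__":
    build_parser().print_help()
```

Keeping each help string to one or two sentences and stating the expected input format in it (as done for `--highlight`) fits the idea of `--help` as a reminder, with longer explanations left to the documentation.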
I agree with you. Except for the -f flag, everything is sufficiently explained for the --help section.
Great, I’ll close this issue then.
Many of the argument descriptions are not informative enough when calling the --help flag. Often an input format is missing.