pblischak / HyDe

Hybridization detection using phylogenetic invariants
http://hybridization-detection.readthedocs.io
MIT License

docstring of run_hyde.py #10

Closed. hannesbecher closed this issue 5 years ago

hannesbecher commented 5 years ago

Hi Paul, for me, the docstring of run_hyde.py looks like this:

Run a full hybridization detection analysis or test for hybridization in a specified set of triples.

Arguments

- infile         <string> : name of the DNA sequence data file.
- mapfile        <string> : name of the taxon map file.
- outgroup       <string> : name of the outgroup.
- nind              <int> : number of sampled individuals.
- nsites            <int> : number of sampled sites.
- ntaxa             <int> : number of sampled taxa/populations.
- triples        <string> : name of the file containing triples for testing [optional].
- prefix         <string> : name added to the beginning of output file.
- quiet            <flag> : suppress printing to stdout.
- ignore_amb_sites <flag> : ignore missing/ambiguous sites.

Output

Writes a file ('hyde-out.txt') listing each triple that was tested (P1, Hybrid, P2),
along with the corresponding Z-score, p-value, gamma estimate, and
site pattern counts.

This is not very helpful if one wants to figure out the one-letter names of the arguments. I think they should be stated too, particularly since they are used in the examples. It would also be helpful to remind the user that the full-length argument names need two hyphens. Thanks, Hannes

pblischak commented 5 years ago

I added a small note to the docstrings for each script stating that more details for the arguments, including short vs. long forms and hyphenation, can be found by reading the help message:

For more details on script arguments, type: run_hyde.py -h

This is the help message I get for the run_hyde.py script:

usage: run_hyde.py [-h] -i -m -o -n -t -s [-tr] [-p]
                   [--prefix] [-q] [--ignore_amb_sites]

Options for run_hyde.py

optional arguments:
  -h, --help           show this help message and exit

required arguments:
  -i, --infile         name of the data input file
  -m, --map            map of individuals to taxa
  -o, --outgroup       name of the outgroup (only one accepted)
  -n, --num_ind        number of individuals in data matrix
  -t, --num_taxa       number of taxa (species, OTUs)
  -s, --num_sites      number of sites in the data matrix

additional arguments:
  -tr, --triples       table of triples to be analyzed
  -p, --pvalue         p-value cutoff for test of significance [default=0.05]
  --prefix             prefix appended to output files [default=hyde]
  -q, --quiet          suppress printing to stdout
  --ignore_amb_sites   ignore missing/ambiguous sites
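
To illustrate the short vs. long forms, here is a minimal argparse sketch that declares the same option names as the help message above and checks that both spellings parse to the same values. It is not the actual source of run_hyde.py: the function name build_parser, the types, and the example values (data.txt, map.txt, out, 16, 4, 50000) are assumptions made for illustration only.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the option names in the help message.

    This is a sketch, not HyDe's own code; types and defaults are assumed
    from the help text.
    """
    parser = argparse.ArgumentParser(
        prog="run_hyde.py", description="Options for run_hyde.py"
    )

    # Options the help message lists under "required arguments".
    required = parser.add_argument_group("required arguments")
    required.add_argument("-i", "--infile", required=True,
                          help="name of the data input file")
    required.add_argument("-m", "--map", required=True,
                          help="map of individuals to taxa")
    required.add_argument("-o", "--outgroup", required=True,
                          help="name of the outgroup (only one accepted)")
    required.add_argument("-n", "--num_ind", type=int, required=True,
                          help="number of individuals in data matrix")
    required.add_argument("-t", "--num_taxa", type=int, required=True,
                          help="number of taxa (species, OTUs)")
    required.add_argument("-s", "--num_sites", type=int, required=True,
                          help="number of sites in the data matrix")

    # Options the help message lists under "additional arguments".
    additional = parser.add_argument_group("additional arguments")
    additional.add_argument("-tr", "--triples", default=None,
                            help="table of triples to be analyzed")
    additional.add_argument("-p", "--pvalue", type=float, default=0.05,
                            help="p-value cutoff for test of significance")
    additional.add_argument("--prefix", default="hyde",
                            help="prefix appended to output files")
    additional.add_argument("-q", "--quiet", action="store_true",
                            help="suppress printing to stdout")
    additional.add_argument("--ignore_amb_sites", action="store_true",
                            help="ignore missing/ambiguous sites")
    return parser


# Hypothetical example values; short forms take one hyphen.
SHORT_FORM = ["-i", "data.txt", "-m", "map.txt", "-o", "out",
              "-n", "16", "-t", "4", "-s", "50000"]

# The same call with full-length names, which take two hyphens.
LONG_FORM = ["--infile", "data.txt", "--map", "map.txt", "--outgroup", "out",
             "--num_ind", "16", "--num_taxa", "4", "--num_sites", "50000"]


if __name__ == "__main__":
    parser = build_parser()
    # Both spellings should produce identical parsed values.
    print(parser.parse_args(SHORT_FORM) == parser.parse_args(LONG_FORM))
```

In this sketch each option is registered once with both spellings, so the one-letter and full-length names always stay in step with what -h prints.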

hannesbecher commented 5 years ago

Looks great, thank you!