ryanmelnyk / PyParanoid

Rapid and scalable homolog identification for bacterial genomes
MIT License

Ambiguous notation of "threshold" #11

Open pvstodghill opened 1 year ago

pvstodghill commented 1 year ago

Compare line 59 of IdentifyOrthologs.py:

if float(strain_vals.count("1"))/float(len(strain_vals)) > t:

with the documentation for --threshold: "proportion of strains to be considered an ortholog".

Should that be ">" or ">="?

I would prefer ">=" so that "--threshold 1.0" is equivalent to not specifying "--threshold" at all, but I realize that changing the operator would introduce an incompatibility. Alternatively, the documentation could state that the proportion of strains must be strictly greater than the threshold.
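To make the difference concrete, here is a minimal sketch that isolates the quoted comparison. The function names `passes_strict` and `passes_inclusive` are hypothetical, and I am assuming that `strain_vals` is a list of strings in which "1" marks a strain where the group is present and "0" marks one where it is absent; only the comparison expression itself is taken from IdentifyOrthologs.py.

```python
def passes_strict(strain_vals, t):
    """Current behaviour as quoted: the proportion must exceed t."""
    return float(strain_vals.count("1")) / float(len(strain_vals)) > t


def passes_inclusive(strain_vals, t):
    """Proposed alternative: the proportion must reach or exceed t."""
    return float(strain_vals.count("1")) / float(len(strain_vals)) >= t


# Hypothetical presence/absence vectors (the "0" encoding is an assumption).
all_present = ["1", "1", "1", "1"]    # proportion is 1.0
three_of_four = ["1", "1", "1", "0"]  # proportion is 0.75

print(passes_strict(all_present, 1.0))        # False
print(passes_inclusive(all_present, 1.0))     # True
print(passes_strict(three_of_four, 0.75))     # False
print(passes_inclusive(three_of_four, 0.75))  # True
```

With ">", a group present in every strain is still rejected at "--threshold 1.0", because a proportion can never exceed 1.0. With ">=", the same group passes.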

Thank you for your consideration.