ComparativeGenomicsToolkit / Comparative-Annotation-Toolkit

Apache License 2.0

170 stars, 48 forks

Enhancements #261

Closed: mhaukness-ucsc closed this 1 year ago

mhaukness-ucsc commented 3 years ago

A bunch of minor changes (see commit messages):

improving transMap filtering for paralogs
fixing GFF3s so they can be used as input to CAT
replacing frameshifts=NaN with False
making the code compatible with Python 3.9
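
One possible reason the NaN replacement matters, sketched with the standard library only: NaN is truthy and never compares equal to False, so a NaN flag behaves like neither boolean. The helper name frameshift_flag is hypothetical and is not taken from the CAT code.

```python
import math

nan = float("nan")

nan_is_truthy = bool(nan)        # NaN is a non-zero float, so it counts as truthy
nan_equals_false = nan == False  # NaN compares unequal to everything
nan_is_false = nan is False      # and it is certainly not the False singleton


def frameshift_flag(value):
    """Hypothetical normalizer: map a missing (NaN) value to False."""
    if isinstance(value, float) and math.isnan(value):
        return False
    return bool(value)


print(nan_is_truthy, nan_equals_false, nan_is_false, frameshift_flag(nan))
```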

diekhans commented 3 years ago

if all(g in args.hal_genomes for g in args.target_genomes) is False: vs
if all(g in args.hal_genomes for g in args.target_genomes) == False:

While 'is False' is more Pythonic than '== False', doing:

if not all(g in args.hal_genomes for g in args.target_genomes):

is mathematically clearer and, to me as a Luddite, much easier to read.
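
A minimal sketch of why the three spellings agree for this particular check: all() always returns a real bool, so identity, equality, and negation cannot diverge here. The genome names are made up for illustration and stand in for args.hal_genomes and args.target_genomes.

```python
# Hypothetical stand-ins for args.hal_genomes and args.target_genomes.
hal_genomes = ["human", "chimp", "gorilla"]
target_genomes = ["chimp", "orangutan"]  # "orangutan" is missing from the HAL list

result = all(g in hal_genomes for g in target_genomes)

# all() returns the bool singletons True or False, so for this expression
# the three tests below are interchangeable.
missing_by_is = result is False
missing_by_eq = result == False  # noqa: E712 (kept only for comparison)
missing_by_not = not result

print(type(result).__name__, missing_by_is, missing_by_eq, missing_by_not)
```

The difference between the spellings only appears once the tested value can be something other than a bool.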

I don't understand this seemingly popular fear of boolean expressions. The ghost of George Boole will haunt those who avoid them.

One problem with using 'is False' is the Pythonic concept of non-bool types being interpreted as boolean values. One has to be very careful when using 'is':

>>> v = 0
>>> v is False
False
>>> v == False
True
>>> not v
True

Even worse is the behavior with None:

>>> v = None
>>> v is False
False
>>> v == False
False
>>> not v
True

Hence a boolean expression is far more natural if one treats other values as booleans. When I specifically mean None, though, I try to do:

>>> v is None
True

which I think is also clearer.
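
A small sketch of where the explicit None test earns its keep: an optional argument for which 0 is a legitimate value. Both function names and the threshold parameter are hypothetical.

```python
def describe_threshold(threshold=None):
    """Hypothetical helper: 0 is a valid threshold, None means 'not set'."""
    if threshold is None:
        return "no threshold"
    return f"threshold={threshold}"


def describe_threshold_truthy(threshold=None):
    """Same idea, but the truthiness test also swallows a legitimate 0."""
    if not threshold:
        return "no threshold"
    return f"threshold={threshold}"


print(describe_threshold(0))
print(describe_threshold_truthy(0))
```

The first function reports the 0 it was given; the second treats 0 as if nothing had been passed.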

ifiddes commented 3 years ago

I would argue that 'if not all(g in args.hal_genomes for g in args.target_genomes)' is more pythonic.

However, for the comparison of a variable that you always know to be a boolean, checking 'is True' is better. This is because your object could implement __eq__ in a way that produces unexpected results. Checking with 'is' guarantees you are comparing to the True/False/None singletons.
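
To make the __eq__ concern concrete, here is a deliberately contrived sketch. The Agreeable class is hypothetical and exists only to show how an overridden __eq__ can fool '==' while 'is' stays strict.

```python
class Agreeable:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


flag = Agreeable()

equals_true = flag == True    # noqa: E712  -> __eq__ says yes
equals_false = flag == False  # noqa: E712  -> __eq__ says yes again
is_true = flag is True        # identity ignores __eq__ entirely

print(equals_true, equals_false, is_true)
```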

However, when interacting with pandas/numpy, using 'is' can be bad because those libraries provide their own versions of True/False/None/NaN.

https://stackoverflow.com/questions/27276610/boolean-identity-true-vs-is-true
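
A short sketch of the NumPy side of this, assuming NumPy is installed: reductions such as any() return NumPy bool scalars, which are truthy or falsy but are not the Python True/False singletons. The array contents are arbitrary.

```python
import numpy as np

values = np.array([1, 2, 3])
any_large = (values > 2).any()  # a NumPy bool scalar, not Python's True

identity_check = any_large is True   # identity fails: different object
equality_check = any_large == True   # noqa: E712  equality still holds
as_python_bool = bool(any_large)     # explicit conversion gives a real bool

# NaN has the same flavor of problem: it never compares equal to itself,
# so np.isnan is the reliable test.
nan_equals_itself = np.nan == np.nan
nan_detected = np.isnan(np.nan)

print(identity_check, bool(equality_check), as_python_bool)
print(nan_equals_itself, bool(nan_detected))
```

Converting with bool() before an identity test, or simply using the value in a boolean expression, sidesteps the mismatch.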

diekhans commented 3 years ago

> I would argue that 'if not all(g in args.hal_genomes for g in args.target_genomes)' is more pythonic.

I like that.

> However, for the comparison of a variable that you always know to be a boolean, checking 'is True' is better. This is because your object could implement __eq__ in a way that produces unexpected results. Checking with 'is' guarantees you are comparing to the True/False/None singletons.

I use booleans as expressions, 'is' for None and enums, and treat numeric values as numbers:

if len(foo) != 0:

but you could argue I am being rather silly, due to the lack of type safety.
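
One way to see what the explicit length test buys, as a rough sketch: a truthiness test quietly accepts None or a number, whereas len() raises on them. Both function names are hypothetical.

```python
def has_items_truthy(foo):
    """Truthiness test: accepts any object, including None and numbers."""
    return bool(foo)


def has_items_len(foo):
    """Explicit length test: raises TypeError for objects without a length."""
    return len(foo) != 0


print(has_items_truthy([]), has_items_truthy(None))
print(has_items_len([]), has_items_len([1]))

try:
    has_items_len(None)
except TypeError as exc:
    print("TypeError:", exc)
```

The length test is no substitute for real type checking, but it does turn an unexpected None into an immediate error instead of a silent "empty".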

> However, when interacting with pandas/numpy, using 'is' can be bad because those libraries provide their own versions of True/False/None/NaN.

uggg