Closed by pnrobinson 10 months ago
@pnrobinson
I will definitely fix the in_feature and the hard coded hg37 in there!
For the inverted functions, I did that because they are all called in the "run_stats" function of compare.py, so it made the most sense to me that they emphasize what the user is trying to do. For example, if I'm running:
run_stats(cohort, is_var_type, is_not_var_type, "missense", "missense", etc...)
it's clearer that you are comparing those that are of type missense to those that are not. This is also why they are in their own file rather than in other classes: so they can be easily accessed by the run_stats function.
Probably the API of this function needs to be rethought. If "is_var_type" and "is_not_var_type" completely divide up the space, then we do not need to pass two functions, and at some point in the code we can write
(not is_var_type(arguments))
or something like that. Having two functions violates the single-source-of-truth principle, and such violations are a common source of hard-to-track-down bugs!
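A minimal sketch of that idea, assuming the caller only needs the cohort divided into a matching and a non-matching group. The helper name `split_by_predicate`, the dict-based patients, and this version of `is_var_type` are hypothetical stand-ins, not the actual code in compare.py:

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_by_predicate(
    items: Iterable[T], predicate: Callable[..., bool], *args
) -> Tuple[List[T], List[T]]:
    """Partition items using ONE predicate; the complement is derived, not passed in."""
    matching: List[T] = []
    non_matching: List[T] = []
    for item in items:
        if predicate(item, *args):
            matching.append(item)
        else:
            # Equivalent to (not predicate(item, *args)); no second function needed.
            non_matching.append(item)
    return matching, non_matching


# Hypothetical stand-in for the real predicate.
def is_var_type(patient: dict, var_type: str) -> bool:
    return var_type in patient["variant_types"]


cohort = [
    {"id": "P1", "variant_types": ["missense"]},
    {"id": "P2", "variant_types": ["nonsense"]},
    {"id": "P3", "variant_types": ["missense", "frameshift"]},
]
with_missense, without_missense = split_by_predicate(cohort, is_var_type, "missense")
```

With this shape the two groups cannot drift apart, because there is only one definition of what "is missense" means.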
@lnrekerle
is_not_var_type -- it is not good to have two methods that return an inverted boolean. The client code should be rewritten to use just the "is_var_type" method. At a minimum, the method should look like this to avoid odd bugs:
is_not_var_type: return not self.is_var_type()
(Similar comments for other pairs of functions in this file)
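If the calling code really does need two callables for now, one option is to derive every inverted function from its positive counterpart with a small helper. This is only a sketch: `negate` is a hypothetical helper, and the `is_var_type` signature shown here is assumed for illustration:

```python
from functools import wraps
from typing import Callable


def negate(predicate: Callable[..., bool]) -> Callable[..., bool]:
    """Return a function that gives the logical inverse of `predicate`."""

    @wraps(predicate)
    def inverted(*args, **kwargs) -> bool:
        return not predicate(*args, **kwargs)

    return inverted


# Hypothetical positive predicate; the real one may take different arguments.
def is_var_type(variant_types: list, var_type: str) -> bool:
    return var_type in variant_types


# The inverse is defined purely in terms of the original, so the two cannot disagree.
is_not_var_type = negate(is_var_type)
```

The same helper would cover the other function pairs in the file, so each condition is written exactly once.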
is_var_match -- would this work better within the Patient class? For example:
def has_variant_of_type(self, variant_type:str):
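A rough sketch of how such a method might look. The `Variant` and `Patient` classes below are simplified, hypothetical stand-ins (the real classes surely carry more fields), and the `variant_types` attribute is an assumption:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variant:
    # Assumed attribute: the effect types annotated for this variant.
    variant_types: List[str] = field(default_factory=list)


@dataclass
class Patient:
    patient_id: str
    variants: List[Variant] = field(default_factory=list)

    def has_variant_of_type(self, variant_type: str) -> bool:
        """True if at least one of this patient's variants has the given type."""
        return any(variant_type in v.variant_types for v in self.variants)


patient = Patient("P1", [Variant(["missense"]), Variant(["frameshift"])])
```

Client code can then ask `patient.has_variant_of_type("missense")` and negate the result itself where needed.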
The following function hard-codes genome build 37 and will almost certainly lead to interesting bugs down the line... Also, the name of the function is "verify", but the function appears to be creating a Variant object.
def in_feature(pat, feature):
--- it would be simpler to not use the "isIn" variable as follows
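As a generic illustration of dropping a flag variable like "isIn", the result can be returned directly instead of being set in a loop and returned at the end. Everything below is assumed for the sake of the example: the `Feature` and `Pat` classes, their attributes, and the position-based check are hypothetical and may not match what in_feature actually does:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    start: int  # assumed: inclusive start position
    end: int    # assumed: inclusive end position


@dataclass
class Pat:
    variant_positions: List[int]  # assumed: one position per variant


def in_feature(pat: Pat, feature: Feature) -> bool:
    """True if any of the patient's variants falls inside the feature.

    No mutable flag: any() stops at the first hit and returns the answer.
    """
    return any(feature.start <= pos <= feature.end for pos in pat.variant_positions)
```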
In general, I wonder if we can refactor to put these functions in other classes?