oxfordmmm / piezo

`predict` method working consistently for mutations which permit minor populations #6

Open philipwfowler opened 1 year ago

philipwfowler commented 1 year ago

If I have a catalogue row like `LEV,gyrA@A90V:2,R` then this works:

$ cat.predict('gyrA@A90V:3')
{'LEV': 'R'}

but if I don't specify the number of reads it errors out:

$ cat.predict('gyrA@A90V')
...
File ~/packages/piezo/piezo/grammar_GARC1.py:361, in row_prediction(rows, predictions, priority, message, minor, verbose)
    357 if minor is None and row['MINOR'] == '':
    358     #Neither are minor populations so act normally
    359     pred = row['PREDICTION']
--> 361 elif row['MINOR'] != '' and minor < 1 and float(row['MINOR']) < 1:
    362     #We have FRS
    363     if minor >= float(row['MINOR']):
    364         #Match
    365         pred = row['PREDICTION']

TypeError: '<' not supported between instances of 'NoneType' and 'int'
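
The failure can be reproduced without piezo. The sketch below copies only the two conditions visible in the traceback into a hypothetical helper, `classify` (not a piezo function), to show that a `None` value for `minor` reaches the `minor < 1` comparison whenever the row has a non-empty `MINOR` field:

```python
def classify(minor, row_minor):
    """Hypothetical stand-in for the two branches shown in the traceback."""
    if minor is None and row_minor == '':
        # Neither side describes a minor population
        return 'normal'
    elif row_minor != '' and minor < 1 and float(row_minor) < 1:
        # Both sides are expressed as FRS
        return 'frs'
    return 'other'


# Row allows a minor population (MINOR == '2') but the call gives no
# read count, so `minor` is None and `minor < 1` raises.
try:
    classify(None, '2')
except TypeError as exc:
    print(exc)
```

Because `row_minor != ''` is true, Python goes on to evaluate `minor < 1`, and comparing `None` with an `int` raises the `TypeError` shown above.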

The code could assume a default FRS. For example, when instantiating a catalogue, unless explicitly specified, it could define the FRS assumed for "majority" calls, which for Clockwork would be 0.9. Any mutation passed to the `predict` method without a qualifier would then be assumed to have an FRS of 0.9. The problem is that one would also need an "average depth" to deal with criteria based on the number of reads. That is harder to set as a default since it changes from sample to sample (and from locus to locus).
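
A minimal sketch of that idea, not piezo's actual logic. The function `row_matches` and its parameters `default_frs` and `depth` are hypothetical names introduced here. It assumes a `MINOR` value below 1 is an FRS threshold and a value of 1 or more is a read count, as the traceback suggests:

```python
from typing import Optional, Union


def row_matches(row_minor: str,
                minor: Optional[Union[int, float]],
                default_frs: float = 0.9,
                depth: Optional[int] = None) -> bool:
    """Decide whether a call satisfies a catalogue row's MINOR criterion.

    `minor` is None for a plain "majority" call such as gyrA@A90V.
    """
    if row_minor == '':
        # Assumption: a row without MINOR only matches a majority call
        return minor is None

    threshold = float(row_minor)

    if threshold < 1:
        # FRS-based row: fall back to the assumed majority FRS
        observed = default_frs if minor is None else minor
        return observed < 1 and observed >= threshold

    # Read-count-based row: a default needs a depth, which varies
    # by sample and locus, so refuse to guess if none is supplied
    if minor is None:
        if depth is None:
            raise ValueError("a read-based row needs a depth for majority calls")
        observed = default_frs * depth
    else:
        observed = minor
    return observed >= 1 and observed >= threshold
```

With this, `row_matches('0.1', None)` falls back to an FRS of 0.9 and matches, while `row_matches('2', None)` raises unless a `depth` is given. That makes the read-count difficulty explicit without hiding it behind a guessed default.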

Feels like one should be able to use a simple A90V notation to mean "majority call" without having to add code that says "oh, at this codon the catalogue allows a minority population so I have to add the number of reads / FRS to the predict call".