Closed bwalsh closed 6 years ago
The simplest approach is to bypass any normalization attempts if _parse_profile does not return any mutations. The key issue is that MM's API appears to return all mutations in a gene when given something like "MET amp", which is confusing and problematic in our case.
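A minimal sketch of that bypass, for illustration only. `parse_profile`, `build_feature`, and the `normalize` callback are hypothetical stand-ins; the real `_parse_profile` signature and return shape are not shown in this thread, and the regex for what counts as a mutation is an assumption.

```python
import re

# Assumption: a "mutation" is a simple protein change such as V600E.
# Words like "amp" or "positive" are not treated as mutations.
PROTEIN_CHANGE = re.compile(r"^[A-Z]\d+[A-Z*]$")


def parse_profile(profile):
    """Hypothetical stand-in for _parse_profile: return (gene, mutations)."""
    tokens = profile.split()
    gene = tokens[0] if tokens else ""
    mutations = [t for t in tokens[1:] if PROTEIN_CHANGE.match(t)]
    return gene, mutations


def build_feature(profile, normalize):
    """Build a feature and normalize it only when mutations were parsed."""
    gene, mutations = parse_profile(profile)
    feature = {"geneSymbol": gene, "name": profile}
    if not mutations:
        # Bypass: nothing to look up, so keep the feature as-is.
        return feature
    return normalize(feature, mutations)
```

With this shape, a profile like "MET amp" still yields a feature but never reaches the lookup, so it cannot pick up every mutation in the gene.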
@jgoecks
Quick clarification: MM correctly returns no mutations.
I've validated that the MM variant endpoint does not return a false hit.
I've made the following change:
+++ b/harvester/cosmic_lookup_table.py
@@ -39,6 +39,9 @@ class CosmicLookup(object):
# return null
logging.warning('get_entries gene: %s, hgvs_p: %s', gene, hgvs_p)
return []
+ # ensure caller passed a hgvs_p
+ if not hgvs_p or len(hgvs_p) == 0:
+ return []
# Get lookup table.
if gene in self.gene_df_cache:
# Found gene-filtered lookup table in cache.
Ah, so the issue was that the COSMIC lookup table was returning all mutations, not MM. Your change looks reasonable. Thanks!
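For anyone reading later, here is a small self-contained sketch of why the guard matters. `LookupSketch` and its row shape are hypothetical and are not the real `CosmicLookup` class; the sketch only assumes that the lookup filters by gene first and then by `hgvs_p`.

```python
import logging


class LookupSketch(object):
    """Hypothetical, simplified stand-in for a gene/hgvs_p lookup table."""

    def __init__(self, rows):
        # Assumed row shape: {'gene': ..., 'hgvs_p': ...}
        self.rows = rows

    def get_entries(self, gene, hgvs_p):
        # Guard: without a protein change there is nothing specific to
        # match, so return no entries instead of every row for the gene.
        if not hgvs_p:
            logging.warning('get_entries gene: %s, hgvs_p: %s', gene, hgvs_p)
            return []
        return [r for r in self.rows
                if r['gene'] == gene and r['hgvs_p'] == hgvs_p]
```

Note that `not hgvs_p` is already true for both `None` and the empty string, so the extra `len(hgvs_p) == 0` check in the diff is harmless but redundant.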
@bwalsh When I run this test I see:
profile >MET alterations< gene_index[i] >MET< mut_index[i] >< matches 0
profile >MET positive< gene_index[i] >MET< mut_index[i] >< matches 0
profile >MET amp< gene_index[i] >MET< mut_index[i] >< matches 0
And then the test fails with:
E AssertionError: assert 3 == 0
E + where 3 = len([{'biomarker_type': 'mutant', 'geneSymbol': 'MET', 'name': 'MET '}, {'biomarker_type': 'polymorphism', 'geneSymbol': 'MET', 'name': 'MET positive'}, {'biomarker_type': 'polymorphism', 'geneSymbol': 'MET', 'name': 'MET amp'}])
Based on the way this test is written, I think it really should be 3. Why did we have 0 before?
Thanks, will check tomorrow.
Following up on our conversation about the JAX trials.
Observations:
I've checked in a test to illustrate the problem: harvester/tests/integration/test_jax_trials_features.py
I've validated that the MM variant endpoint does not return a false hit.
There is a chain of potential challenges and fixes:
To run:
$ pytest -s tests/integration/test_jax_trials_features.py::test_profiles