ohsu-comp-bio / g2p-aggregator

Associations of genomic features, drugs and diseases

brca exchange "Pathogenicity_expert" filter #79

Closed bwalsh closed 6 years ago

bwalsh commented 6 years ago

The possible values:

"Benign / Little Clinical Significance", "Likely benign", "Not Yet Reviewed", "Pathogenic", "Uncertain"

We currently filter out "Not Yet Reviewed".
We discussed removing that filter. Can you confirm?

jgoecks commented 6 years ago

@bwalsh That is my opinion. Happy to have @ahwagner @malachig and others chime in as well.

bwalsh commented 6 years ago

@jgoecks

Made the change. brca now has 17,584 items (was 5,715). One note: unreviewed items do not have a phenotype.

+++ b/harvester/brca.py
@@ -22,10 +22,10 @@ def harvest(genes=None):
         else:
             page_num = page_num + 1
             for record in payload['data']:
-                if not record['Pathogenicity_expert'] == 'Not Yet Reviewed':
-                    gene = record['Gene_Symbol']
-                    gene_data = {'gene': gene, 'brca': record}
-                    yield gene_data
+                # if not record['Pathogenicity_expert'] == 'Not Yet Reviewed':
+                gene = record['Gene_Symbol']
+                gene_data = {'gene': gene, 'brca': record}
+                yield gene_data

grmayfie commented 6 years ago

@bwalsh I get slightly different results when I run this:
source:brca: 17,546
source:brca AND exists:association.phenotype.description: 5,791

g2p-test, by contrast, shows a slightly higher overall count from BRCA and a slightly lower count with a phenotype association. Do you think this is just related to when the harvest was run?

Code looks good; I'm just confused about the variation in the numbers.
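
One way to narrow down where two harvest runs diverge is to tally records per "Pathogenicity_expert" value and compare the tallies. This is only a minimal sketch: `tally_pathogenicity` is a hypothetical helper that is not part of the harvester, and the sample records are made up rather than taken from BRCA Exchange.

```python
from collections import Counter


def tally_pathogenicity(records):
    """Count records per 'Pathogenicity_expert' value (hypothetical helper)."""
    return Counter(r.get('Pathogenicity_expert', 'MISSING') for r in records)


# Made-up records standing in for two separate harvest runs.
run_a = [
    {'Gene_Symbol': 'BRCA1', 'Pathogenicity_expert': 'Pathogenic'},
    {'Gene_Symbol': 'BRCA2', 'Pathogenicity_expert': 'Not Yet Reviewed'},
    {'Gene_Symbol': 'BRCA2', 'Pathogenicity_expert': 'Not Yet Reviewed'},
]
run_b = [
    {'Gene_Symbol': 'BRCA1', 'Pathogenicity_expert': 'Pathogenic'},
    {'Gene_Symbol': 'BRCA2', 'Pathogenicity_expert': 'Not Yet Reviewed'},
]

counts_a = tally_pathogenicity(run_a)
counts_b = tally_pathogenicity(run_b)

# Keep only the values whose counts differ between the two runs.
changed = {
    value: (counts_a[value], counts_b[value])
    for value in set(counts_a) | set(counts_b)
    if counts_a[value] != counts_b[value]
}
print(changed)
```

If only one value's count changes between runs, the upstream data probably shifted between harvests rather than the filter behaving differently.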

bwalsh commented 6 years ago

I'll check g2p-test tomorrow (there were snafus uploading to it).

ahwagner commented 6 years ago

Per the group discussion today, we're reversing course on this and should exclude "Not Yet Reviewed" variants.
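
For illustration, the exclusion could be kept as an explicit, configurable set instead of an inline comparison, so a future change of course becomes a one-line edit. This is a sketch under assumptions, not the code that was deployed: `EXCLUDED_PATHOGENICITY` and `filter_records` are hypothetical names, and only the record keys `Pathogenicity_expert` and `Gene_Symbol` come from the harvester.

```python
# Hypothetical constant: expert-review values to drop during harvest.
EXCLUDED_PATHOGENICITY = frozenset({'Not Yet Reviewed'})


def filter_records(records, excluded=EXCLUDED_PATHOGENICITY):
    """Yield {'gene', 'brca'} dicts, skipping records whose
    'Pathogenicity_expert' value is in `excluded` (hypothetical helper)."""
    for record in records:
        if record.get('Pathogenicity_expert') in excluded:
            continue
        yield {'gene': record['Gene_Symbol'], 'brca': record}


# Made-up sample records, for illustration only.
sample = [
    {'Gene_Symbol': 'BRCA1', 'Pathogenicity_expert': 'Pathogenic'},
    {'Gene_Symbol': 'BRCA2', 'Pathogenicity_expert': 'Not Yet Reviewed'},
    {'Gene_Symbol': 'BRCA2', 'Pathogenicity_expert': 'Uncertain'},
]

kept = list(filter_records(sample))
print([item['gene'] for item in kept])
```

Passing `excluded=frozenset()` would reproduce the unfiltered behaviour tried earlier in this thread.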

bwalsh commented 6 years ago

@ahwagner @mayfielg @jgoecks For your review: this is addressed and deployed at https://g2p-test.ddns.net