griffithlab / civic-server

Backend Server for CIViC Project
MIT License

Double quotations in nightly evidence item file #677

Closed jakelever closed 3 years ago

jakelever commented 3 years ago

Hey, in the nightly evidence item file, one of the rows (line 1786) has some weird issues when loaded in Excel. The line appears to load across two rows, but the evidence_statement column actually contains text from several rows. I found it by filtering for blank values in source_type. It seems like the unmatched quotation mark at the beginning of the evidence_statement causes the problem.

I've included the line below.

PIK3CA  5290    E545K   Her2-receptor Positive Breast Cancer    0060079     Pictilisib      Predictive  Supports    D   Sensitivity/Response    "This preclinical study tested the efficacy on several breast cancer cell lines of GDC-0941 (pictilisib), a selective inhibitor of all four isoforms of class I PI3-kinases at a range of IC50 values. Sensitivity and resistance was categorized as less than and more than an IC50 of 1umol/L, respectively, with a maximum tested concentration of 20 umol/L. MDA-MB-361 is a HER2 amplified breast cancer cell line harboring PIK3CA E545K, and it had an IC50 of 0.14uM. To asses the effect of pictilisib on cell cycle progression and apoptosis, fluorescence based cell cycle and flow cytometry were used to measure the proportion of cells in various phases and the relative presence of Annexin V, a marker of apoptosis. Untreated, 68% of cells were in G1 or sub-G1. Following treatment with 1.0uM pictilisib, the proportion of cells in G1 or sub-G1 rose to 78%, and the amount of Annexin V detected rose 3 fold. A substantial reduction in phosphorylated AKT (S473) was also noted. Authors concluded that pictilisib had a modest effect on cell cycle progression and encouraged apoptosis in this cell line, and that this cell line is sensitive to pictilisib. Authors further noted that HER2 amplified cell lines were generally more sensitive, regardless of mutation, to pictilisib than basal-like breast cancer cell lines (p=0.002).  20453058    PubMed      O'Brien et al., 2010, Clin. Cancer Res.     4   accepted    2062    104 37  3   178936091   178936091   G   A   ENST00000263967.3                   75  GRCh37  PIK3CA E545K/E542K are the second most recurrent PIK3CA mutations in breast cancer, and are highly recurrent mutations in many other cancer types. E545K, and possibly the other mutations in the E545 region, may present patients with a poorer prognosis than patients with either patients with other PIK3CA variant or wild-type PIK3CA. 
There is also data to suggest that E545/542 mutations may confer resistance to EGFR inhibitors like cetuximab. While very prevalent, targeted therapies for variants in PIK3CA are still in early clinical trial phases.  Somatic 2021-02-12 18:32:43 UTC https://civic.genome.wustl.edu/links/evidence_items/2062    https://civic.genome.wustl.edu/links/variants/104   https://civic.genome.wustl.edu/links/genes/37
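
The behaviour described above can be reproduced outside Excel. The following is a minimal Python sketch, assuming a made-up three-column miniature of the file (the `sample` text and the `read_rows` helper are hypothetical, not part of CIViC). It compares a parser that treats `"` as a quote character with one that treats it as ordinary text.

```python
import csv
import io

# Hypothetical miniature of the nightly TSV: the second line has a field that
# opens with a double quote which is never closed, as in the reported row.
sample = (
    "gene\tevidence_statement\tsource_type\n"
    "PIK3CA\t\"This preclinical study tested GDC-0941.\tPubMed\n"
    "BRAF\tSecond statement.\tPubMed\n"
)

def read_rows(text, quoting):
    """Parse tab-separated text with the given csv quoting mode."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t", quoting=quoting)
    return list(reader)

# Quote-aware parsing: the unmatched quote opens a quoted field that keeps
# absorbing tabs and line breaks, so later rows end up inside one cell.
default_rows = read_rows(sample, csv.QUOTE_MINIMAL)

# Literal parsing: the quote is kept as an ordinary character and every
# physical line stays a separate row.
literal_rows = read_rows(sample, csv.QUOTE_NONE)

print(len(default_rows), len(literal_rows))
```

Reading the file with quoting disabled is only a workaround on the consumer side. The row quoted above also appears to contain a line break inside a later column, which this option alone would not repair.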

kkrysiak commented 3 years ago

I've submitted this revision to address this in future data dumps. However, this may be something we should sanitize or address in the creation of our data dumps.
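
One way the sanitizing step mentioned here could look is sketched below in Python. This is an illustration under assumptions, not the project's actual export code: `sanitize_field` and `write_tsv` are hypothetical names, and replacing embedded tabs and line breaks with a space is just one possible policy.

```python
import csv
import io
import re

def sanitize_field(value):
    """Replace embedded tabs and line breaks with a single space (hypothetical helper)."""
    return re.sub(r"[\t\r\n]+", " ", str(value)).strip()

def write_tsv(rows):
    """Write rows as TSV, letting the csv module quote and escape where needed."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter="\t",
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it; inner quotes are doubled
        lineterminator="\n",
    )
    for row in rows:
        writer.writerow([sanitize_field(v) for v in row])
    return buffer.getvalue()

# Made-up example row with a stray leading quote and an embedded line break.
rows = [
    ["gene", "evidence_statement"],
    ["PIK3CA", '"This preclinical\nstudy tested GDC-0941.'],
]
dump = write_tsv(rows)

# A quote-aware reader should now recover one record per row, with the
# leading quote preserved as data.
round_trip = list(csv.reader(io.StringIO(dump, newline=""), delimiter="\t"))
print(round_trip[1][1])
```

The design choice is to escape rather than delete: the stray quote stays in the text, but it is written in a form that quote-aware tools such as spreadsheet importers can read back as a single cell.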

kkrysiak commented 3 years ago

Also, closing this and opening up an issue on the client side where we have merged all of our active work.

kkrysiak commented 3 years ago

Since this has been migrated (https://github.com/griffithlab/civic-client/issues/1597), I'm closing this issue.