broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

Add additional bulk upload options for PhenoTips fields #673

Closed AliciaByrne closed 3 years ago

AliciaByrne commented 5 years ago

Would be helpful to have bulk upload options for PhenoTips fields in addition to 'HPO Terms present' and 'HPO Terms absent'. Key fields of interest (for CMG_Scott projects) are: 'Candidate Genes', 'Previously Tested Genes', and 'Diagnosis' (as below). Would also be helpful to bulk upload 'maternal/paternal ethnicity' and 'consanguinity' (below) to cross-check with inferred results from sequencing.

Scott project data is exported from REDCap to an Excel file. Some fields (e.g. HPO terms) are currently split over multiple cells; we can condense these into a single comma-separated cell. There is also a free-text phenotype entry field with additional info that does not necessarily match an HPO term. We are unsure where this would best fit in seqr.
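A minimal sketch of the condensing step described above, assuming the split cells for one individual have already been read into a Python list (for example, one spreadsheet row restricted to the HPO columns). The function name `condense_cells` and the sample HPO IDs are hypothetical, chosen only for illustration; this is not part of seqr or REDCap.

```python
def condense_cells(cells, sep=","):
    """Join the non-empty values from several cells into one delimited string.

    `cells` is any iterable of cell values; None and blank cells are skipped.
    `sep` is the delimiter written between values (an assumption: adjust it
    to whatever separator the upload format expects).
    """
    parts = []
    for cell in cells:
        if cell is None:
            continue
        text = str(cell).strip()
        if text:
            parts.append(text)
    return sep.join(parts)


# Hypothetical row: HPO terms spread over four cells, two of them empty.
row = ["HP:0001250", "", None, " HP:0000252 "]
condensed = condense_cells(row)
```

Passing a different `sep` lets the same helper produce whichever delimiter is eventually agreed on for the upload file.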

[three attached images]

hanars commented 5 years ago

Hi Alicia,

When looking over the sample file you provided, I encountered the following concerns

Let me know what you think

AliciaByrne commented 5 years ago

Hi Hana

Thanks for your suggestions! I've responded to them below in the same order as you've raised them.

  1. Happy for all fields that need to be condensed into a single column to use ";" as the new separator.
  2. Unfortunately we're not able to integrate OMIM disorders into our internal database, so we've been using ORDO nomenclature which, you're right, does not map perfectly to OMIM. Let's put all the diagnoses to be added into the "additional comments" section instead of the "omim disorder" list.
  3. I think it would be good to keep the information about previous testing under "previously tested genes", so let's add the info to the comments section with the gene part left blank.
  4. Happy to format candidate gene lists with ";" separating the genes and "-" separating the gene name from the comment. Would it be helpful to do this for 'previously tested genes' too in cases where there is a gene name listed?
  5. Let's put the free text phenotypes under "Indication for referral" in the "Patient information" section. That way it's at the top and easily accessible when you open the PhenoTips doc. Most information here should be translated to HPO terms under major or minor phenotype, but there are some that don't map exactly or aren't covered.

Thanks! :)

hanars commented 5 years ago

Re: your question on 4, "Would it be helpful to do this for 'previously tested genes' too in cases where there is a gene name listed?": I would say yes, that's a good idea. If there isn't a gene for that field, you could just start the line with a "-" (i.e. "- Panel" instead of just "Panel").
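As an illustration of the format agreed on in this thread (";" between entries, "-" between the gene name and its comment, and a leading "-" when there is no gene), here is a rough parsing sketch. The function name `parse_gene_entries` and the sample cell are hypothetical and this is not seqr's actual upload code. One assumption to note: the split happens on the first "-" in each entry, so a gene symbol that itself contains a hyphen would need different handling.

```python
def parse_gene_entries(cell):
    """Parse a cell such as 'GENE1 - comment; GENE2; - Panel'.

    Returns a list of dicts with 'gene' and 'comment' keys. Either value is
    None when that part of the entry is missing.
    """
    entries = []
    for raw in cell.split(";"):
        raw = raw.strip()
        if not raw:
            continue  # skip empty entries, e.g. from a trailing ";"
        # Split on the FIRST "-" only (assumption: gene symbols have no hyphen).
        gene, _, comment = raw.partition("-")
        entries.append({
            "gene": gene.strip() or None,
            "comment": comment.strip() or None,
        })
    return entries


# Hypothetical cell: a gene with a comment, a bare gene, and a comment-only entry.
parsed = parse_gene_entries("BRCA1 - VUS found; TTN; - Panel")
```

With this input, the third entry has no gene and keeps "Panel" as its comment, which matches the "- Panel" convention suggested above.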

If you could create a new test file based on the changes discussed here and send it to me, that would be really helpful. Slacking it to me the way you did the first one works fine.

AliciaByrne commented 5 years ago

Sorry for the delay getting to this. Have just slacked you the new test file.

Also, we've decided to take out the free text phenotype information, as clinicians will often just copy and paste the autopsy report, which sometimes contains identifying information. The majority of this information is captured in the HPO terms and diagnosis sections, so we will not lose too much information by excluding it.

hanars commented 4 years ago

Hi, I'm sorry for the massive delay on this. As you may have realized, seqr no longer integrates directly with PhenoTips. Are these still data fields you would like to bulk upload directly to seqr?

hanars commented 3 years ago

Based on the new fields available in seqr now that we don't use phenotips, I think we should add the following fields as options for bulk upload. They should be added as optional to the EditHPOBulkForm, and we should rename that form to something more descriptive (maybe "Edit Phenotypes")

hanars commented 3 years ago

Additional fields requested in https://github.com/broadinstitute/seqr/issues/1759 (this new field list includes everything previously required by this ticket)

hanars commented 3 years ago

Note: Case Review Status is a protected field, so bulk updating it will happen in the main "update individual" form, not the "update individual metadata" form.

hanars commented 3 years ago

The "update individuals" bulk upload now supports all of these fields, including as JSON exported from a separate PhenoTips instance.