clingen-data-model / clingen-interpretation

Allele (variant) interpretation model and API for ClinGen

Change ReferenceSequence to @iri type #203

Closed bpow closed 6 years ago

bpow commented 6 years ago
larrybabb commented 6 years ago

accession is being changed from string to @id.

bpow commented 6 years ago

The accession property will actually need to be removed (since ReferenceSequence will no longer be a type). Wherever accessions have been given in the sheets, they either need to be moved to label (and, if so, put in the label sheet), or the ids of existing ReferenceSequences will have to be converted to include the accession (with a prefix so it is namespaced).

larrybabb commented 6 years ago

Changed all RefSeq### ids to REFSEQ: @ids and moved the value to the labels sheet. Moved all RefSeq### DomainEntity records to removedDomainEntities sheet and kept old ids intact in case we need to revert later.
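
For anyone following along, the id conversion described here could be sketched roughly as below. This is only an illustration in Python, not the project's Ruby scripts: the record layout (`id` and `accession` keys), the function name `convert_refseq_records`, and the `RefSeq001` sample row are all assumptions. The RefSeq290 accession is the one mentioned later in this thread.

```python
# Hypothetical sketch: turn old RefSeq### ids into namespaced REFSEQ: ids
# and collect the accession as the label for each new id.

def convert_refseq_records(records, prefix="REFSEQ:"):
    """Return (id_map, labels).

    id_map maps each old id to its new prefixed id.
    labels maps each new id to the accession used as its label.
    """
    id_map = {}
    labels = {}
    for rec in records:
        new_id = prefix + rec["accession"]
        id_map[rec["id"]] = new_id
        # Keep the first label seen for a given new id.
        labels.setdefault(new_id, rec["accession"])
    return id_map, labels


# Made-up input rows for illustration only.
sample_records = [
    {"id": "RefSeq001", "accession": "NM_000000.0"},  # hypothetical row
    {"id": "RefSeq290", "accession": "NM_000169.2"},
]

id_map, labels = convert_refseq_records(sample_records)
print(id_map["RefSeq290"])
```

Keeping the old-to-new `id_map` around is what makes a later revert possible, which matches the idea of retaining the old ids in the removedDomainEntities sheet.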

bpow commented 6 years ago

Changed the types of InSilicoPrediction.transcript and InSilicoPredictionScore.transcript to @id (since ReferenceSequence no longer exists).

bpow commented 6 years ago

The labels sheet had REFSEQ:NM_00433.4 with two labels: the one you would expect and also NM_000169.2. I'm assuming that there should be a separate REFSEQ:NM_000169.2, so that's what I put in lieu of the second one.

There might be an error in how RefSeq289 and RefSeq290 were converted to have caused this.
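
A check along the following lines might catch this kind of duplicate in the future. It is a minimal Python sketch, assuming the labels sheet can be read as `(id, label)` pairs; the function name `find_multi_label_ids` and the row format are hypothetical and not part of the existing scripts.

```python
from collections import defaultdict

# Hypothetical sketch: report ids in the labels sheet that carry more than
# one distinct label, which is how the problem above showed up.

def find_multi_label_ids(label_rows):
    """Return {id: sorted list of labels} for ids with 2+ distinct labels."""
    labels_by_id = defaultdict(set)
    for entity_id, label in label_rows:
        labels_by_id[entity_id].add(label)
    return {
        entity_id: sorted(labels)
        for entity_id, labels in labels_by_id.items()
        if len(labels) > 1
    }


# Rows mirroring the situation described in this comment.
rows = [
    ("REFSEQ:NM_00433.4", "NM_00433.4"),
    ("REFSEQ:NM_00433.4", "NM_000169.2"),
]

conflicts = find_multi_label_ids(rows)
print(conflicts)
```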

bpow commented 6 years ago

Looking through the old JSON, it appears that RefSeq290 was supposed to have been NM_000169.2.

I also went ahead and changed the referenceSequence attribute of CG-EX:Loc434 to REFSEQ:NM_000169.2 to take this into account.

larrybabb commented 6 years ago

Thanks for catching all my misses. I made some minor changes (mostly commenting) to the Ruby scripts to start getting a sense of ownership and comfort with the code. I just pulled, ran, spot-checked and checked in the changes. I'm going to close this now, with the idea that you will re-open it if there are any more REFSEQ conversion related issues.