phenopackets / phenopacket-format


Create journal-example2-l1.yaml #28

Closed jmcmurry closed 8 years ago

jmcmurry commented 8 years ago

Initial commit based on Harry's journal example. Only just started; lots of vestigial bits there still.

jmcmurry commented 8 years ago

Do not merge yet.

jmcmurry commented 8 years ago

I would be grateful if @cmungall and @pnrobinson could have a look at this draft. Cheers

pnrobinson commented 8 years ago

Hi Julie, looks pretty good. I see the following issues.

  1. It seems that you are adding an evidence code for each phenotype annotation. This would offer the most flexibility, but also make the format more verbose. @cmungall, can we also have the evidence code apply to the entire publication and allow users to add additional specifications for individual assertions?
  2. It is also unclear who the variant is associated with in this file: patient one, patient two, or both.
  3. We should not use a Monarch URI for patient IDs. For now, I think we should just use the string that is used in the original publication.
  4. The variant needs to be expressed using ISCN codes. I am going to ask a colleague who does this all the time if she can advise us on formats for ISCN and microarray stuff.

     variants:
       - id: _:v1
         positions:
           - type: ISCN
             value: "46,XX,del(2)(q31.1;q33.1)"

cmungall commented 8 years ago

On 17 Feb 2016, at 22:30, Peter Robinson wrote:

Hi Julie, looks pretty good. I see the following issues.

  1. It seems that you are adding an evidence code for each phenotype annotation. This would offer the most flexibility, but also make the format more verbose. @cmungall, can we also have the evidence code apply to the entire publication and allow users to add additional specifications for individual assertions?

Ultimately this shifts the complexity onto the software that produces and consumes the format: it has to be aware of all the different ways of writing the same thing.

And it's not much of a win if no one is hand-authoring these (we are just guinea pigs).
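
To make the trade-off concrete, here is a minimal Python sketch of the extra logic a consumer would need if a packet-level evidence code could be overridden on individual assertions. The field names (`evidence`, `phenotype_profile`) and the `EX:` codes are assumptions made up for illustration; they are not taken from the draft format.

```python
from typing import Optional


def effective_evidence(packet: dict, assertion: dict) -> Optional[str]:
    """Return the assertion's own evidence code, else the packet-level default."""
    return assertion.get("evidence", packet.get("evidence"))


# Hypothetical packet: one default evidence code plus one per-assertion override.
packet = {
    "evidence": "EX:publication-default",
    "phenotype_profile": [
        {"phenotype": "EX:phenotype-a"},
        {"phenotype": "EX:phenotype-b", "evidence": "EX:assertion-specific"},
    ],
}

resolved = [effective_evidence(packet, a) for a in packet["phenotype_profile"]]
print(resolved)
```

Every tool that reads such a packet would have to implement the same fallback rule, which is the cost being pointed out here.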

  2. It is also unclear who the variant is associated with in this file: patient one, patient two, or both.
  3. We should not use a Monarch URI for patient IDs. For now, I think we should just use the string that is used in the original publication.

See #22

First, we need a syntactic way of indicating that this identifier is scoped to the packet. Otherwise, when two packets are combined, patients (or other entities) could be accidentally merged.

Second, we need to consider the use case of referencing across files, either between two phenopackets or from a phenopacket to PED, VCF, etc. What are our requirements here?
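
The accidental-merge problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: packet-local identifiers are taken to start with `_:` (as in `_:v1`), and the packet names and the `#` separator used to qualify them are hypothetical choices, not a settled convention.

```python
from typing import List


def scope_ids(packet_id: str, entity_ids: List[str]) -> List[str]:
    """Qualify packet-local IDs (assumed to start with '_:') with the packet ID."""
    scoped = []
    for entity_id in entity_ids:
        if entity_id.startswith("_:"):
            scoped.append(f"{packet_id}#{entity_id[2:]}")
        else:
            scoped.append(entity_id)  # assumed to be global already; leave untouched
    return scoped


# Two hypothetical packets that both use the local ID "_:patient1".
raw_a = ["_:patient1", "_:v1"]
raw_b = ["_:patient1"]

naive_merge = set(raw_a) | set(raw_b)  # the two patients collapse into one entry
scoped_merge = set(scope_ids("packet-a", raw_a)) | set(scope_ids("packet-b", raw_b))

print(len(naive_merge), len(scoped_merge))
```

Without scoping, the two distinct patients share one identifier after the merge; with scoping they stay separate.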

  4. The variant needs to be expressed using ISCN codes. I am going to ask a colleague who does this all the time if she can advise us on formats for ISCN and microarray stuff.

     variants:
       - id: _:v1
         positions:
           - type: ISCN
             value: "46,XX,del(2)(q31.1;q33.1)"

Are we restricting this to humans? If not, we need to ensure the scheme is unambiguous.

See also #23


Reply to this email directly or view it on GitHub: https://github.com/monarch-initiative/phenopacket-format/pull/28#issuecomment-185561540

jmcmurry commented 8 years ago

@pnrobinson Re: "Monarch IDs", do you mean we should not use something like PMC4498842#patient1 but instead use just "patient 1"?

For his example, @harryhoch had used "http://monarchinitiative.org/patient1", and that was consequently in iteration 1 of my file, but it has long since been overwritten (I agree this pattern invites all kinds of issues).

pnrobinson commented 8 years ago

I think something like PMC4498842#patient1 is fine (although I would standardise on PMIDs if possible). I think the version I was looking at still had the "http://monarchinitiative.org/patient1" in it!

cmungall commented 8 years ago

On 17 Feb 2016, at 23:14, Peter Robinson wrote:

I think something like PMC4498842#patient1 is fine (although I would standardise on PMIDs if possible). I think the version I was looking at still had the "http://monarchinitiative.org/patient1" in it!

Either way it should be a CURIE, so PMID:123456/patient1 or somesuch may be fine, but these would never be directly resolvable.

jmcmurry commented 8 years ago

Either way it should be a CURIE, so PMID:123456/patient1 [...] these would never be directly resolvable

Agreed on CURIE. Are you cool with a hash delim though? (precisely to sidestep resolution issues)

cmungall commented 8 years ago

I like the idea, good

On 17 Feb 2016, at 23:37, Julie McMurry wrote:

Either way it should be a CURIE, so PMID:123456/patient1 [...] these would never be directly resolvable

Agreed on CURIE. Are you cool with a hash delim though? (precisely to sidestep resolution issues)
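
As a rough illustration of the hash-delimited form discussed above, the following Python sketch splits an identifier such as `PMID:123456#patient1` into its parts. The function and type names are hypothetical, and the `PREFIX:REFERENCE#LOCAL` shape is an assumption drawn from this thread rather than an agreed specification.

```python
from typing import NamedTuple


class ScopedId(NamedTuple):
    prefix: str     # e.g. "PMID"
    reference: str  # e.g. "123456"
    local: str      # e.g. "patient1"


def parse_scoped_curie(value: str) -> ScopedId:
    """Split 'PREFIX:REFERENCE#LOCAL' into its three parts, or raise ValueError."""
    curie, hash_sep, local = value.partition("#")
    prefix, colon, reference = curie.partition(":")
    if not (hash_sep and local and colon and prefix and reference):
        raise ValueError(f"expected PREFIX:REFERENCE#LOCAL, got {value!r}")
    return ScopedId(prefix, reference, local)


parsed = parse_scoped_curie("PMID:123456#patient1")
print(parsed)
```

Because the local part sits after the `#`, the portion before it remains an ordinary CURIE for the publication, and the patient label never has to resolve on its own.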

