clingen-data-model / allele

Documentation for data model of ClinGen

SimpleAlleleNameType value list #119

Closed larrybabb closed 9 years ago

larrybabb commented 9 years ago

Did we decide/discuss the values to provide here?

A set of values was proposed in the XSD for SimpleAllele; see the SimpleAllele XSD.

srynobio commented 9 years ago

I do see the list you're talking about.

    <xs:simpleType name="SimpleAlleleNameType-list">
        <xs:restriction base="xs:string">
            <xs:enumeration value="hgvs-genomic" />
            <xs:enumeration value="hgvs-mito" />
            <xs:enumeration value="hgvs-cdna" />
            <xs:enumeration value="hgvs-protein-1" />
            <xs:enumeration value="hgvs-protein-3" />
            <xs:enumeration value="hgvs-rna" />
            <xs:enumeration value="hgvs-ncrna" />
            <xs:enumeration value="ivs" />
            <xs:enumeration value="custom" />
        </xs:restriction>
    </xs:simpleType>

After reviewing the docs and this list, I think I understand what you're getting at.

I was taking the general approach we have taken in the past, where we're not building the idea around a single allele naming source, but putting the pieces in place to allow one to choose what the source will be.

Your method would look like this, correct?

| place holder | value |
| --- | --- |
| SimpleAllele.alleleName | NM_004006.1.g.241T>C |
| SimpleAllele.alleleName.nameType | value="hgvs-genomic" |
| SimpleAllele.alleleName.legacy | value=0 (false) |
| SimpleAllele.alleleName.preferred | value=1 (true) |

And I guess I was using this type to fold all of those elements into one:

| place holder | value |
| --- | --- |
| SimpleAllele.alleleName | whatever allele naming system |
| SimpleAllele.alleleName.nameType | primary or ancillary |

And upon further thought, we could unify the types to:

| place holder | value |
| --- | --- |
| SimpleAllele.alleleName | whatever allele naming system |
| SimpleAllele.alleleName.nameType | preferred or legacy |

Which would look like:

| place holder | value |
| --- | --- |
| SimpleAllele.alleleName | NM_004006.1.g.241T>C |
| SimpleAllele.alleleName.nameType | value="preferred" |

Thoughts?

larrybabb commented 9 years ago

I do think there is value in providing an enumeration for as many of the "common" ways an allele is named as we can. You can have both a single-letter and a 3-letter amino acid change on the same amino acid allele. You may also have a custom name, or more than one custom name. And there will most likely be other name types, like "HLA" or "Star Allele". By providing a name-type set of values, we allow flexibility and give adopters options for seeing how a particular preference can be supported. Plus, it helps educate those who may not be familiar with all of the options out there, in the hope that it sheds light on the value of providing an allele registry that canonicalizes all these variations.

The legacy and preferred flags are a bit different from name type. Legacy is simply a flag that was suggested by NCBI after they received a number of names that were important from a literature standpoint, even though the names are no longer used in common practice. They felt it was important to call these out explicitly. Simply marking them "not preferred" or "ancillary" does not convey all of the desired characteristics.

The preferred attribute is more of a flag to allow explicit control over the preferred name determined by the registry authors. Conceptually, a legacy name could be marked preferred. While in practice this may not happen, we would not want to prevent this, since these are two different characteristics.

That's my take on it. Let's review with the team and draw a line in the sand. I suppose we should add our final justifications to our documentation so that others coming in new will be able to understand the reasoning and meaning behind these attributes and values in our value set.
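
To make the distinction above concrete, here is a minimal sketch in Python of name type as an enumeration with legacy and preferred as two independent flags. The class and field names (`NameType`, `AlleleName`, `name_type`) are hypothetical and are not part of the ClinGen model or the XSD; only the enumeration values and the example name are taken from this thread, and the two custom names are invented placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class NameType(Enum):
    """Values copied from the SimpleAlleleNameType-list enumeration quoted in this thread."""
    HGVS_GENOMIC = "hgvs-genomic"
    HGVS_MITO = "hgvs-mito"
    HGVS_CDNA = "hgvs-cdna"
    HGVS_PROTEIN_1 = "hgvs-protein-1"
    HGVS_PROTEIN_3 = "hgvs-protein-3"
    HGVS_RNA = "hgvs-rna"
    HGVS_NCRNA = "hgvs-ncrna"
    IVS = "ivs"
    CUSTOM = "custom"


@dataclass(frozen=True)
class AlleleName:
    """Hypothetical stand-in for SimpleAllele.alleleName."""
    name: str
    name_type: NameType
    legacy: bool = False      # important in the literature, no longer in common use
    preferred: bool = False   # explicitly chosen by the registry authors


# The flags are independent of each other and of the name type,
# so a legacy name can also be marked preferred.
names = [
    AlleleName("NM_004006.1.g.241T>C", NameType.HGVS_GENOMIC, preferred=True),
    AlleleName("placeholder older literature name", NameType.CUSTOM, legacy=True),
    AlleleName("placeholder legacy but preferred name", NameType.CUSTOM,
               legacy=True, preferred=True),
]

preferred_names = [n.name for n in names if n.preferred]
legacy_names = [n.name for n in names if n.legacy]
```

Collapsing all of this into a single "preferred or legacy" name type would make the third entry impossible to express, which is the case the separate flags are meant to allow.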

srynobio commented 9 years ago

@larrybabb

I have made the changes in the docs. What do you think about this change?

    <xs:simpleType name="SimpleAlleleNameType-list">
        <xs:restriction base="xs:string">
            <xs:enumeration value="hgvs-genomic" />
            <xs:enumeration value="hgvs-mito" />
            <xs:enumeration value="hgvs-cdna" />
            <xs:enumeration value="hgvs-protein-1" />
            <xs:enumeration value="hgvs-protein-3" />
            <xs:enumeration value="hgvs-rna" />
            <xs:enumeration value="hgvs-ncrna" />
            <xs:enumeration value="ivs" />
            <xs:enumeration value="custom" />
        </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="SimpleAlleleNameType-list">
        <xs:restriction base="xs:string">
            <xs:enumeration value="hgvs-genomic" />
            <xs:enumeration value="hgvs-mito" />
            <xs:enumeration value="hgvs-cdna" />
            <xs:enumeration value="hgvs-protein-1" />
            <xs:enumeration value="hgvs-protein-3" />
            <xs:enumeration value="hgvs-rna" />
            <xs:enumeration value="hgvs-ncrna" />
            <xs:enumeration value="hgvs-ivs" />     <!-- changed -->
            <xs:enumeration value="hgvs-custom" />  <!-- changed -->
        </xs:restriction>
    </xs:simpleType>

larrybabb commented 9 years ago

I put the hgvs prefix on the original name types that map to HGVS nomenclature, that is, the variant styles that HGVS provides a specification for.

But, IVS and custom are not HGVS names. IVS is intervening sequence and custom is like a freeform type (like "other") to catch anything that is not one of the other specific types. Actually, maybe it should be called "other" instead of "custom".

But IVS and custom are not tied to HGVS, so I don't think the hgvs- prefix is correct for those.
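
As a rough illustration of that naming rule, the sketch below treats the `hgvs-` prefix as the marker for HGVS-based name types. The list is copied from the original enumeration in this thread; the helper name `is_hgvs_name_type` is hypothetical and only meant to show the convention, not any existing ClinGen code.

```python
# Name types from the original SimpleAlleleNameType-list enumeration.
NAME_TYPES = [
    "hgvs-genomic",
    "hgvs-mito",
    "hgvs-cdna",
    "hgvs-protein-1",
    "hgvs-protein-3",
    "hgvs-rna",
    "hgvs-ncrna",
    "ivs",
    "custom",
]


def is_hgvs_name_type(name_type: str) -> bool:
    """Return True when the name type carries the hgvs- prefix."""
    return name_type.startswith("hgvs-")


# Under this convention, only ivs and custom fall outside HGVS.
non_hgvs = [t for t in NAME_TYPES if not is_hgvs_name_type(t)]
```

Renaming the last two values to `hgvs-ivs` and `hgvs-custom` would make this check report them as HGVS names, which is the mislabeling described above.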

srynobio commented 9 years ago

Changes have been made. I defined custom as: Custom or "other" type

ronakypatel commented 9 years ago

Are commits pushed? The .xsd does not look changed: https://github.com/clingen-data-model/clingen-data-model/tree/master/source/main/resources/clingen-xsd/simpleallele.xsd

srynobio commented 9 years ago

I have committed the changes. The Value Set list should now reflect what's in the xsd file.

--Shawn

larrybabb commented 9 years ago

@tnavatar can you verify how often the site is built? I think the xsd file that @ronakypatel is pulling down may be from the last build, not the latest commits, which don't seem to be included in that build.

Actually, @srynobio I do not see "other" in the simpleallele.xsd that @ronakypatel references above. So, if you pushed the changes, it is not clear that you pushed them in the correct artifact. @srynobio, please verify.

srynobio commented 9 years ago

I have added custom to the name type list.

larrybabb commented 9 years ago

got it. my misunderstanding.

tnavatar commented 9 years ago

Right now the site is built on-demand—I’ll do it when I update something or someone else asks me to. Also the instructions are in the Readme, I know at least Chris and Bradford have done builds and pushed them up.

I’d like to have them done automatically via a CI process, but I haven’t found a way to do that with GitHub Pages that doesn’t involve putting a private SSH key somewhere public, which is a no-go.

That said, I could probably as easily write a git hook to do it when a new commit on master is pushed upstream. Let me know if this is useful and I’ll assign myself a ticket to do it.
