Illumina / Nirvana

The nimble & robust variant annotator
https://illumina.github.io/NirvanaDocumentation/
GNU General Public License v3.0

Please update schema version #39

hakuliu closed this issue 3 years ago

hakuliu commented 3 years ago

The JSON output from 3.11.1 is inconsistent with the latest schema documentation. To show the key differences (at least the ones I was looking at), I also annotated the same file with 2.0.9, whose output is consistent with the documentation:

dbsnp

It looks like these were moved back to the previous format, an array of strings (consistent with version 5).

3.11.1

"dbsnp": ["rs370723703"],

2.0.9

"dbsnp": { "ids": ["rs370723703"] }
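
For anyone who has to read output from both versions, a minimal sketch of a reader that accepts either dbsnp shape shown above. The helper name `dbsnp_ids` is hypothetical and not part of Nirvana, and the code assumes the variant has already been parsed into a Python dict.

```python
def dbsnp_ids(variant):
    """Return the dbSNP IDs of a parsed variant as a plain list of strings.

    Hypothetical helper (not part of Nirvana). It only covers the two
    shapes quoted in this issue.
    """
    value = variant.get("dbsnp")
    if value is None:
        # The variant has no dbsnp annotation at all.
        return []
    if isinstance(value, dict):
        # Object shape seen in the 2.0.9 output: {"ids": [...]}
        return list(value.get("ids", []))
    # Array shape seen in the 3.11.1 output: [...]
    return list(value)


new_style = {"dbsnp": ["rs370723703"]}
old_style = {"dbsnp": {"ids": ["rs370723703"]}}
print(dbsnp_ids(new_style) == dbsnp_ids(old_style))
```

Checking the type at read time keeps downstream code independent of which Nirvana version produced the file.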

transcripts

It seems these were changed to a raw array of transcript objects, each carrying its own "source" field, instead of an object wrapper keyed by source.

3.11.1

"transcripts": [{
        "transcript": "ENST00000442987.3",
        "source": "Ensembl",
        "bioType": "polymorphic_pseudogene",
        "geneId": "ENSG00000233750",
        "hgnc": "CICP27",
        "consequence": ["downstream_gene_variant"],
        "isCanonical": true
    }, {
        "transcript": "NR_039983.2",
        "source": "RefSeq",
        "bioType": "rRNA_pseudogene",
        "cdnaPos": "1034",
        "exons": "3/3",
        "geneId": "729737",
        "hgnc": "LOC729737",
        "consequence": ["non_coding_transcript_exon_variant"],
        "hgvsc": "NR_039983.2:n.1034T>C",
        "isCanonical": true
    }
]

2.0.9

"transcripts": {
    "refSeq": [{
            "transcript": "NR_039983.2",
            "bioType": "misc_RNA",
            "cdnaPos": "1034",
            "exons": "3/3",
            "geneId": "729737",
            "hgnc": "LOC729737",
            "consequence": ["non_coding_transcript_exon_variant"],
            "hgvsc": "NR_039983.2:n.1034T>C",
            "isCanonical": true
        }, {
            "transcript": "XR_246629.1",
            "bioType": "misc_RNA",
            "geneId": "100996442",
            "hgnc": "LOC100996442",
            "consequence": ["downstream_gene_variant"]
        }
    ],
    "ensembl": [{
            "transcript": "ENST00000442987.3",
            "bioType": "processed_pseudogene",
            "geneId": "ENSG00000233750",
            "hgnc": "CICP27",
            "consequence": ["downstream_gene_variant"],
            "isCanonical": true
        }
    ]
}
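
A rough sketch of flattening the wrapper shape into the flat array shape, so that both outputs can be processed the same way. The names `flat_transcripts` and `SOURCE_LABELS` are hypothetical, and the mapping from wrapper keys to "source" values is an assumption inferred only from the two samples above.

```python
# Assumption: wrapper keys map to these "source" labels, as inferred from
# the samples in this issue. Unknown keys are passed through unchanged.
SOURCE_LABELS = {"refSeq": "RefSeq", "ensembl": "Ensembl"}


def flat_transcripts(variant):
    """Return transcripts as a flat list of dicts, each with a "source" key.

    Hypothetical helper (not part of Nirvana).
    """
    value = variant.get("transcripts")
    if value is None:
        return []
    if isinstance(value, list):
        # Flat array shape seen in the 3.11.1 output; copy each entry.
        return [dict(t) for t in value]
    # Wrapper shape seen in the 2.0.9 output: {"refSeq": [...], "ensembl": [...]}
    flattened = []
    for key, items in value.items():
        label = SOURCE_LABELS.get(key, key)
        for t in items:
            flattened.append({**t, "source": label})
    return flattened


wrapped = {
    "transcripts": {
        "refSeq": [{"transcript": "NR_039983.2"}],
        "ensembl": [{"transcript": "ENST00000442987.3"}],
    }
}
for t in flat_transcripts(wrapped):
    print(t["source"], t["transcript"])
```

Note that this only reconciles the structure. Field values such as "bioType" also differ between the two samples and are left untouched.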

ClinVar

The "significance" field has changed from a single string to a list of strings.

3.11.1

"clinvar": [{
        "id": "RCV000954365.1",
        "variationId": 281760,
        "reviewStatus": "criteria provided, single submitter",
        "alleleOrigins": ["germline"],
        "refAllele": "G",
        "altAllele": "A",
        "phenotypes": ["not provided"],
        "medGenIds": ["CN517202"],
        "significance": ["benign"],
        "lastUpdatedDate": "2019-12-17",
        "pubMedIds": ["28492532"],
        "isAlleleSpecific": true
    }, {
        "id": "RCV000337497.1",
        "variationId": 281760,
        "reviewStatus": "criteria provided, single submitter",
        "alleleOrigins": ["germline"],
        "refAllele": "G",
        "altAllele": "A",
        "phenotypes": ["not specified"],
        "medGenIds": ["CN169374"],
        "significance": ["benign"],
        "lastUpdatedDate": "2019-12-17",
        "isAlleleSpecific": true
    }
]

2.0.9

"clinvar": [{
        "id": "RCV000337497.1",
        "reviewStatus": "criteria provided, single submitter",
        "alleleOrigins": ["germline"],
        "refAllele": "G",
        "altAllele": "A",
        "phenotypes": ["not specified"],
        "medGenIds": ["CN169374"],
        "significance": "benign",
        "lastUpdatedDate": "2020-03-04",
        "isAlleleSpecific": true
    }, {
        "id": "RCV000954365.1",
        "reviewStatus": "criteria provided, single submitter",
        "alleleOrigins": ["germline"],
        "refAllele": "G",
        "altAllele": "A",
        "phenotypes": ["not provided"],
        "medGenIds": ["CN517202"],
        "significance": "benign",
        "lastUpdatedDate": "2020-03-04",
        "pubMedIds": ["28492532"],
        "isAlleleSpecific": true
    }
]
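
A small sketch of reading "significance" as a list regardless of which version wrote the entry. The helper name `significance_list` is hypothetical and not part of Nirvana; it assumes a ClinVar entry already parsed into a Python dict.

```python
def significance_list(entry):
    """Return the ClinVar significance of one entry as a list of strings.

    Hypothetical helper (not part of Nirvana).
    """
    value = entry.get("significance")
    if value is None:
        return []
    if isinstance(value, str):
        # Single-string shape seen in the 2.0.9 output.
        return [value]
    # List shape seen in the 3.11.1 output.
    return list(value)


print(significance_list({"significance": "benign"}))
print(significance_list({"significance": ["benign"]}))
```

The string check has to come before the list conversion, because calling `list()` on a string would split it into single characters.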

MichaelStromberg commented 3 years ago

Good point. We have created a new documentation site that will address this much better moving forward: https://illumina.github.io/NirvanaDocumentation/

We're still in the process of updating the new site, so not everything is there yet. Over the next couple of weeks, we will have most of the schema content back online, now appropriately tagged with the corresponding Nirvana versions.