Open korikuzma opened 7 months ago
https://github.com/ncbi/clinvar/blob/master/submission_api_schema/submission_apitest_schema.json#L530-L533
"May be the name of a specific drug, e.g. ruxolitinib, or a drug class, e.g. JAK inhibitors. Multiple terms are allowed only to represent combination therapies, e.g. cisplatin and vinorelbine for non-small cell lung cancer. Separate multiple terms with a semi-colon."
For drugForTherapeuticAssertion in clinicalImpactClassification: it looks like you can only represent a single therapy or a combination therapy. In GKS we have a TherapeuticSubstituteGroup, which we use in CIViC.
On a related note, you are only required to provide the name of the drug. However, in other places, such as when providing a disease or gene, the schema lets you provide database identifiers (an array of items) OR a name as a string. I'm curious why they didn't follow a similar approach here.
That's odd that they didn't apply a consistent approach everywhere.
It also appears that the spreadsheets allow for more info to be submitted. For instance, in the spreadsheets there's a "Variant - more" section with information such as Variation identifiers, Alternate designations, and URL, but I cannot seem to find this information in the apitest JSON schema.
Is this because "Apparently, the NCBI ClinVar server has no tight coupling to the JSON schemas and these schemas are mostly for informative purposes."?
Ugh
I was just hoping I was blind
We should compile a list of questions like this and send along to our ClinVar contacts for clarification.
@ahwagner sounds good. Wanted to run this by y'all before doing so at least
I don't think there's a way to submit our evidence / evidence line data (using ClinGen/CGC/VICC SOP onco codes) at the moment. I'm only seeing a citation field and it's pretty limited.
@ahwagner had said "The VCI folks have ended up encoding pathogenicity codes as free-text string comments"
By the way, I have not been testing the API directly yet. I was manually creating submission requests for a CIViC EID and VarCat assertion for test fixtures. I sent a request for a service account this morning. Once approved, we'll get our API key and we will be able to test the submission API.
To be clear, the VCI folks flatten their data structure to accommodate the ClinVar model in this way; they natively encode a rich ev/prov structure similar to VarCat. Here's an example record from ClinGen/VCI in ClinVar:
How the CIViC team handles this: "We just turn substitutes into separate AIDs"
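For illustration, here is a minimal Python sketch of that workaround: each substitute becomes its own assertion, and only the members of a combination therapy are joined with a semicolon. The helper names (`drug_field`, `expand_substitutes`) are hypothetical, the exact separator spacing is an assumption (the schema description only says "semi-colon"), and the drug names are just the ones from the schema description, not a real substitute group.

```python
from typing import Dict, List


def drug_field(combination: List[str]) -> str:
    """Join the members of ONE therapy (single drug or combination) with semicolons."""
    names = [name.strip() for name in combination if name.strip()]
    if not names:
        raise ValueError("a therapy needs at least one drug name")
    # Assumption: no space after the semicolon; the schema text does not say.
    return ";".join(names)


def expand_substitutes(substitutes: List[List[str]]) -> List[Dict[str, str]]:
    """Turn a group of substitute therapies into one fragment per therapy.

    Each inner list is one therapy. Substitutes are never merged into a
    single string, because the field only models combinations.
    """
    return [{"drugForTherapeuticAssertion": drug_field(t)} for t in substitutes]


# Hypothetical group: a combination therapy OR a single drug.
fragments = expand_substitutes([["cisplatin", "vinorelbine"], ["ruxolitinib"]])
for fragment in fragments:
    print(fragment)
```

Each returned fragment would then go into its own submission record, which matches the "separate AIDs" approach but loses the explicit link between the substitutes.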
If time allows, we should try sending malformed data to see what failed validation looks like.
As I have been performing dry runs on the test API, I've been testing out validation both by accident and on purpose. Here are some examples of the output (I haven't been saving the malformed input data):
{'message': 'Validation failed, see errors for detailed description', 'errors': [{'message': "5233 is not of type 'string'", 'code': None, 'identifier': None}]}
{'message': 'Validation failed, see errors for detailed description', 'errors': [{'message': "{'db': 'PubMed', 'id': '25265492'} is not of type 'array'", 'code': None, 'identifier': None}, {'message': "{'gene': [{'id': 2778}]} is not valid under any of the given schemas", 'code': None, 'identifier': None}, {'message': "'Tier I - strong' is not one of ['Tier I - Strong', 'Tier II - Potential', 'Tier III - Unknown', 'Tier IV - Benign/Likely benign']", 'code': None, 'identifier': None}]}
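If we want to surface these failures in logs or tests, a small helper could pull out just the per-error messages. This is only a sketch based on the response shape shown above (a top-level `errors` list of objects with a `message` key); the function name `validation_messages` is hypothetical, and other response shapes may exist.

```python
from typing import List


def validation_messages(response: dict) -> List[str]:
    """Return the individual error messages from a validation response."""
    # Tolerate a missing or empty "errors" list, e.g. for a passing dry run.
    return [err.get("message", "") for err in response.get("errors") or []]


# First example response from above, copied as-is.
response = {
    "message": "Validation failed, see errors for detailed description",
    "errors": [
        {"message": "5233 is not of type 'string'", "code": None, "identifier": None}
    ],
}

for msg in validation_messages(response):
    print(msg)  # prints: 5233 is not of type 'string'
```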
We also want to check that the data appears in the test instance as expected following submission. @wesleygoar to review first and once approved, we'll send along to @ahwagner for secondary review
@ahwagner since the test instance doesn't appear to let you see the data, I just added you as a reviewer for #8 and #9 (@wesleygoar approved both) to review the data we would submit. Once approved, I can submit to the test instance
GitHub repo: https://github.com/ncbi/clinvar/tree/master/submission_api_schema
Test endpoint: https://submit.ncbi.nlm.nih.gov/apitest/v1/submissions
Submission API documentation: https://www.ncbi.nlm.nih.gov/clinvar/docs/api_http/
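As a rough sketch of what a dry-run call against the test endpoint might look like, the snippet below only builds the request and does not send it. The `actions` envelope, the `SP-API-KEY` header name, and the `dry-run=true` query flag are assumptions based on my reading of the submission API documentation linked above and should be verified there; `build_dry_run_request` is a hypothetical helper.

```python
import json
import urllib.request

# Test endpoint taken from the links above.
TEST_ENDPOINT = "https://submit.ncbi.nlm.nih.gov/apitest/v1/submissions"


def build_dry_run_request(submission: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a dry-run POST for the ClinVar test endpoint."""
    # Assumption: the submission content is wrapped in an "actions" envelope.
    envelope = {
        "actions": [
            {"type": "AddData", "targetDb": "clinvar", "data": {"content": submission}}
        ]
    }
    return urllib.request.Request(
        url=f"{TEST_ENDPOINT}/?dry-run=true",  # assumption: dry-run query flag
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "SP-API-KEY": api_key,  # assumption: header that carries the API key
        },
        method="POST",
    )


# Placeholder payload and key; a real payload must follow the JSON schema.
request = build_dry_run_request({"placeholder": True}, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` once the service account and API key are approved.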
@ahwagner said:
Additional notes: