EGA-archive / beacon2-ri-api

Beacon v2 Reference Implementation (API)
Apache License 2.0

query details: referenceName, maximum number of variants in a beacon, multi variation queries #349

Open albodrug opened 2 months ago

albodrug commented 2 months ago

Hello,

I have several questions regarding the ri, as my queries don't behave as I would expect them to.

  1. Is there a way in beacon2-ri to specify the reference sequence name as described in the Beacon specification? My queries on g_variants work unless I specify "referenceName".
    With referenceName specified, the query finds no variations:

(itx)> curl -H 'Content-Type: application/json' -X POST -d '{ "meta": { "apiVersion": "2.0" }, "query": { "requestParameters": { "referenceName" : "1", "alternateBases": "G" , "referenceBases": "A" , "start": [ 16050074 ], "end": [ 16050568 ], "variantType": "SNP" }, "filters": [], "includeResultsetResponses": "HIT", "pagination": { "skip": 0, "limit": 10 }, "testMode": false, "requestedGranularity": "count" } }' http://localhost:5050/api/g_variants

{"meta":{"beaconId":"itx.french.beacon","apiVersion":"v2.0.1","returnedGranularity":"count","receivedRequestSummary":{"apiVersion":"2.0","requestedSchemas":[],"filters":[],"requestParameters":{"referenceName":"1","alternateBases":"G","referenceBases":"A","start":[16050074],"end":[16050568],"variantType":"SNP"},"includeResultsetResponses":"HIT","pagination":{"skip":0,"limit":10},"requestedGranularity":"count","testMode":false},"returnedSchemas":[{"entityType":"genomicVariation","schema":"beacon-g_variant-v2.0.0"}]},"responseSummary":{"exists":false,"numTotalResults":0},"beaconHandovers":[{"handoverType":{"id":"CUSTOM:000001","label":"Project description"},"note":"Project description","url":"https://www.nist.gov/programs-projects/genome-bottle"}]}

Finds 6 variations, only difference is no referenceName specified. Variations are in different chromosomes. > (itx)> curl -H 'Content-Type: application/json' -X POST -d '{ > "meta": { > "apiVersion": "2.0" > }, > "query": { > "requestParameters": { > "alternateBases": "G" , > "referenceBases": "A" , > "start": [ 16050074 ], > "end": [ 16050568 ], > "variantType": "SNP" > }, > "filters": [], > "includeResultsetResponses": "HIT", > "pagination": { > "skip": 0, > "limit": 10 > }, > "testMode": false, > "requestedGranularity": "record" > } > }' http://localhost:5050/api/g_variants | python -m json.tool > { > "meta": { > "beaconId": "itx.french.beacon", > "apiVersion": "v2.0.1", > "returnedGranularity": "record", > "receivedRequestSummary": { > "apiVersion": "2.0", > "requestedSchemas": [], > "filters": [], > "requestParameters": { > "alternateBases": "G", > "referenceBases": "A", > "start": [ > 16050074 > ], > "end": [ > 16050568 > ], > "variantType": "SNP" > }, > "includeResultsetResponses": "HIT", > "pagination": { > "skip": 0, > "limit": 10 > }, > "requestedGranularity": "record", > "testMode": false > }, > "returnedSchemas": [ > { > "entityType": "genomicVariation", > "schema": "beacon-g_variant-v2.0.0" > } > ] > }, > "responseSummary": { > "exists": true, > "numTotalResults": 6 > }, > "response": { > "resultSets": [ > { > "id": "1kGP_DATASET_3K", > "setType": "dataset", > "exists": true, > "resultsCount": 6, > "results": [ > { > "_id": "6687c06cb7605e6f09067ef6", > "_info": { > "datasetId": "default_beacon_1", > "vcf2bff": { > "version": "2.0.0", > "cwd": "/home/bodrug-a/Devlopment/beacon2-ri-tools/beacon_172000971995122/vcf", > "ncpuhost": "20", > "filein": "WGS_chr1_to_chr10.norm.ann.dbnsfp.clinvar.cosmic.vcf.gz", > "user": "bodrug-a", > "hostname": "pp-irs1-4071ylt", > "projectDir": "beacon_172000971995122", > "fileout": "genomicVariationsVcf.json.gz" > }, > "genome": "hg38" > }, > "variation": { > "referenceBases": "A", > "variantType": "SNP", > 
"alternateBases": "G", > "location": { > "sequence_id": "HGVSid:chr2:g.16050460A>G", > "interval": { > "type": "SequenceInterval", > "end": { > "value": 16050460, > "type": "Number" > }, > "start": { > "type": "Number", > "value": 16050459 > } > }, > "type": "SequenceLocation" > } > }, > "molecularAttributes": { > "annotationImpact": [ > "MODIFIER", > "MODIFIER", > "MODIFIER", > "MODIFIER" > ], > "geneIds": [ > "AC010145.1", > "GACAT3", > "GACAT3", > "GACAT3" > ], > "molecularEffects": [ > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "label": "non_coding_transcript_exon_variant", > "id": "ENSGLOSSARY:0000160" > } > ], > "aminoacidChanges": [ > ".", > ".", > ".", > "." > ] > }, > "variantQuality": { > "QUAL": 0, > "FILTER": "PASS" > }, > "identifiers": { > "genomicHGVSId": "chr2:g.16050460A>G" > }, > "_position": { > "startInteger": 16050459, > "endInteger": 16050460, > "assemblyId": "hg38", > "refseqId": "2", > "start": [ > 16050459 > ], > "end": [ > 16050460 > ] > }, > "variantInternalId": "chrchr2_16050460_A_G", > "caseLevelData": [ > { > "biosampleId": "HG03575", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "biosampleId": "HG03576", > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > } > } > ] > }, > { > "_id": "6687c06cb7605e6f09067f11", > "_info": { > "genome": "hg38", > "datasetId": "default_beacon_1", > "vcf2bff": { > "cwd": "/home/bodrug-a/Devlopment/beacon2-ri-tools/beacon_172000971995122/vcf", > "version": "2.0.0", > "projectDir": "beacon_172000971995122", > "fileout": "genomicVariationsVcf.json.gz", > "hostname": "pp-irs1-4071ylt", > "ncpuhost": "20", > "filein": "WGS_chr1_to_chr10.norm.ann.dbnsfp.clinvar.cosmic.vcf.gz", > "user": "bodrug-a" > } > }, > "molecularAttributes": { > "aminoacidChanges": [ > "", > ".", > ".", > "." 
> ], > "molecularEffects": [ > { > "id": "ENSGLOSSARY:0000164", > "label": "upstream_gene_variant" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "id": "ENSGLOSSARY:0000161", > "label": "intron_variant" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > } > ], > "geneIds": [ > "GACAT3", > "AC010145.1", > "GACAT3", > "GACAT3" > ], > "annotationImpact": [ > "MODIFIER", > "MODIFIER", > "MODIFIER", > "MODIFIER" > ] > }, > "variation": { > "location": { > "interval": { > "type": "SequenceInterval", > "end": { > "value": 16050396, > "type": "Number" > }, > "start": { > "type": "Number", > "value": 16050395 > } > }, > "type": "SequenceLocation", > "sequence_id": "HGVSid:chr2:g.16050396A>G" > }, > "alternateBases": "G", > "referenceBases": "A", > "variantType": "SNP" > }, > "identifiers": { > "genomicHGVSId": "chr2:g.16050396A>G" > }, > "variantQuality": { > "FILTER": "PASS", > "QUAL": 0 > }, > "caseLevelData": [ > { > "biosampleId": "HG00103", > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > } > }, > { > "biosampleId": "NA20899", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA20905" > }, > { > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "1|0" > }, > "biosampleId": "NA20906" > }, > { > "biosampleId": "NA21088", > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > } > }, > { > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA21102" > }, > { > "biosampleId": "NA21111", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA21116" > }, > { > "biosampleId": "NA21118", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA21124" > }, 
> { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA21126" > }, > { > "biosampleId": "NA21133", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > } > ], > "_position": { > "start": [ > 16050395 > ], > "end": [ > 16050396 > ], > "refseqId": "2", > "assemblyId": "hg38", > "endInteger": 16050396, > "startInteger": 16050395 > }, > "variantInternalId": "chrchr2_16050396_A_G" > }, > { > "_id": "6687b6f0b7605e6f09aeb455", > "variantInternalId": "chrchr1_16050251_A_G", > "_position": { > "assemblyId": "hg38", > "endInteger": 16050251, > "startInteger": 16050250, > "start": [ > 16050250 > ], > "end": [ > 16050251 > ], > "refseqId": "1" > }, > "caseLevelData": [ > { > "biosampleId": "HG00252", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > } > ], > "_info": { > "genome": "hg38", > "datasetId": "default_beacon_1", > "vcf2bff": { > "version": "2.0.0", > "cwd": "/home/bodrug-a/Devlopment/beacon2-ri-tools/beacon_172000971995122/vcf", > "user": "bodrug-a", > "filein": "WGS_chr1_to_chr10.norm.ann.dbnsfp.clinvar.cosmic.vcf.gz", > "ncpuhost": "20", > "hostname": "pp-irs1-4071ylt", > "fileout": "genomicVariationsVcf.json.gz", > "projectDir": "beacon_172000971995122" > } > }, > "variation": { > "referenceBases": "A", > "variantType": "SNP", > "location": { > "interval": { > "type": "SequenceInterval", > "end": { > "type": "Number", > "value": 16050251 > }, > "start": { > "type": "Number", > "value": 16050250 > } > }, > "type": "SequenceLocation", > "sequence_id": "HGVSid:chr1:g.16050251A>G" > }, > "alternateBases": "G" > }, > "molecularAttributes": { > "molecularEffects": [ > { > "label": "upstream_gene_variant", > "id": "ENSGLOSSARY:0000164" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "id": "ENSGLOSSARY:0000161", > "label": "intron_variant" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > } > ], > "aminoacidChanges": [ > "", > ".", > ".", > "." 
> ], > "annotationImpact": [ > "MODIFIER", > "MODIFIER", > "MODIFIER", > "MODIFIER" > ], > "geneIds": [ > "CLCNKB", > "CLCNKB", > "CLCNKB", > "CLCNKB" > ] > }, > "variantQuality": { > "FILTER": "PASS", > "QUAL": 0 > }, > "identifiers": { > "genomicHGVSId": "chr1:g.16050251A>G" > } > }, > { > "_id": "6687b6f0b7605e6f09aeb474", > "_position": { > "startInteger": 16050223, > "assemblyId": "hg38", > "endInteger": 16050224, > "refseqId": "1", > "start": [ > 16050223 > ], > "end": [ > 16050224 > ] > }, > "variantInternalId": "chrchr1_16050224_A_G", > "caseLevelData": [ > { > "biosampleId": "HG00096", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "biosampleId": "HG00097", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > }, > "biosampleId": "HG00099" > }, > { > "biosampleId": "HG00100", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > }, > "biosampleId": "HG00101" > }, > { > "biosampleId": "HG00561", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "biosampleId": "HG00565", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "biosampleId": "HG00566", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > }, > "biosampleId": "HG00567" > }, > { > "biosampleId": "HG00577", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "biosampleId": "HG02120", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > }, > "biosampleId": "HG02121" > }, > { > "biosampleId": "HG02122", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "biosampleId": "NA12718", > "zygosity": { > "label": "1|1", > "id": 
"GENO:GENO_0000136" > } > }, > { > "biosampleId": "NA12739", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "biosampleId": "NA19068", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > }, > "biosampleId": "NA19070" > }, > { > "biosampleId": "NA19072", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > }, > { > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > }, > "biosampleId": "NA19074" > }, > { > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > }, > "biosampleId": "NA19902" > }, > { > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA19904" > }, > { > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA19908" > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA19909" > }, > { > "biosampleId": "NA19913", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > }, > "biosampleId": "NA19914" > }, > { > "biosampleId": "NA21141", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > }, > "biosampleId": "NA21142" > }, > { > "biosampleId": "NA21143", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > }, > { > "biosampleId": "NA21144", > "zygosity": { > "label": "1|1", > "id": "GENO:GENO_0000136" > } > } > ], > "variantLevelData": { > "clinicalInterpretations": [ > { > "category": { > "id": "MONDO:0000001", > "label": "disease or disorder" > }, > "clinicalRelevance": "benign", > "effect": { > "label": "not_provided", > "id": "MedGen:CN517202" > }, > "annotatedWith": { > "toolName": "SnpEff", > "toolReferences": { > "bio.toolsId": "https://bio.tools/snpeff", > "databases": { > "COSMIC": { > "version": "COSMICv92", > "url": 
"https://cosmic-blog.sanger.ac.uk/cosmic-release-v92" > }, > "ClinVar": { > "version": "20211218", > "url": "https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/archive_2.0/2021" > }, > "dbSNSFP": { > "url": "https://sites.google.com/site/jpopgen/dbNSFP", > "version": "dbNSFP4.1a" > } > }, > "url": "https://pcingola.github.io/SnpEff" > }, > "version": "5.0" > }, > "conditionId": "not_provided" > } > ] > }, > "_info": { > "genome": "hg38", > "vcf2bff": { > "filein": "WGS_chr1_to_chr10.norm.ann.dbnsfp.clinvar.cosmic.vcf.gz", > "user": "bodrug-a", > "ncpuhost": "20", > "hostname": "pp-irs1-4071ylt", > "fileout": "genomicVariationsVcf.json.gz", > "projectDir": "beacon_172000971995122", > "version": "2.0.0", > "cwd": "/home/bodrug-a/Devlopment/beacon2-ri-tools/beacon_172000971995122/vcf" > }, > "datasetId": "default_beacon_1" > }, > "variation": { > "referenceBases": "A", > "variantType": "SNP", > "alternateBases": "G", > "location": { > "sequence_id": "HGVSid:NC_000001.11:g.16050224A>G", > "type": "SequenceLocation", > "interval": { > "end": { > "value": 16050224, > "type": "Number" > }, > "start": { > "type": "Number", > "value": 16050223 > }, > "type": "SequenceInterval" > } > } > }, > "molecularAttributes": { > "molecularEffects": [ > { > "id": "ENSGLOSSARY:0000164", > "label": "upstream_gene_variant" > }, > { > "id": "ENSGLOSSARY:0000161", > "label": "intron_variant" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > } > ], > "aminoacidChanges": [ > "", > ".", > ".", > "." 
> ], > "annotationImpact": [ > "MODIFIER", > "MODIFIER", > "MODIFIER", > "MODIFIER" > ], > "geneIds": [ > "CLCNKB", > "CLCNKB", > "CLCNKB", > "CLCNKB" > ] > }, > "variantQuality": { > "QUAL": 0, > "FILTER": "PASS" > }, > "identifiers": { > "genomicHGVSId": "NC_000001.11:g.16050224A>G" > } > }, > { > "_id": "6687c06cb7605e6f09067eef", > "caseLevelData": [ > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "HG04191" > }, > { > "biosampleId": "HG04192", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "1|0" > } > }, > { > "biosampleId": "HG04193", > "zygosity": { > "id": "GENO:GENO_0000136", > "label": "1|1" > } > } > ], > "_position": { > "refseqId": "2", > "start": [ > 16050186 > ], > "end": [ > 16050187 > ], > "startInteger": 16050186, > "endInteger": 16050187, > "assemblyId": "hg38" > }, > "variantInternalId": "chrchr2_16050187_A_G", > "_info": { > "vcf2bff": { > "version": "2.0.0", > "cwd": "/home/bodrug-a/Devlopment/beacon2-ri-tools/beacon_172000971995122/vcf", > "filein": "WGS_chr1_to_chr10.norm.ann.dbnsfp.clinvar.cosmic.vcf.gz", > "user": "bodrug-a", > "ncpuhost": "20", > "hostname": "pp-irs1-4071ylt", > "projectDir": "beacon_172000971995122", > "fileout": "genomicVariationsVcf.json.gz" > }, > "datasetId": "default_beacon_1", > "genome": "hg38" > }, > "identifiers": { > "genomicHGVSId": "chr2:g.16050187A>G" > }, > "variantQuality": { > "QUAL": 0, > "FILTER": "PASS" > }, > "molecularAttributes": { > "annotationImpact": [ > "MODIFIER", > "MODIFIER", > "MODIFIER", > "MODIFIER" > ], > "geneIds": [ > "GACAT3", > "AC010145.1", > "GACAT3", > "GACAT3" > ], > "molecularEffects": [ > { > "id": "ENSGLOSSARY:0000164", > "label": "upstream_gene_variant" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "id": "ENSGLOSSARY:0000161", > "label": "intron_variant" > }, > { > "id": "ENSGLOSSARY:0000161", > "label": "intron_variant" > } > ], > "aminoacidChanges": [ > "", > ".", > ".", > "." 
> ] > }, > "variation": { > "variantType": "SNP", > "referenceBases": "A", > "alternateBases": "G", > "location": { > "sequence_id": "HGVSid:chr2:g.16050187A>G", > "interval": { > "type": "SequenceInterval", > "end": { > "type": "Number", > "value": 16050187 > }, > "start": { > "value": 16050186, > "type": "Number" > } > }, > "type": "SequenceLocation" > } > } > }, > { > "_id": "6687b6f0b7605e6f09aeb454", > "variantInternalId": "chrchr1_16050120_A_G", > "_position": { > "endInteger": 16050120, > "assemblyId": "hg38", > "startInteger": 16050119, > "end": [ > 16050120 > ], > "start": [ > 16050119 > ], > "refseqId": "1" > }, > "caseLevelData": [ > { > "biosampleId": "HG01884", > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > } > }, > { > "biosampleId": "HG01889", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > }, > "biosampleId": "HG01956" > }, > { > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > }, > "biosampleId": "HG02281" > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "HG02464" > }, > { > "biosampleId": "HG02973", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "HG02975" > }, > { > "biosampleId": "HG03052", > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > } > }, > { > "biosampleId": "HG03081", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "HG03270" > }, > { > "biosampleId": "HG03272", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "biosampleId": "HG03279", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "biosampleId": "HG03578", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > 
"biosampleId": "NA18871", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "1|0" > }, > "biosampleId": "NA18872" > }, > { > "biosampleId": "NA19025", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA19036" > }, > { > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "1|0" > }, > "biosampleId": "NA19113" > }, > { > "biosampleId": "NA19115", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "1|0" > }, > "biosampleId": "NA19346" > }, > { > "biosampleId": "NA19383", > "zygosity": { > "label": "1|0", > "id": "GENO:GENO_0000458" > } > }, > { > "biosampleId": "NA19395", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "biosampleId": "NA19438", > "zygosity": { > "id": "GENO:GENO_0000458", > "label": "0|1" > } > }, > { > "zygosity": { > "label": "0|1", > "id": "GENO:GENO_0000458" > }, > "biosampleId": "NA19473" > } > ], > "_info": { > "datasetId": "default_beacon_1", > "vcf2bff": { > "cwd": "/home/bodrug-a/Devlopment/beacon2-ri-tools/beacon_172000971995122/vcf", > "version": "2.0.0", > "fileout": "genomicVariationsVcf.json.gz", > "projectDir": "beacon_172000971995122", > "hostname": "pp-irs1-4071ylt", > "ncpuhost": "20", > "filein": "WGS_chr1_to_chr10.norm.ann.dbnsfp.clinvar.cosmic.vcf.gz", > "user": "bodrug-a" > }, > "genome": "hg38" > }, > "variation": { > "alternateBases": "G", > "location": { > "sequence_id": "HGVSid:chr1:g.16050120A>G", > "interval": { > "start": { > "value": 16050119, > "type": "Number" > }, > "end": { > "type": "Number", > "value": 16050120 > }, > "type": "SequenceInterval" > }, > "type": "SequenceLocation" > }, > "variantType": "SNP", > "referenceBases": "A" > }, > "molecularAttributes": { > "molecularEffects": [ > { > "id": "ENSGLOSSARY:0000164", 
> "label": "upstream_gene_variant" > }, > { > "id": "ENSGLOSSARY:0000161", > "label": "intron_variant" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > }, > { > "label": "intron_variant", > "id": "ENSGLOSSARY:0000161" > } > ], > "aminoacidChanges": [ > "", > ".", > ".", > "." > ], > "annotationImpact": [ > "MODIFIER", > "MODIFIER", > "MODIFIER", > "MODIFIER" > ], > "geneIds": [ > "CLCNKB", > "CLCNKB", > "CLCNKB", > "CLCNKB" > ] > }, > "variantQuality": { > "QUAL": 0, > "FILTER": "PASS" > }, > "identifiers": { > "genomicHGVSId": "chr1:g.16050120A>G" > } > } > ], > "resultsHandover": { > "handoverType": { > "id": "CUSTOM:000001", > "label": "Project description" > }, > "note": "Project description", > "url": "https://www.nist.gov/programs-projects/genome-bottle" > } > } > ] > }, > "beaconHandovers": [ > { > "handoverType": { > "id": "CUSTOM:000001", > "label": "Project description" > }, > "note": "Project description", > "url": "https://www.nist.gov/programs-projects/genome-bottle" > } > ] > }
  2. Are there any limitations on the number of variations that beacon2-ri-tools or the ri-api can load? My dataset has 70M variant positions, and every one of them has an alternate allele in at least one biosample. Once converted to BFF using ri-tools, I am left with only 10M variants. I am using v1 of ri-tools because I cannot get ri-tools v2 to work.

  3. Finally, is there a way in Beacon to query for individuals or biosamples that carry several specified variations? That is, specifying not one single position or one single range, but two or more. For example, with several queries:

curl \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{ "meta": { "apiVersion": "2.0" }, "query": { "requestParameters": { "alternateBases": "G" , "referenceBases": "A" , "start": [ 16050074 ], "end": [ 16050568 ], "variantType": "SNP" }, "query": { "requestParameters": { "alternateBases": "G" , "referenceBases": "A" , "start": [ 50074 ], "end": [ 150568 ], "variantType": "SNP" }, "filters": [], "includeResultsetResponses": "HIT", "pagination": { "skip": 0, "limit": 10 }, "testMode": false, "requestedGranularity": "record" } }' \
  http://localhost:5050/api/g_variants

Thanks for any insights or tips! Have a nice day, Alex

costero-e commented 2 months ago

Hi @albodrug, thanks for reporting this. Let's go through the questions:

  1. In beacon2-ri-api, you can specify the referenceName as in the specification, in either of two ways: g_variants?referenceName=22 or g_variants?referenceName=NC_000022.11. In either case, identifiers.genomicHGVSId needs to be correctly filled in.
  2. I didn't understand exactly what your issue is, but beacon ri tools v2 has no limit on the number of variants it can transform, and the API has no limit on the number of variants you can load. If the issue is with tools v1, we are no longer maintaining that software (at least at EGA). If you tell me why you can't get tools v2 to work, I can help you (please open an issue there).
  3. In the specification, right now, this is not possible. There are range queries and bracket queries, but if you want two specific range queries or bracket queries at the same time, you will have to run them one by one. That said, I have implemented this in the beacon2 RI API (outside the spec) by sending an array of request parameters, like this:
    curl \
      -H 'Content-Type: application/json' \
      -X POST \
      -d '{ "meta": { "apiVersion": "2.0" }, "query": { "requestParameters": [{ "alternateBases": "G" , "referenceBases": "A" , "start": [ 16050074 ], "end": [ 16050568 ], "variantType": "SNP"},{ "alternateBases": "G" , "referenceBases": "A" , "start": [ 50074 ], "end": [ 150568 ], "variantType": "SNP"}], "includeResultsetResponses": "HIT", "pagination": { "skip": 0, "limit": 10 }, "testMode": false, "requestedGranularity": "record" } }' \
      http://localhost:5050/api/individuals
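For anyone scripting this, the array form of requestParameters from point 3 could be built in Python roughly as follows. This is only a sketch: the helper names build_multi_variant_query and post_query are made up for illustration, the URL is the local one used in this thread, and the array form is an RI API extension rather than part of the Beacon spec, so other Beacon implementations may reject it.

```python
import json
from urllib import request


def build_multi_variant_query(variant_params, skip=0, limit=10, granularity="record"):
    """Build a request body whose requestParameters is a list of variant queries.

    Hypothetical helper: the list form is the out-of-spec extension
    described in the reply above, not standard Beacon v2.
    """
    return {
        "meta": {"apiVersion": "2.0"},
        "query": {
            "requestParameters": list(variant_params),
            "includeResultsetResponses": "HIT",
            "pagination": {"skip": skip, "limit": limit},
            "testMode": False,
            "requestedGranularity": granularity,
        },
    }


def post_query(url, body):
    """POST the JSON body and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


# The two variant queries from the curl example above.
variants = [
    {"alternateBases": "G", "referenceBases": "A",
     "start": [16050074], "end": [16050568], "variantType": "SNP"},
    {"alternateBases": "G", "referenceBases": "A",
     "start": [50074], "end": [150568], "variantType": "SNP"},
]

body = build_multi_variant_query(variants)
print(json.dumps(body, indent=2))

# To actually send it (assumes a beacon running locally, as in this thread):
# result = post_query("http://localhost:5050/api/individuals", body)
```

Building the body from a list keeps each variant query a separate, complete object, which avoids the nested "query" key and unbalanced braces that are easy to introduce when editing the JSON by hand.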

Best,

Oriol
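
Since point 1 depends on identifiers.genomicHGVSId being filled in correctly, a quick client-side check of what a record-granularity response actually contains may help. The sketch below tallies the sequence prefix of each genomicHGVSId (the part before the first colon). The function name is hypothetical, and the sample records are trimmed to the one field used: three values are taken from the response posted above and the last record is an invented one with the field missing. Whether mixed prefixes such as chr1 and NC_000001.11 explain the empty result for referenceName "1" is not established in this thread; the tally only makes such inconsistencies visible.

```python
from collections import Counter


def hgvs_sequence_prefixes(results):
    """Count the sequence prefix of identifiers.genomicHGVSId per record.

    Hypothetical helper; `results` is the list found under
    response.resultSets[i].results in a record-granularity response.
    """
    counts = Counter()
    for record in results:
        hgvs = record.get("identifiers", {}).get("genomicHGVSId", "")
        # "chr1:g.16050251A>G" -> "chr1"; missing or malformed -> "<missing>"
        prefix = hgvs.split(":", 1)[0] if ":" in hgvs else "<missing>"
        counts[prefix] += 1
    return counts


# Records trimmed to the field of interest; the last one is invented
# to show how a missing identifier is reported.
sample = [
    {"identifiers": {"genomicHGVSId": "chr1:g.16050251A>G"}},
    {"identifiers": {"genomicHGVSId": "NC_000001.11:g.16050224A>G"}},
    {"identifiers": {"genomicHGVSId": "chr2:g.16050460A>G"}},
    {"identifiers": {}},
]

prefix_counts = hgvs_sequence_prefixes(sample)
for prefix, n in sorted(prefix_counts.items()):
    print(prefix, n)
```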