Closed by Don-Isdale 3 days ago
This facility is tested using the prototype 'Genotype Search' panel / dialog, which provides the parameters : dataset Id, Sample names, SNP names.
From the dataset Id, the list of non-link .vcf.gz files is requested.
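As an illustration of what "non-link" means here, the following is a minimal sketch, not the code of vcfGenotypeLookup.bash. The function name `non_link_vcf_gz` is hypothetical, and the demo directory (including treating `1A.vcf.gz` as a soft link) is an assumption made only for this example.

```python
import os
import tempfile
from pathlib import Path


def non_link_vcf_gz(dataset_dir):
    """Return sorted names of *.vcf.gz entries that are regular files, not soft links."""
    return sorted(
        p.name
        for p in Path(dataset_dir).iterdir()
        if p.name.endswith(".vcf.gz") and p.is_file() and not p.is_symlink()
    )


# Demonstration in a throwaway directory. The layout is hypothetical:
# one real file, its .csi index, and one soft link pointing at the real file.
with tempfile.TemporaryDirectory() as d:
    Path(d, "1A_copy.vcf.gz").write_bytes(b"")
    Path(d, "1A_copy.vcf.gz.csi").write_bytes(b"")
    os.symlink(Path(d, "1A_copy.vcf.gz"), Path(d, "1A.vcf.gz"))
    demo_names = non_link_vcf_gz(d)

print(demo_names)
```

The `.csi` index is skipped because its name does not end in `.vcf.gz`, and the soft link is skipped by the `is_symlink()` check.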
Confirming that the list of .vcf.gz file names excluding soft-links is requested.
childProcess vcfGenotypeLookup.bash 0 false undefined 0 lb3app/scripts /media/don/Linux0/home/don/new/projects/agribio/markerMapViewer/pretzel.A3/lb4app
+ scope=noLinks
+ cd tmp/vcf/201028_40K_DAS5_samples_XT_exomeIDs
+ '[' noLinks = noLinks ']'
::ffff:127.0.0.1 - - [06/Jun/2024:18:26:52 +0000] "GET /api/Datasets/vcfGenotypeFeaturesCountsStatus?id=201028_40K_DAS5_samples_XT_exomeIDs HTTP/1.1" 200 605 "http://localhost:4200/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0"
http://localhost:3000/api/Datasets/vcfGenotypeFeaturesCountsStatus?id=201028_40K_DAS5_samples_XT_exomeIDs
params :
id=201028_40K_DAS5_samples_XT_exomeIDs
(replace-string "\\n" "\n")
{"text":"scope=
3170787 Jun 6 21:15 1A_copy.MAF.vcf.gz
162731 Jun 6 21:15 1A_copy.MAF.vcf.gz.csi
2671431 Aug 9 2022 1A_copy.vcf.gz
165069 Jun 6 21:15 1A_copy.vcf.gz.csi
463961 Jan 29 12:12 1A.MAF.SNPList.vcf.gz
149577 Jan 29 12:12 1A.MAF.SNPList.vcf.gz.csi
3164247 Jan 29 12:10 1A.MAF.vcf.gz
159238 Jan 29 12:10 1A.MAF.vcf.gz.csi
159118 Jan 29 12:07 1A.vcf.gz.csi
463961 Jan 29 16:04 1B.MAF.SNPList.vcf.gz
149577 Jan 29 16:04 1B.MAF.SNPList.vcf.gz.csi
3164248 Jan 29 16:04 1B.MAF.vcf.gz
159236 Jan 29 16:04 1B.MAF.vcf.gz.csi
159118 Jan 29 16:04 1B.vcf.gz.csi
"}
In this case there is 1 non-link .vcf.gz file, and its file name is included as the datasetVcfFile parameter in the following request.
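The per-file requests can be sketched as follows. This is a rough illustration assuming the request body has the shape of the POST data shown in this issue. The helper name `lookup_requests` is hypothetical, and only a subset of the preArgs fields is filled in.

```python
def lookup_requests(dataset_id, vcf_files, samples, snp_names):
    """Build one request body per .vcf.gz file.

    Sample names and SNP names are newline-joined, as in the POST data
    recorded in this issue. Other preArgs fields are omitted in this sketch.
    """
    return [
        {
            "datasetId": dataset_id,
            "preArgs": {
                "samples": "\n".join(samples),
                "requestFormat": "Numerical",
                "datasetVcfFile": vcf_file,
                "snpNames": "\n".join(snp_names),
            },
            "nLines": 100,
            "options": {},
        }
        for vcf_file in vcf_files
    ]


bodies = lookup_requests(
    "201028_40K_DAS5_samples_XT_exomeIDs",
    ["1A_copy.vcf.gz"],
    ["ExomeCapture-DAS5-001803", "ExomeCapture-DAS5-001365"],
    ["scaffold38755_1235130", "scaffold38755_1337276"],
)
print(bodies[0]["preArgs"]["datasetVcfFile"])
```

Each body would be sent as a separate POST to /api/Blocks/vcfGenotypeLookupPost; the datasetId stays the same across requests and only datasetVcfFile changes.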
Confirming that the correct .vcf.gz is used.
::ffff:127.0.0.1 - - [06/Jun/2024:12:24:49 +0000] "POST /api/Blocks/vcfGenotypeLookupPost HTTP/1.1" 200 - "http://localhost:4200/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0"
The request processing time is 50.705 ms. for /vcfGenotypeLookupPost
vcfGenotypeLookup 201028_40K_DAS5_samples_XT_exomeIDs undefined Numerical 74 [
'query',
'201028_40K_DAS5_samples_XT_exomeIDs',
'1A_copy.vcf.gz',
'',
'',
'',
'',
'-queryStart',
'-H',
'-f',
'%ID\t%POS\t%REF\t%ALT\t%INFO[\t%GT]\n',
'-queryEnd'
]
childProcess vcfGenotypeLookup.bash 0 false undefined 0 lb3app/scripts /media/don/Linux0/home/don/new/projects/agribio/markerMapViewer/pretzel.A3/lb4app
+ bcftoolsCommand query 201028_40K_DAS5_samples_XT_exomeIDs/1A_copy.MAF.vcf.gz '' '' -s ExomeCapture-DAS5-001803,ExomeCapture-DAS5-001365,ExomeCapture-DAS5-002317
+ vcfGz=201028_40K_DAS5_samples_XT_exomeIDs/1A_copy.MAF.vcf.gz
+ echo isecDatasetIdsArray : 0 , vcfGzs 0 , snpNames 1 scaffold38755_1235130 scaffold38755_1337276
+ bcftools query 201028_40K_DAS5_samples_XT_exomeIDs/1A_copy.MAF.vcf.gz -s ExomeCapture-DAS5-001803,ExomeCapture-DAS5-001365,ExomeCapture-DAS5-002317 -H -f '%ID %POS %REF %ALT %INFO[ %GT]
' -i ' ID="scaffold38755_1235130" || ID="scaffold38755_1337276" '
cbWrap null #[1]ID [2]POS [3]REF [4]ALT [5](null) [6]ExomeCapture-DAS5-001803:GT [7]ExomeCapture-DAS5-001365:GT [8]ExomeCapture-DAS5-002317:GT
scaffold38755_1235130 1235130 C T F_MISSING=0.0259067;NS=564;AN=1128; undefined
::ffff:127.0.0.1 - - [06/Jun/2024:12:24:51 +0000] "POST /api/Blocks/vcfGenotypeLookupPost HTTP/1.1" 200 387 "http://localhost:4200/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0"
http://localhost:3000/api/Blocks/vcfGenotypeLookupPost
POST Data
(replace-string "," ",\n")
{"datasetId":"201028_40K_DAS5_samples_XT_exomeIDs",
"preArgs":{
"samples":"ExomeCapture-DAS5-001803\nExomeCapture-DAS5-001365\nExomeCapture-DAS5-002317",
"requestInfo":false,
"requestFormat":"Numerical",
"requestSamplesAll":false,
"snpPolymorphismFilter":false,
"mafThreshold":0,
"mafUpper":false,
"featureCallRateThreshold":0,
"datasetVcfFile":"1A_copy.vcf.gz",
"snpNames":"scaffold38755_1235130\nscaffold38755_1337276"},
"nLines":100,
"options":{}}
{"text":"#[1]ID\t[2]POS\t[3]REF\t[4]ALT\t[5](null)\t[6]ExomeCapture-DAS5-001803:GT\t[7]ExomeCapture-DAS5-001365:GT\t[8]ExomeCapture-DAS5-002317:GT
scaffold38755_1235130\t1235130\tC\tT\tF_MISSING=0.0259067;NS=564;AN=1128;MAF=0.150709;AC=170;AC_Het=12\t0/0\t0/0\t0/0
scaffold38755_1337276\t1337276\tG\tC\tF_MISSING=0.0138169;NS=571;AN=1142;MAF=0.400175;AC=457;AC_Het=1\t0/0\t0/0\t1/1
"}
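To check a response like the one above by eye, it can help to split the "text" value into one record per SNP. The following is a minimal sketch, assuming the tab-separated layout and `#[n]NAME` header seen in this response; the function name `parse_query_text` is hypothetical and is not part of the application.

```python
import re


def parse_query_text(text):
    """Split tab-separated query output into a list of dicts, one per SNP row.

    The header line looks like '#[1]ID<TAB>[2]POS...'; the leading '#' and
    the '[n]' column numbers are stripped to obtain the column names.
    """
    lines = [line for line in text.split("\n") if line]
    header = [re.sub(r"^#?\[\d+\]", "", col) for col in lines[0].split("\t")]
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]


# One row copied from the response in this issue.
sample_text = (
    "#[1]ID\t[2]POS\t[3]REF\t[4]ALT\t[5](null)"
    "\t[6]ExomeCapture-DAS5-001803:GT"
    "\t[7]ExomeCapture-DAS5-001365:GT"
    "\t[8]ExomeCapture-DAS5-002317:GT\n"
    "scaffold38755_1337276\t1337276\tG\tC"
    "\tF_MISSING=0.0138169;NS=571;AN=1142;MAF=0.400175;AC=457;AC_Het=1"
    "\t0/0\t0/0\t1/1\n"
)
rows = parse_query_text(sample_text)
print(rows[0]["ID"], rows[0]["ExomeCapture-DAS5-002317:GT"])
```

The INFO column keeps the key "(null)" because that is the name that appears in the header.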
Part of #383
Observable outcomes :
This enables the User to search for the given SNP names across all chromosomes of the selected dataset, and request the lookup of Genotype values for those SNP names.
Measure with :
Run the script either from the command line or by adding parameters to a request sent by the application, and confirm from the trace that the correct lookup command is performed and that the results are correct, include all chromosomes of the dataset, and apply to the requested SNP names.
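When comparing against the trace, the expected bcftools argument list can be derived from the request parameters. The following is a sketch only, mirroring the command seen in the trace in this issue; the function name `bcftools_query_args` is hypothetical, and the real script may assemble the command differently (the trace, for example, pads the -i expression with spaces).

```python
def bcftools_query_args(vcf_gz, samples, snp_names):
    """Return the expected argv for a `bcftools query` SNP-name lookup.

    -s takes comma-separated sample names, -H prints the header line,
    -f sets the output format, -i includes only records matching the expression.
    """
    include = " || ".join(f'ID="{name}"' for name in snp_names)
    return [
        "bcftools", "query", vcf_gz,
        "-s", ",".join(samples),
        "-H",
        "-f", "%ID\t%POS\t%REF\t%ALT\t%INFO[\t%GT]\n",
        "-i", include,
    ]


args = bcftools_query_args(
    "201028_40K_DAS5_samples_XT_exomeIDs/1A_copy.MAF.vcf.gz",
    ["ExomeCapture-DAS5-001803", "ExomeCapture-DAS5-001365"],
    ["scaffold38755_1235130", "scaffold38755_1337276"],
)
print(args[-1])
```

The list could be passed to `subprocess.run(args)` on a machine with bcftools installed; here it is only printed for comparison with the trace.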
Task Sequence :
[x] Implement changes in the lookup request execution, in calling bcftools. This is complete and confirmed by the Test plan and execution in the following comment.
[x] Connect that change in the call flow in frontend and backend, i.e. add / change parameters passed. This is in progress, as indicated by the completed items in the following design breakdown.
[x] [4-8H] vcfGenotypeLookup : in this use case don't pass scope (chromosome) parameter, i.e. it is an optional parameter of the API endpoint; instead request a list of non-soft-link .vcf.gz files and search those via individual API requests. The datasetId parameter is unchanged - it identifies the directory in which the .vcf.gz files reside.
[x] [2-3H/6H/0H] vcfGenotypeReceiveResult() : dataset param instead of block; determine block from #CHROM column. This includes : To enable loading of results from multiple chromosomes :