plantinformatics / pretzel

Javascript full-stack framework for Big Data visualisation and analysis
GNU General Public License v3.0

Extend VCF Lookup with the addition of parameter : SNP names array #389

Closed Don-Isdale closed 3 months ago

Don-Isdale commented 3 months ago

Part of #383


Observable outcomes :

This enables the user to enter SNP names instead of selecting a chromosome region, and to request a lookup of the genotype values for those SNP names.

Measure with :

Run the script either from the command line or by adding parameters to a request sent by the application. Confirm from the trace that the correct lookup command is executed, and that the results are correct and correspond to the requested SNP names.


Don-Isdale commented 3 months ago

Test

Test Case

This facility is implemented and tested first to confirm the functionality and performance of the lookup from SNP names to regions, before adding the ability to pass an array of SNP names in the lookup request from the frontend. This test arrangement will be replaced by actual parameters from the frontend.

The test configuration used the existing request, which requires a region. The region chosen for the request contains the SNPs named in the test, so the region does not affect the test result.

Results

Server log extract

The following extract confirms that the correct filter command is used.

+ '[' query = view_query ']'
+ bcftools query 201028_40K_DAS5_samples_XT_exomeIDs/1A.MAF.SNPList.vcf.gz -r 1A:322574-2019320 -S /dev/null -H -f '%ID %POS    %REF    %ALT    %INFO[  %TGT]
' -i ' ID="scaffold38755_1235130" || ID="scaffold38755_1337276" '
dataOutReply lineCount 3 235 0
+ status=0
+ set -x
+ status_0=0
+ '[' -n '' ']'
+ exit 0
child process exited with code 0
cbWrap null #[1]ID  [2]POS  [3]REF  [4]ALT  [5](null)
scaffold38755_1235130   1235130 C   T   F_MISSING=0.0259067;NS=564;AN=1128;MAF=0.150709;AC=170;AC_Het=12
scaffold38755_1337276   1337276 G   C   F_MISSING=0.0138169;NS=571;AN undefined
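The `-i` filter expression in the log above joins one `ID="..."` test per SNP name with `||`. A minimal sketch of how such an expression could be built from an array of SNP names follows. The helper name `snpNamesFilter` and the set of characters accepted in a name are assumptions for illustration, not taken from the Pretzel source.

```javascript
// Build a bcftools -i filter expression matching any of the given SNP names.
// snpNamesFilter is a hypothetical helper name, used here for illustration only.
function snpNamesFilter(snpNames) {
  // Assumption: SNP IDs contain only these characters. Rejecting anything else
  // prevents a name from closing the quoted string and altering the expression.
  const valid = /^[A-Za-z0-9_.:-]+$/;
  const bad = snpNames.filter((name) => !valid.test(name));
  if (bad.length) {
    throw new Error('Invalid SNP name(s): ' + bad.join(', '));
  }
  // One ID="..." comparison per name, combined with logical OR.
  return snpNames.map((name) => `ID="${name}"`).join(' || ');
}

const filter = snpNamesFilter(['scaffold38755_1235130', 'scaffold38755_1337276']);
console.log(filter);
// ID="scaffold38755_1235130" || ID="scaffold38755_1337276"
```

Validating the names before they reach the command keeps user-entered text from changing the meaning of the filter, which matters once the names come from the frontend rather than from a test.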

API Request

http://localhost:3000/api/Blocks/vcfGenotypeLookup?datasetId=201028_40K_DAS5_samples_XT_exomeIDs&scope=1A&preArgs%5Bregion%5D=1A%3A322574-2019320&preArgs%5BrequestInfo%5D=false&preArgs%5BrequestFormat%5D=CATG&nLines=400
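For readability, the request above can be reproduced from its decoded parameters. This sketch uses the standard `URLSearchParams` API and only the parameters visible in the URL. The request shown does not carry a SNP names parameter, so none is added here.

```javascript
// Decoded query parameters of the request shown above.
const params = new URLSearchParams({
  datasetId: '201028_40K_DAS5_samples_XT_exomeIDs',
  scope: '1A',
  'preArgs[region]': '1A:322574-2019320',
  'preArgs[requestInfo]': 'false',
  'preArgs[requestFormat]': 'CATG',
  nLines: '400',
});

// URLSearchParams percent-encodes '[', ']' and ':' when serialised.
const url =
  'http://localhost:3000/api/Blocks/vcfGenotypeLookup?' + params.toString();
console.log(url);
```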

API Result

{"text":"#[1]ID\t[2]POS\t[3]REF\t[4]ALT\t[5](null)
scaffold38755_1235130\t1235130\tC\tT\tF_MISSING=0.0259067;NS=564;AN=1128;MAF=0.150709;AC=170;AC_Het=12
scaffold38755_1337276\t1337276\tG\tC\tF_MISSING=0.0138169;NS=571;AN=1142;MAF=0.400175;AC=457;AC_Het=1
"}
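The `text` field of the result is tab-separated, with a header line whose columns carry a `[n]` index prefix. As a rough sketch of how a client could check that only the requested SNP names were returned, the text can be split into one object per row. The function name `parseLookupText` is hypothetical, and the fifth column keeps the `(null)` label that appears in the header.

```javascript
// Parse the tab-separated lookup text into one object per SNP row.
// parseLookupText is a hypothetical name, used here for illustration only.
function parseLookupText(text) {
  const lines = text.split('\n').filter((line) => line.length > 0);
  // Header columns look like '#[1]ID' and '[2]POS': drop the '#' and '[n]' prefix.
  const columns = lines[0].split('\t').map((h) => h.replace(/^#?\s*\[\d+\]/, ''));
  return lines.slice(1).map((line) => {
    const fields = line.split('\t');
    return Object.fromEntries(columns.map((name, i) => [name, fields[i]]));
  });
}

// The text value from the API result above.
const text =
  '#[1]ID\t[2]POS\t[3]REF\t[4]ALT\t[5](null)\n' +
  'scaffold38755_1235130\t1235130\tC\tT\tF_MISSING=0.0259067;NS=564;AN=1128;MAF=0.150709;AC=170;AC_Het=12\n' +
  'scaffold38755_1337276\t1337276\tG\tC\tF_MISSING=0.0138169;NS=571;AN=1142;MAF=0.400175;AC=457;AC_Het=1\n';

const rows = parseLookupText(text);
console.log(rows.map((row) => row.ID));
// [ 'scaffold38755_1235130', 'scaffold38755_1337276' ]
```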

Displayed results - just the requested SNP names

Screenshot from 2024-06-05 10-34-38