Open charlesfoster opened 2 years ago
I ended up writing a simple bash script to add 'fake' genotype information to a VCF file, e.g. specifying that my sample is from a virus --> GT=1. This is good enough for my purposes. I can provide the script if it will be of use for anyone else, in the absence of a more robust alternative.
Hi, I am a little late to the party, but I am having the same problem. My sample is also a virus. Can you please share the script if you still have it? It would save my day. Another question: did you merge the multiple samples?
Hi @arunvv90, Here's the script (with a .txt suffix so I can attach it):
I run lofreq on individual samples, then run the attached script before potentially merging multiple samples.
Thanks a lot, man. I kept trying different things. I really appreciate your quick response. Does this script add the sample name field to the header, which is required for merging multiple samples?
Yep, the sample name is added to the header. Previously the sample name was only guessed from the infile name, but I just added another flag to allow you to explicitly specify the name. New script attached. add_artificial_genotype.txt
The raw vcf:
Modified vcf after running script:
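The attached script is not reproduced in this thread. As a rough illustration of the idea only, here is a minimal Python sketch, assuming the input VCF has the eight fixed columns and no FORMAT or sample columns. The function name `add_artificial_genotype` and the toy record are hypothetical and are not taken from the attachment.

```python
import io


def add_artificial_genotype(lines, sample_name, genotype="1"):
    """Yield VCF lines with a FORMAT column and one sample column added.

    Assumes the input has no FORMAT/sample columns yet (8 fixed columns).
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines pass through unchanged
            yield line
        elif line.startswith("#CHROM"):
            # Declare the GT field just before the column header line
            yield '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">'
            yield line + "\tFORMAT\t" + sample_name
        elif line:
            # Every record gets the same, artificial genotype
            yield line + "\tGT\t" + genotype


# Toy input, made up for illustration
toy_vcf = """##fileformat=VCFv4.0
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw Depth">
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
ref1\t100\t.\tA\tG\t50\tPASS\tDP=200
"""

result = list(add_artificial_genotype(io.StringIO(toy_vcf), "test_samplename", "1"))
print("\n".join(result))
```

The genotype is a fixed placeholder rather than a call derived from the data, so it should only be relied on where downstream tools merely need a GT field to be present.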
Wow! Lightning speed! I just tested the script and it works like a charm! I was about to use bcftools reheader to change the sample name. Let me test the new script with a custom sample name.
I just tested the sample name feature as well. It worked perfectly. A simple and easy solution! Thank you very much.
av724@bioram /s/a/v/s/s/l/test> bash add_artificial_genotype.sh -i BCAHV_vibi_indelq_alnq_call.vcf.gz -g 1/1 -n test_samplename -o out3.vcf.gz
VCF with artificial genotype written to out3.vcf.gz
av724@bioram /s/a/v/s/s/l/test> ls (npsm)
add_artificial_genotype.sh* BCAHV_vibi_indelq_alnq_call.vcf.gz.tbi out3.vcf.gz.tbi BCAHV_vibi_indelq_alnq_call.vcf.gz out3.vcf.gz
av724@bioram /s/a/v/s/s/l/test> bcftools query -l out3.vcf.gz (npsm)
test_samplename
Hi, I'd like to use lofreq for my pipeline, but I require genotype information for downstream commands. I tried using lofreq2_add_sample.py like so:
However, I get the following error:
If I change the read mode for files in the add_plp_to_vcf function from 'rb' to 'r' to try and get around this error, I get:
Do you have a workaround? My python version is 3.8.6. Thanks.
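The error messages are not shown above, so the cause cannot be confirmed from this thread. One general point that may or may not apply here: in Python 3, a file opened with 'rb' yields bytes, while a gzip-compressed file opened with the built-in open in 'r' mode is not decompressed at all. A minimal sketch of reading either a plain or a gzip-compressed VCF as text follows. The helper names `open_vcf_text` and `count_records` and the toy file are hypothetical, and this is not a patch for lofreq2_add_sample.py.

```python
import gzip
import os
import tempfile


def open_vcf_text(path):
    """Open a plain or gzip-compressed VCF so that iteration yields str lines."""
    if path.endswith(".gz"):
        # "rt" decompresses and decodes to text in one step
        return gzip.open(path, "rt")
    return open(path, "r")


def count_records(path):
    """Count non-header, non-empty lines in the VCF."""
    with open_vcf_text(path) as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))


# Write a toy compressed VCF (made up for illustration) and read it back
tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "toy.vcf.gz")
with gzip.open(toy_path, "wt") as out:
    out.write("##fileformat=VCFv4.0\n")
    out.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
    out.write("ref1\t100\t.\tA\tG\t50\tPASS\tDP=200\n")

n_records = count_records(toy_path)
print(n_records)
```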