bcgsc / straglr

Tandem repeat expansion detection or genotyping from long-read alignments

Could Gender support and VCF output be added? #32

Closed bartcharbon closed 1 month ago

bartcharbon commented 4 months ago

Dear @readmanchiu,

We are using Straglr in our pipeline (https://github.com/molgenis/vip/) to call STRs on nanopore data. Since all the modules in our pipeline are built to take VCF input and produce VCF output, we ended up using the fork from @philres. We also know the sex of our samples, and this fork seems to add value in that regard.

However, we would also like to offer the genome scan function to our users (currently we only support genotyping with a loci BED file), and I cannot get that to work when installing the Python egg from this fork, while using the original repo leaves me without VCF output. Furthermore, the fork seems to be some commits behind the main branch of this original repo.

So we are looking for the best of both worlds (forks), and my questions are: (1) Would you consider adding VCF as an output format in the future? (2) Would you consider adding the sample sex as an input parameter and using it in the STR calling on the X chromosome?

readmanchiu commented 4 months ago

Hi @bartcharbon,

Thanks for using Straglr in your pipeline. Yes, I have been planning to produce VCF output for Straglr. If you can provide me with a sample VCF output (better yet, with the matched TSV from Straglr), that will be a tremendous help for me in deciding what to put in the VCF and what to leave out. You can send it to my email rchiu@bccgsc.ca if that's more convenient. I can get that going ASAP. As for the sample sex parameter, I assume it would be informative for inferring the zygosity of the chrX calls. Are there any other added values that I am missing?

bartcharbon commented 4 months ago

Dear @readmanchiu,

That's great to hear! I'll look up or generate some public data to send you, and get back to you by email (probably next week).

The sample sex parameter is indeed used for the zygosity of the X calls. (We've been using this fork: https://github.com/philres/straglr; I think the sex is used in the variant.py file.)
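
To illustrate the idea, here is a minimal sketch of how a known sample sex could cap the number of alleles reported per locus. It is not taken from either repository: the function name `expected_max_alleles` and the `"m"`/`"f"` encoding are assumptions made for this example, and pseudo-autosomal regions are deliberately ignored.

```python
from typing import Optional


def expected_max_alleles(chrom: str, sex: Optional[str]) -> int:
    """Return the maximum number of alleles to report at a locus.

    Hypothetical helper: `sex` is assumed to be "m", "f", or None (unknown).
    Pseudo-autosomal regions are not handled in this sketch.
    """
    # Normalise names such as "chrX" and "X" to the same form.
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    name = name.upper()

    if sex == "m" and name in ("X", "Y"):
        # A male sample carries one copy of chrX and one of chrY,
        # so at most one allele is expected (hemizygous call).
        return 1
    if sex == "f" and name == "Y":
        # No chrY expected in a female sample.
        return 0
    # Autosomes, female chrX, or unknown sex: fall back to diploid.
    return 2
```

With unknown sex the sketch falls back to two alleles everywhere, which matches calling without any sex information.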

ilivyatan commented 3 months ago

Hi, I'd just like to comment that we are also using Straglr with nanopore data. We like to use it standalone since there is more flexibility with parameter settings, but we are also in need of VCF output. The nanopore epi2me wf-human-variation pipeline incorporates Straglr with sex information and produces a VCF. Maybe you can look there and see how it's done (https://github.com/epi2me-labs/wf-human-variation).

readmanchiu commented 2 months ago

v1.5.0 has been released with VCF output, and a --sex option has been added.

https://github.com/bcgsc/straglr/blob/master/docs/vcf.md

bartcharbon commented 1 month ago

Thanks for implementing this, we're going to give it a try!