cortes-ciriano-lab / savana

Somatic structural variant caller for long-read data
Apache License 2.0

Query About Training SAVANA with CLR Data for Improved Results #35

Closed: LayalYasin closed this issue 4 weeks ago

LayalYasin commented 9 months ago

Hello SAVANA Team,

I am currently using SAVANA for somatic structural variant calling with CLR (Continuous Long Read) data. However, we are not achieving the desired accuracy in our results. I understand that SAVANA allows for the training of a custom model using labeled VCF files.

My question is: Given the unique characteristics of CLR data, would it be a viable approach to train a new model with our CLR data to potentially improve the accuracy of SV calling? Are there any specific considerations or recommendations for this process when using CLR data?

Any insights or suggestions you could provide would be greatly appreciated.

Thank you for your time and assistance.

helrick commented 4 weeks ago

Hi there! Many apologies for my late response - I was working on a new release of SAVANA, which should improve default performance on CLR data. However, if you are still seeing performance issues on CLR data in v1.2.0, I would indeed recommend training a custom model. Besides the description of how to do this in the README, you can also refer to our preprint for details of how we generated our ONT training set. Happy to help troubleshoot if you run into issues training or using a custom model with CLR data.

All the best, Hillary