kcleal / dysgu

Toolkit for calling structural variants using short or long reads
MIT License
96 stars, 12 forks

Supporting reads #18

Closed RenzoTale88 closed 2 years ago

RenzoTale88 commented 2 years ago

Hello, I was wondering if it would be possible to add an option to list the supporting reads for each SV? Even if only for PacBio/ONT reads, I think it would be useful information to have. Thank you in advance, Andrea

kcleal commented 2 years ago

Hi, there are a few metrics in the output that should be sufficient:

- SU in the INFO field is the 'support' for the event, a combination of PE, SR and WR
- WR is the read count for 'within-read' SVs
- SR is the number of split-reads
- PE is the number of paired-end discordant reads

Hope that helps
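For anyone who wants to pull these counts out programmatically, here is a minimal sketch in plain Python. It assumes the keys appear as `KEY=value` pairs in the INFO column, as described above; the helper names and the example INFO string are made up for illustration and are not real dysgu output.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def support_counts(info: str) -> dict:
    """Return SU, PE, SR and WR as integers (None if a key is absent)."""
    fields = parse_info(info)
    counts = {}
    for key in ("SU", "PE", "SR", "WR"):
        value = fields.get(key)
        counts[key] = int(value) if isinstance(value, str) else None
    return counts


# Hypothetical INFO string, for illustration only.
example_info = "SVTYPE=DEL;END=20500;SU=7;PE=3;SR=2;WR=2"
print(support_counts(example_info))
```

Note that this only gives the counts, not the identities of the supporting reads.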

RenzoTale88 commented 2 years ago

Hi @kcleal, thank you for your reply. Unfortunately, I actually need something along the lines of the RNAMES field generated by callers such as Sniffles and/or cuteSV. I would understand a "no, we won't implement it" if it is too much of a bother :)

kcleal commented 2 years ago

I see, sorry, I misunderstood. I can see how this would be useful, and I think it could be incorporated as an option in a future release. Would you prefer the RNAMES to be written to the output VCF, or perhaps to a separate CSV file in the temporary directory?

RenzoTale88 commented 2 years ago

That would be great, thanks! I think having them in the VCF would probably be more convenient to access, though I guess it could inflate the file size. Either way would be fine, as long as the two files can be unambiguously merged afterwards. Thanks again!
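To illustrate what an unambiguous merge could look like if the read names went to a separate file, here is a rough sketch. Everything about the file layout is an assumption: no such CSV exists in dysgu at this point, and the column names `sv_id` and `read_names`, the `;` separator and the helper names are hypothetical. The join key is the VCF ID column.

```python
import csv
import io


def load_read_names(csv_text: str) -> dict:
    """Map each SV id to its list of read names (hypothetical CSV layout)."""
    names = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        reads = [n for n in row["read_names"].split(";") if n]
        names.setdefault(row["sv_id"], []).extend(reads)
    return names


def annotate(vcf_text: str, names: dict) -> str:
    """Append RNAMES=<comma-separated reads> to INFO for records with a match."""
    out = []
    for line in vcf_text.splitlines():
        if line.startswith("#") or not line.strip():
            out.append(line)  # header and blank lines pass through unchanged
            continue
        cols = line.split("\t")
        reads = names.get(cols[2])  # column 3 of a VCF record is the ID
        if reads:
            tag = "RNAMES=" + ",".join(reads)
            # Column 8 is INFO; "." means the field is empty.
            cols[7] = tag if cols[7] in (".", "") else cols[7] + ";" + tag
        out.append("\t".join(cols))
    return "\n".join(out)
```

The merge is only unambiguous if every record has a unique ID, so that would need checking before relying on something like this.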



kcleal commented 2 years ago

Hi, I've had a think about this feature and run some quick tests, and I think it will be more difficult to implement efficiently than I first thought. The main problem is that paired-end reads end up consuming huge amounts of memory because the query names of all the reads have to be stored. In the initial stages of analysis this can be quite problematic. I think I will put this on hold for now.
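As a back-of-the-envelope illustration of why query names add up, the sketch below multiplies the number of read pairs by an average name length. All of the inputs (genome size, coverage, read length, name length) are assumed values chosen for illustration, and the result counts only the raw name characters if every pair's name were kept. It says nothing about how dysgu actually stores reads.

```python
def estimate_name_bytes(genome_size: int, coverage: int, read_length: int,
                        name_length: int, paired: bool = True) -> int:
    """Raw bytes needed to hold one query name per read (or per pair)."""
    n_reads = genome_size * coverage // read_length
    # Both mates of a pair share one query name, so count it once per pair.
    n_names = n_reads // 2 if paired else n_reads
    return n_names * name_length


# Assumed inputs: ~3.1 Gb genome, 30x coverage, 150 bp reads, 40-character names.
total = estimate_name_bytes(3_100_000_000, 30, 150, 40)
print(f"{total / 2**30:.1f} GiB of raw name characters")
```

Container overhead (hash tables, string objects) would come on top of this figure.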

KristinaGagalova commented 2 years ago

I think this option would be very handy. I'm just wondering whether it could easily be done for long reads (LR)?