vtsyvina / CliqueSNV

MIT License

suggested addon #7

Closed antoine4ucsd closed 3 years ago

antoine4ucsd commented 3 years ago

Amazing job! One nice improvement to the output would be to include the number of reads used for each haplotype in the sequence name. Is that something you can get? A summary table with statistics would also be cool (total reads, retained reads, frequency, etc.).

Congrats again. The algorithm has given me really interesting results so far.

gallardo-seq commented 3 years ago

I second @antoine4ucsd's comment. Having a CSV file that lists the read names that were used to generate a particular clique would be a really useful feature. Maybe the program already generates an intermediate file like this but does not retain it? Great algorithm, very user friendly, and the results are very accurate with our control datasets!

vtsyvina commented 3 years ago

Hello.

I implemented a simple version of what you asked for in the latest version. Use the "-rn" option. It will create a list of the read names related to each haplotype. But note that the same read can go to different haplotypes (if they are at the same distance) or not go anywhere at all.
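For anyone who wants per-haplotype read counts from such a list, here is a minimal Python sketch. It assumes the read names have already been loaded into a dict that maps each haplotype ID to its read names; the layout of the file written by "-rn" is not described in this thread, so parsing is left out. The names `reads_by_haplotype` and `summarize_read_support` are hypothetical. Because one read can be listed under several haplotypes, shared reads are reported separately, and the per-haplotype counts may add up to more than the number of distinct reads.

```python
from collections import Counter


def summarize_read_support(reads_by_haplotype):
    """Count reads per haplotype, separating exclusive from shared reads.

    reads_by_haplotype: dict mapping a haplotype ID to an iterable of read
    names (a hypothetical in-memory form of a read-name listing).
    """
    # Number of haplotypes each read name is listed under.
    memberships = Counter(
        name for names in reads_by_haplotype.values() for name in set(names)
    )
    summary = {}
    for hap_id, names in reads_by_haplotype.items():
        unique_names = set(names)
        shared = sum(1 for name in unique_names if memberships[name] > 1)
        summary[hap_id] = {
            "reads": len(unique_names),
            "shared": shared,
            "exclusive": len(unique_names) - shared,
        }
    return summary


# Toy input: read r3 is listed under both haplotypes.
example = {
    "hap1": ["r1", "r2", "r3"],
    "hap2": ["r3", "r4"],
}
for hap_id, stats in summarize_read_support(example).items():
    print(hap_id, stats)
```

The "reads" value could then be appended to each sequence name in the output FASTA by a small post-processing script, if that is the format you need.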

Also, for Illumina data the algorithm has a pretty complex workflow with several stages in which it aggregates results obtained with different thresholds, so it is not trivial to say which reads belong to which haplotypes. But the tool will still output the read list as best it can =)

For the summary table, I guess you never know what another person will need, but if you have specific suggestions I might try to implement them.

antoine4ucsd commented 3 years ago

Thank you so much! I will test it ASAP and let you know how it works for me. Do you include the number of reads in the sequence name?

antoine4ucsd commented 3 years ago

Hello, I am still working with your algorithm. Really interesting results so far. I can't find a way to set a minimum coverage. Is it possible to set a min coverage AND a min frequency (and to include the number of reads for each haplotype in the sequence name, besides its frequency)? Also, CliqueSNV seems to freeze for some samples (probably due to a memory issue and/or problematic samples). Would you have any suggestions for such a situation? I can share a sample in mp if you have time.

Best.

vtsyvina commented 3 years ago

Coverage doesn't make much sense here. We add reads to one haplotype or another based on SNPs. If some region is not covered by SNPs, then reads from that region will be ignored because there is no evidence to decide which haplotype they came from.
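To illustrate the idea, here is a small Python sketch of SNP-based read assignment. It is not CliqueSNV's actual code, only a simplified model under stated assumptions: reads and haplotypes are represented as dicts from reference position to base, distance is the number of mismatches at the SNP positions a read covers, and the function name `assign_read` is hypothetical. A read that covers no SNP position is returned with no haplotype, and ties are kept, which matches the earlier note that one read can go to several haplotypes.

```python
def assign_read(read, haplotypes, snp_positions):
    """Return the IDs of the haplotype(s) closest to a read at SNP positions.

    read: dict mapping reference position -> base for positions the read covers.
    haplotypes: dict mapping haplotype ID -> dict of position -> base.
    snp_positions: iterable of positions where the haplotypes differ.
    """
    covered = [p for p in snp_positions if p in read]
    if not covered:
        # No SNP evidence: the read cannot be attributed to any haplotype.
        return []
    distances = {
        hap_id: sum(read[p] != bases[p] for p in covered)
        for hap_id, bases in haplotypes.items()
    }
    best = min(distances.values())
    # Ties are kept, so a read may be returned for several haplotypes.
    return sorted(h for h, d in distances.items() if d == best)


haplotypes = {
    "hap1": {10: "A", 25: "C"},
    "hap2": {10: "G", 25: "C"},
    "hap3": {10: "G", 25: "T"},
}
snps = [10, 25]
print(assign_read({10: "G", 25: "C"}, haplotypes, snps))  # ['hap2']
print(assign_read({25: "C"}, haplotypes, snps))           # ['hap1', 'hap2']
print(assign_read({40: "T"}, haplotypes, snps))           # []
```

In this model, raw coverage of a region says nothing about haplotype membership; only reads overlapping SNP positions carry information.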

We have a minimum support for SNPs parameter (-t), but it works locally, for a pair of correlated SNPs.
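As a rough picture of what "local" support means, the Python sketch below counts the reads that carry a given pair of alleles together and compares that count with a threshold. The exact semantics of -t are not spelled out in this thread, so this is only an assumption-laden illustration: the read representation (dict of position to base) and the names `pair_support`, `passes_threshold`, and `min_support` are hypothetical.

```python
def pair_support(reads, pos_a, allele_a, pos_b, allele_b):
    """Count reads carrying allele_a at pos_a and allele_b at pos_b."""
    return sum(
        1
        for read in reads
        if read.get(pos_a) == allele_a and read.get(pos_b) == allele_b
    )


def passes_threshold(reads, pos_a, allele_a, pos_b, allele_b, min_support):
    """True if enough reads carry both alleles (hypothetical local filter)."""
    return pair_support(reads, pos_a, allele_a, pos_b, allele_b) >= min_support


reads = [
    {10: "G", 25: "T"},
    {10: "G", 25: "T"},
    {10: "G", 25: "C"},
    {10: "A"},  # does not cover position 25, so it cannot support the pair
]
print(pair_support(reads, 10, "G", 25, "T"))                     # 2
print(passes_threshold(reads, 10, "G", 25, "T", min_support=3))  # False
```

A threshold of this kind applies to each SNP pair separately, which is different from a single minimum coverage for the whole region.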

You can send me the sample, and I will have a look.

antoine4ucsd commented 3 years ago

Thank you! Can you give me a mp? I will send you a link. Best.

vtsyvina commented 3 years ago

mp? Not sure what that is.

antoine4ucsd commented 3 years ago

Sorry, I meant email.

vtsyvina commented 3 years ago

v.tsyvina@gmail.com