Open NMaziak opened 1 year ago
Hi Noura,
Thank you for your question. We do not filter out reads with an insert size below 1 kbp. There are two reasons that we do not do this.
You've touched on one of the biggest challenges in managing the change from restriction enzyme-based to sequence-independent fragmentation. Much of the classical nomenclature around restriction enzyme-based Hi-C read classification doesn't apply to MNase-based Hi-C. Still, we have a large customer base that is used to, and expects, this terminology, such as "valid" or "invalid" Hi-C reads. As such, we tell people how many topologically informative reads there are by using a standard threshold of 1 kb.
Hopefully that is helpful, Cory
--
Cory Padilla, Ph.D.
Product Manager | Cantata Bio o: 831-233-3779 | c: (650) 438-6910
https://www.cantatabio.com
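The distinction in the reply above, that the 1 kb threshold is used to report a number rather than to remove pairs, can be sketched in a few lines. This is only an illustration, not the walkthrough's actual QC script: it assumes a tab-separated .pairs file in the 4DN column order (readID, chrom1, pos1, chrom2, pos2, ...), uses the distance between pos1 and pos2 as a stand-in for insert size, and the function name `summarize_pairs` is hypothetical.

```python
from typing import Dict, Iterable


def summarize_pairs(lines: Iterable[str], min_dist: int = 1000) -> Dict[str, int]:
    """Count pairs by category without discarding any of them.

    Assumes 4DN-style .pairs lines: readID, chrom1, pos1, chrom2, pos2, ...
    (tab-separated). Header lines start with '#'.
    """
    counts = {"total": 0, "trans": 0, "cis": 0, "cis_ge_min_dist": 0}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom1, pos1 = fields[1], int(fields[2])
        chrom2, pos2 = fields[3], int(fields[4])
        counts["total"] += 1
        if chrom1 != chrom2:
            counts["trans"] += 1
            continue
        counts["cis"] += 1
        # The threshold only affects what is reported; short-range
        # cis pairs are still counted and would remain in the file.
        if abs(pos2 - pos1) >= min_dist:
            counts["cis_ge_min_dist"] += 1
    return counts


# Tiny made-up example: one short cis pair, one long cis pair, one trans pair.
example = [
    "## pairs format v1.0\n",
    "#columns: readID chrom1 pos1 chrom2 pos2 strand1 strand2\n",
    "r1\tchr1\t100\tchr1\t350\t+\t-\n",
    "r2\tchr1\t100\tchr1\t5000\t+\t-\n",
    "r3\tchr1\t100\tchr2\t200\t+\t+\n",
]
print(summarize_pairs(example))
```

In this toy input all three pairs are counted in the total; only one of the two cis pairs is reported as spanning at least 1 kb, and nothing is dropped.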
Hi Cory,
Very sorry for the delay, but thank you for such an informative answer! I'm testing out some of the new features in pairtools and consequently am revisiting this. Just to make sure then, the only filtration you are doing is deduplicating and removing pairs which are separated by less than 30 bp, is that correct?
All the best, Noura
Hi Noura - Yup - you got it!
Hello there, I'm new to using pairtools and have some Micro-C's to analyse and found your walkthrough (thanks, it's been very helpful!)
I had a question, and it might be that I keep missing it in the pairtools documentation, but I was wondering where in the pipeline the filtering of cis-interactions with insert size less than 1 kb is happening (as it's mentioned in the QC plot linked below)? https://micro-c.readthedocs.io/en/latest/library_qc.html
Any clarification is much appreciated, thanks again! Noura