epigen / scifiRNA-seq

GNU General Public License v3.0

Input BAM description #7

Hofphi closed this issue 3 years ago

Hofphi commented 3 years ago

Hey Andre, thank you for making the processing pipeline available! This is very helpful.

I have a question regarding the tags for the input BAM. The manual specifies a length of 22 bp for the BC tag. However, no barcode combination I can think of matches that length:

1) i7-sample-BC (8 bp) + r1-scifi-BC (13 bp) -> 21 bp in total
2) r2-10x-BC (16 bp) + r1-scifi-BC (13 bp) -> 29 bp in total

Could you clarify which combination you are referring to? Are the BCs concatenated with an underscore or without any separator?
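To make the length arithmetic in this question concrete, here is a minimal Python sketch that tabulates the combined length of each candidate combination, with and without a one-character separator. The segment names, the `combined_length` helper, and the underscore separator are illustrative assumptions drawn from the question itself; none of them are taken from the pipeline's code.

```python
# Barcode segment lengths in bp, as stated in the question above.
# The dictionary keys are hypothetical labels, not pipeline identifiers.
SEGMENT_LENGTHS = {
    "i7_sample": 8,
    "r1_scifi": 13,
    "r2_10x": 16,
}


def combined_length(segments, separator=""):
    """Return the length of the named segments joined by `separator`."""
    total = sum(SEGMENT_LENGTHS[name] for name in segments)
    # One separator sits between each pair of adjacent segments.
    return total + len(separator) * (len(segments) - 1)


# The two combinations proposed in the question.
candidates = [("i7_sample", "r1_scifi"), ("r2_10x", "r1_scifi")]

for combo in candidates:
    for sep in ("", "_"):
        print("+".join(combo), repr(sep), combined_length(combo, sep))
```

Under these assumptions, only the first combination joined with a single-character separator comes to 22. Whether that is the layout the manual actually means is not confirmed anywhere in this thread.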

Should the final paired-end BAM contain 10x-BCs as read1 and the matching cDNA insert (78 bp) as read2?

From looking at your code, I couldn't quite figure out where you handle the r2 barcode. Hence my question: do I have to include an r2 tag for the 10x BC, or is this handled by the pipeline?

I would very much appreciate your help.

Best, Philipp

afrendeiro commented 3 years ago

Hi,

I've added a short guide on how we demultiplex scifi-RNA-seq data (written by Daniele): https://github.com/epigen/scifiRNA-seq/blob/main/demultiplexing_guide.pdf (also available as plain text). Let me know if that helps.

Hofphi commented 3 years ago

Thank you very much for uploading the demultiplexing guide. This helped a lot!

Maybe you could add a link to this guide in the README of this repo so that everyone can find it quickly without going through the issue page.

afrendeiro commented 3 years ago

Happy that it helped. Yes, thank you. I meant to do that but forgot.