yezhengSTAT / CUTTag_tutorial

Tutorial Website
https://yezhengstat.github.io/CUTTag_tutorial/

What does "25x25 PE Illumina sequencing" mean, and how does read length influence CUT&Tag? #3

Open dyinboisry4u opened 3 years ago

dyinboisry4u commented 3 years ago

Hi, I found this description in the tutorial: "Our standard pipeline is to perform single-index 25x25 PE Illumina sequencing on up to 90 pooled samples on a single HiSeq 2500 flowcell, where each sample has a unique PCR primer barcode." I don't know what "25x25 PE Illumina sequencing" means. Also, how does read length influence CUT&Tag? Thanks!

dyinboisry4u commented 3 years ago

I think it means 25 + 25 (2 x 25 bp). Is that right?

yezhengSTAT commented 3 years ago

25x25 means 25bp per sequencing read end. In other words, the data are paired-end and each end has 25bp.

As for the selection of 25bp per end and the influence on CUT&Tag, here is an answer from Dr. Steve Henikoff: Many genomics resource facilities do longer reads by default, such as PE150 because most of their customers want the sequence itself, whereas all we need for this protocol is to efficiently and reliably map the fragments, and PE25 is more than good enough for that. You're paying for the extra sequencing both directly and indirectly because of the much larger files that need to be handled, archived and/or submitted to a repository.
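To put rough numbers on the cost argument, here is a minimal Python sketch comparing the number of bases sequenced at PE25 and PE150. The read-pair count is a hypothetical figure chosen only for illustration, not a recommendation from the tutorial, and the function name is made up for this example.

```python
def sequenced_bases(read_pairs: int, read_length: int) -> int:
    """Total bases produced by a paired-end run: two reads per fragment."""
    return 2 * read_length * read_pairs


# Hypothetical sequencing depth, for illustration only.
READ_PAIRS = 5_000_000

pe25 = sequenced_bases(READ_PAIRS, 25)
pe150 = sequenced_bases(READ_PAIRS, 150)

print(f"PE25 : {pe25:,} bases")
print(f"PE150: {pe150:,} bases")
print(f"PE150 / PE25 = {pe150 / pe25:.0f}x")
```

At the same number of read pairs, PE150 yields six times as many bases as PE25, and the sequence and quality lines of the FASTQ files grow roughly in proportion.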

dyinboisry4u commented 3 years ago

Thanks for your reply. In my opinion, in contrast to the randomness of sonication-based ChIP-seq, the fragment sizes of a tagmentation-based CUT&Tag library should follow a "nucleosome unit" distribution. Can PE25 reads cover all locations in the library (such as DNA wrapped around the middle of a nucleosome)? On the other hand, do very short reads lead to mapping problems? If I choose PE150 and align with bowtie2, could you give me more advice on bowtie2 parameters? (I found "--local --very-sensitive --no-mixed --no-discordant --phred33 -I 10 -X 700" for longer reads in the tutorial, but if the adapter sequences have already been removed, I don't understand why local alignment is used.)

yezhengSTAT commented 3 years ago

25bp per end does not mean the fragment length is 50bp. Please refer to section 3.4 for the discussion on fragment length distribution.
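The difference between read length and fragment length can be sketched in Python. For a properly paired alignment, the fragment length is inferred from where the two mates map on the reference, not from how many bases were sequenced. The coordinates below are made up for illustration, and both helper names are hypothetical.

```python
def fragment_length(mate1_start: int, mate2_end: int) -> int:
    """Length of the fragment spanned by a read pair.

    mate1_start: leftmost mapped position of the forward mate (1-based).
    mate2_end:   rightmost mapped position of the reverse mate (1-based, inclusive).
    """
    return mate2_end - mate1_start + 1


def unsequenced_middle(frag_len: int, read_length: int) -> int:
    """Bases in the middle of the fragment that neither mate reads directly."""
    return max(0, frag_len - 2 * read_length)


# Made-up coordinates: forward mate covers 1000-1024, reverse mate covers 1155-1179.
frag = fragment_length(1000, 1179)
print(frag)                          # 180
print(unsequenced_middle(frag, 25))  # 130
```

In this example the 130 bp in the middle are never read, yet the pair still defines a 180 bp fragment. If coverage is computed from whole fragments rather than from individual reads, the middle of the fragment is counted as covered too.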

When the --local option is specified, Bowtie 2 performs local read alignment. In this mode, Bowtie 2 might "trim" or "clip" some read characters from one or both ends of the alignment if doing so maximizes the alignment score. Therefore --local is less stringent than --end-to-end and will do soft-trimming for you if there is any adapter left on the sequences. General advice for longer reads is that you can try --end-to-end first and, if the alignment rate is not acceptable, try the --local setting. Or you can also try the ChIP-seq alignment parameters that you usually use and see which alignment rate looks more reasonable.
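As a rough sketch of that comparison, the Python snippet below builds the two Bowtie 2 command lines so they differ only in alignment mode. The flags are the ones quoted earlier in this thread plus Bowtie 2's standard `-x`, `-1`, `-2`, and `-S` options. The index name and file paths are placeholders, and the helper function is hypothetical rather than part of the tutorial.

```python
import shlex


def bowtie2_command(mode: str, index: str, fastq1: str, fastq2: str, out_sam: str) -> list[str]:
    """Build a Bowtie 2 paired-end command line for the given alignment mode."""
    if mode not in ("local", "end-to-end"):
        raise ValueError("mode must be 'local' or 'end-to-end'")
    return [
        "bowtie2",
        f"--{mode}",          # alignment mode: --local or --end-to-end
        "--very-sensitive",
        "--no-mixed",
        "--no-discordant",
        "--phred33",
        "-I", "10",           # minimum fragment length
        "-X", "700",          # maximum fragment length
        "-x", index,
        "-1", fastq1,
        "-2", fastq2,
        "-S", out_sam,
    ]


# Placeholder index and file names, for illustration only.
for mode in ("end-to-end", "local"):
    cmd = bowtie2_command(
        mode, "hg38_index", "sample_R1.fastq.gz", "sample_R2.fastq.gz", f"sample_{mode}.sam"
    )
    print(shlex.join(cmd))
```

Bowtie 2 writes its alignment summary to stderr, so running both printed commands on the same sample and comparing the overall alignment rates is one way to choose between the two modes.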

Thanks, Ye

dyinboisry4u commented 3 years ago

Sorry for my unclear wording. I mean that PE25 reads may not be long enough to cover the full length of a fragment. As shown in the following figure, few reads fall within the ribosome.