Open sreichl opened 1 year ago
ATAC-seq: Nextera adapter explanation by FD
Let's focus on the color code, since the orientation of the pieces is quite confusing. I assume you know the workflow: the transposase adds the Nextera adapters, and a subsequent PCR amplifies the fragments and adds the barcode and sequencing primer sequences to the adapters.
The Nextera sequence for trimming is given as*:
Nextera_transposase_adapter_trimming
CTGTCTCTTATACACATCT
Nextera_transposase_adapter_trimming_reverse_complement
AGATGTGTATAAGAGACAG
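A quick sanity check that the two sequences above really are reverse complements of each other. This is only a minimal sketch; the helper name `reverse_complement` is mine and not part of any Illumina tool.

```python
# Complement table for the four DNA bases (uppercase only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


trimming = "CTGTCTCTTATACACATCT"
trimming_rc = "AGATGTGTATAAGAGACAG"

# Prints True if the second sequence is the reverse complement of the first.
print(reverse_complement(trimming) == trimming_rc)
```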
What confused me was that only a substring (underlined) of the trimming sequence (gray) appears in the adapter sequence used for PCR amplification and indexing. I was looking for the whole trimming sequence and could not find it in the adapter:
Adapter_sequence
CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT
Index 1 PCR primer read
Index (variable)
Transposase adapter specific (Part 1)
Transposase adapter specific (Part 2)
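The partial match can be reproduced in a few lines. The sketch below checks whether the full reverse-complement trimming sequence occurs in the adapter sequence, and then looks for the longest stretch at the end of the adapter that matches the start of the trimming sequence. The function name `suffix_prefix_overlap` is hypothetical and only for illustration.

```python
adapter = "CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT"
trimming_rc = "AGATGTGTATAAGAGACAG"


def suffix_prefix_overlap(left: str, right: str) -> str:
    """Return the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return right[:k]
    return ""


# The whole trimming sequence is not contained in the adapter ...
print(trimming_rc in adapter)  # False

# ... but the adapter ends with the first bases of it.
overlap = suffix_prefix_overlap(adapter, trimming_rc)
print(overlap, len(overlap))  # AGATGT 6
```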
The missing piece of information was that the Nextera transposase adapter sequence used for trimming is not the complete transposable sequence that the Tn5 transposase attaches to the DNA. The complete sequences look like this:
Nextera_transposase_adapter_read_1
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
Nextera_transposase_adapter_read_2
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG
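To see how the two complete sequences above relate to the trimming sequence, one can compute the part they share at their 3' ends. This is a minimal sketch; the variable names `adapter_1` and `adapter_2` and the helper `common_suffix` are my own.

```python
import os

adapter_1 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
adapter_2 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
trimming_rc = "AGATGTGTATAAGAGACAG"


def common_suffix(a: str, b: str) -> str:
    """Return the longest suffix shared by both strings."""
    # os.path.commonprefix compares character by character,
    # so reversing both strings turns it into a common-suffix search.
    return os.path.commonprefix([a[::-1], b[::-1]])[::-1]


shared = common_suffix(adapter_1, adapter_2)
print(shared)                 # AGATGTGTATAAGAGACAG
print(shared == trimming_rc)  # True
```

Both complete sequences end in exactly the 19 bases of the reverse-complement trimming sequence and differ only in the bases before it.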
In conclusion, the Nextera trimming sequence is only a substring of the whole transposable DNA element that the transposase attaches to the DNA fragments. What I don't understand is why Illumina only uses the substring for trimming.
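One possible way to look at this, offered as an assumption rather than as Illumina's stated rationale: when a fragment is shorter than the read length, the read continues into the reverse complement of the adapter at the far end, and that reverse complement begins with the short trimming sequence. The sketch below simulates this with a made-up insert and a naive exact-match trimmer. The insert sequence and the `trim_adapter` function are hypothetical; real trimmers also handle mismatches and partial matches at the read end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


TRIMMING = "CTGTCTCTTATACACATCT"
# Second complete adapter sequence listed above.
FULL_ADAPTER = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"


def trim_adapter(read: str, adapter: str = TRIMMING) -> str:
    """Cut the read at the first exact adapter match; leave it unchanged otherwise."""
    pos = read.find(adapter)
    return read if pos == -1 else read[:pos]


# Hypothetical short insert, followed by read-through into the adapter
# (assumption: the read sees the adapter as its reverse complement).
insert = "ACGTTGCAACGGTTAACC"
read = insert + reverse_complement(FULL_ADAPTER)

trimmed = trim_adapter(read)
print(trimmed == insert)  # True
```

Under this assumption, matching only the shared substring would be enough to find where the adapter starts, regardless of which of the two complete sequences follows.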
*https://support-docs.illumina.com/SHARE/AdapterSeq/illumina-adapter-sequences.pdf
PS: Transposases are such an interesting class of enzymes. One of my favorite papers during my Master's was about viral transposable elements in the human genome, which get shuffled around when the repressive epigenetic marks are lifted during embryogenesis (https://www.nature.com/articles/nrg2072).