Closed twang15 closed 3 years ago
Transcription factors (TFs) are proteins that bind to regulatory regions in the genome and thereby regulate gene expression.
What is the key difference between DNase-seq and ChIP-seq?
ChIP-seq identifies the locations in the genome bound by specific proteins.
Advantage:
Disadvantage:
DNase-seq is one of several approaches in molecular biology used to identify DNA response elements, or regulatory regions in general, through genome-wide sequencing of regions sensitive to cleavage by DNase I.
A brief outline of the technique is the following:
Pros
Cons
What is the difference between ChIP-seq and DNase-seq?
In ChIP-seq, you first isolate chromatin and then use an antibody to immunoprecipitate a specific factor in the chromatin, for example a histone mark or a transcription factor. The DNA that was bound to the factor is then sequenced, so you can find out which genomic regions were bound by the factor at the moment of chromatin isolation.
DNase-seq is used to find areas of open chromatin, which are accessible to DNase I digestion, without necessarily knowing what was bound to the open chromatin in terms of transcription factors, etc.
A newer, improved method to look at open chromatin is called ATAC-seq (http://www.nature.com/nmeth/journal/v10/n12/full/nmeth.2688.html). It takes advantage of an engineered transposase that carries the sequencing adapters needed to form a library for next-generation sequencing and that can insert them only in exposed DNA, in other words, in areas of open chromatin. This method is faster, you can start with fewer cells, and it seems to be less noisy than DNase-seq. We are using this method in my lab right now and having fun with it.
Of course, one can correlate ChIP-seq with DNase-seq or ATAC-seq, as there are certain histone marks that correlate with actively transcribed or open chromatin.
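One simple way to quantify the correlation described above is to ask what fraction of ChIP-seq peaks fall inside open chromatin. The following is a minimal sketch, assuming peaks and open regions are available as (chromosome, start, end) tuples in half-open BED-style coordinates. The function names and the toy coordinates are hypothetical and serve only as illustration; they are not taken from any real dataset.

```python
def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals given as (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def fraction_in_open_chromatin(chip_peaks, open_regions):
    """Fraction of ChIP-seq peaks that overlap at least one open-chromatin region."""
    if not chip_peaks:
        return 0.0
    hits = 0
    for chrom, start, end in chip_peaks:
        # A peak counts once, no matter how many open regions it touches.
        if any(
            c == chrom and overlap_bp((start, end), (s, e)) > 0
            for c, s, e in open_regions
        ):
            hits += 1
    return hits / len(chip_peaks)


# Toy, made-up coordinates (assumption: illustration only).
chip_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
open_regions = [("chr1", 150, 300), ("chr2", 400, 500)]

frac = fraction_in_open_chromatin(chip_peaks, open_regions)
print(f"{frac:.2f} of ChIP-seq peaks lie in open chromatin")
```

The nested scan is quadratic, which is fine for a toy example. For genome-scale peak sets, a sorted-interval approach or a dedicated tool would likely be a better choice.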
Position frequency matrices, describing transcription factor motifs
Motif logos: a graphical representation of TF binding affinity (i.e., of the PWMs)
Hidden Markov Model
single nucleotide polymorphisms (SNPs)
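To make the relationship between position frequency matrices and the PWMs behind motif logos concrete, here is a minimal sketch that converts a count matrix into log2-odds scores and scores a short sequence. The pseudocount of 0.5, the uniform background of 0.25, the function names, and the toy counts are all assumptions chosen for illustration. This is one common convention, not necessarily the one used by RGT or any particular motif database.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    pfm maps each base to a list of counts, one entry per motif position.
    """
    length = len(pfm["A"])
    pwm = {b: [] for b in BASES}
    for i in range(length):
        # The pseudocount avoids log(0) for bases never observed at a position.
        total = sum(pfm[b][i] for b in BASES) + 4 * pseudocount
        for b in BASES:
            freq = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(freq / background))
    return pwm


def score(pwm, seq):
    """Sum the log-odds scores of seq, which must be as long as the motif."""
    return sum(pwm[base][i] for i, base in enumerate(seq))


# Toy counts for a 3 bp motif (made up for illustration).
toy_pfm = {"A": [8, 0, 1], "C": [0, 9, 1], "G": [1, 0, 7], "T": [1, 1, 1]}
toy_pwm = pfm_to_pwm(toy_pfm)

print(score(toy_pwm, "ACG"))  # consensus-like sequence, positive score
print(score(toy_pwm, "TTT"))  # poor match, negative score
```

A motif logo shows the same per-position preferences graphically, with letter heights reflecting how strongly each position favors a base.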
read alignments (BAM files), genomic profiles (wig/bigWig files) and genomic regions (bed, vcf files); RGT includes classes for handling genome annotations, such as transcripts and genes from standard formats (gtf files), and motif databases (transfac format)
Transcription factors
HINT method description, 2014 paper
low-level analysis: read alignment with Bowtie2 and peak calling with MACS2
transcriptome/exome differential analysis
tag, tag count
footprint
// install other genome reference
cd ~/rgtdata
python setupGenomicData.py --mm9
rgt-hint footprinting --atac-seq --paired-end --organism=mm9 --output-location=./ --output-prefix=cDC1 cDC1.bam cDC1_peaks.narrowPeak
rgt-hint footprinting --atac-seq --paired-end --organism=mm9 --output-location=./ --output-prefix=pDC pDC.bam pDC_peaks.narrowPeak
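Since the two footprinting calls above differ only in the sample name, they can be generated from a small helper. The sketch below only assembles the command lines using the flags already shown above. The helper name `footprinting_cmd` is hypothetical, and the file naming pattern (`<sample>.bam`, `<sample>_peaks.narrowPeak`) is an assumption taken from the two examples. Actually running the commands is left commented out, because it requires `rgt-hint` to be installed.

```python
import shlex


def footprinting_cmd(sample, organism="mm9", out_dir="./"):
    """Build the rgt-hint footprinting command for one sample as an argument list."""
    return [
        "rgt-hint", "footprinting", "--atac-seq", "--paired-end",
        f"--organism={organism}",
        f"--output-location={out_dir}",
        f"--output-prefix={sample}",
        f"{sample}.bam",               # aligned reads
        f"{sample}_peaks.narrowPeak",  # peak regions for the same sample
    ]


commands = [footprinting_cmd(s) for s in ("cDC1", "pDC")]
for cmd in commands:
    print(shlex.join(cmd))
    # To execute instead of printing (needs rgt-hint on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if a sample name ever contains spaces.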
rgt-hint tracks --bc --bigWig --organism=mm9 cDC1.bam cDC1_peaks.narrowPeak --output-prefix=cDC1_BC
rgt-hint tracks --bc --bigWig --organism=mm9 pDC.bam pDC_peaks.narrowPeak --output-prefix=pDC_BC
We observe that this gene has several open chromatin regions for these two cell types, but one particular region has cDC1 specific footprints.
Summary of Analysis Flow
Tao: footprinting sounds like a unique signature of a single cell on open chromatin regions. Would it be interesting to identify
Approach: We can do this by first finding motifs overlapping with predicted footprints
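The overlap step in this approach can be sketched in a few lines. This is a minimal illustration, assuming footprints are (chromosome, start, end) tuples and motif hits are (chromosome, start, end, motif name) tuples in half-open BED-style coordinates. The function name, the motif names `MOTIF_A` and `MOTIF_B`, and all coordinates are hypothetical placeholders, not results from the cDC1 or pDC data.

```python
from collections import Counter


def motifs_in_footprints(motif_hits, footprints):
    """Count, per motif name, the hits that overlap at least one footprint."""
    # Group footprints by chromosome so each hit is only compared locally.
    by_chrom = {}
    for chrom, start, end in footprints:
        by_chrom.setdefault(chrom, []).append((start, end))

    counts = Counter()
    for chrom, start, end, name in motif_hits:
        for fp_start, fp_end in by_chrom.get(chrom, []):
            # Half-open intervals overlap when each starts before the other ends.
            if start < fp_end and fp_start < end:
                counts[name] += 1
                break  # count each motif hit at most once
    return counts


# Toy, made-up data for illustration only.
footprints = [("chr1", 100, 120), ("chr1", 300, 330)]
motif_hits = [
    ("chr1", 105, 115, "MOTIF_A"),  # inside the first footprint
    ("chr1", 118, 130, "MOTIF_B"),  # partially overlaps the first footprint
    ("chr1", 200, 210, "MOTIF_A"),  # no footprint here
    ("chr1", 325, 335, "MOTIF_A"),  # partially overlaps the second footprint
    ("chr2", 100, 110, "MOTIF_B"),  # chromosome without footprints
]

counts = motifs_in_footprints(motif_hits, footprints)
for name, n in sorted(counts.items()):
    print(name, n)
```

Comparing such per-motif counts between two cell types would be one possible starting point for asking which factors have cell-type-specific footprints.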
Meeting note on Feb 26, 2021