neurogenomics / CUT_n_TAG

Preprocessing pipeline for CUT&TAG data.
MIT License
benchmarking encode epigenomics histone-modifications

CUT_n_TAG

Pre-/post-processing pipelines and results for CUT&TAG data generated by the Neurogenomics Lab @ Imperial College London.

Results

Links to all results. Each subheader is the unique ID of a given sequencing batch, assigned by the Imperial BRC Genomics Facility.

EpiCompare

Native ChIP-seq (GSE66023) vs. ENCODE ChIP-seq (ENCSR000AKP-ENCFF038DDS)

CUT&Tag/CUT&Run/TIP-seq vs. ENCODE

Brian's reports
Sera's reports

HK5M2BBXY

Description: Initial test run of four samples (two H3K27ac + two H3K27me3). Libraries were accidentally merged across assay types (H3K27ac/H3K27me3) during the nf-core/atacseq run (will fix).

MultiQC report

ataqv report

NF execution report

NF execution timeline

All samples

Bulk analysis

Single cell H3K27ac

Downloading new samples

When BRC sends you an email letting you know they've finished sequencing your samples, follow these steps to download and prepare the data.

Note: File and folder names are just used as examples here. You'll need to adapt these to match your particular file/folder names.

  1. Log onto HPC.
  2. If you haven't already, set up your iRODS credentials (instructions here). You only need to do this once.
  3. Move into the folder where you want to store your data.
  4. Download the data with iRODS:
    module load irods/4.2.0
    iget -Pr /igfZone/home/di.hu/IGFQ001187_hu_10-5-2021_scCutandTag/fastq/2021-05-11/HL25VBBXY
    cd HL25VBBXY
  5. Unpack each .tar file:
    tar -xvf IGFQ001187_hu_10-5-2021_scCutandTag_4_16_2021-05-11.tar 
    tar -xvf IGFQ001187_hu_10-5-2021_scCutandTag_6_16_2021-05-11.tar 
  6. Remove the .tar archives (once you're sure the previous step worked):
    rm IGFQ001187_hu_10-5-2021_scCutandTag_4_16_2021-05-11.tar
    rm IGFQ001187_hu_10-5-2021_scCutandTag_6_16_2021-05-11.tar 
  7. Optional: Change permissions recursively so that other members of your team can access the files. The example below gives the owner full access and everyone else read and execute access; adjust the scope to suit your case.
    chmod -R u=rwx,go=rx ../HL25VBBXY/
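Steps 5 and 6 can be combined so that an archive is deleted only if it unpacked cleanly. The following is a minimal sketch, not part of the original workflow; the function name `unpack_and_clean` is made up for illustration.

```shell
# Unpack every .tar archive in a directory and delete each archive
# only after tar reports success.
# NOTE: unpack_and_clean is an illustrative name, not an existing tool.
unpack_and_clean() {
    dir="${1:-.}"
    for archive in "$dir"/*.tar; do
        # Skip the unexpanded pattern when the directory has no .tar files.
        [ -e "$archive" ] || continue
        if tar -xf "$archive" -C "$dir"; then
            rm -- "$archive"
        else
            echo "Failed to unpack $archive; keeping it." >&2
            return 1
        fi
    done
}
```

For example, `unpack_and_clean HL25VBBXY` run from the parent folder would process both archives from step 5. If an archive fails to unpack, the function stops and leaves it in place for you to inspect.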

Pipelines

nf-core/atacseq

1. Set up containers on HPC

mkdir -p /rds/general/user/$USER/ephemeral/tmp/  
mkdir -p /rds/general/user/$USER/ephemeral/rtmp/

2. Download nf-core/atacseq container

Now you can download the nfcore/atacseq Singularity container from DockerHub.

3. Prepare nextflow config file

The config file tells nextflow how to run on Imperial's HPC.

4. Optional: Register Nextflow Tower

5. Download the singularity container

6. Finally run the pipeline!

nextflow run nf-core/atacseq --input raw_data/HK5M2BBXY/design.csv --genome GRCh37 -r 1.2.1 -with-singularity /rds/general/user/$USER/projects/neurogenomics-lab/live/.singularity-cache/atacseq_latest.sif
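
The `design.csv` passed to `--input` has to be written by hand or generated. Below is a rough sketch of a helper that emits one row per paired-end library. It assumes the `group,replicate,fastq_1,fastq_2` column layout described in the usage docs for the 1.x releases of nf-core/atacseq (please confirm this against the docs for the release you run) and Illumina-style `_R1_001.fastq.gz` / `_R2_001.fastq.gz` file names. The function name `design_row` and the file paths shown are hypothetical.

```shell
# Print one design-file row for a paired-end library.
# Arguments: GROUP REPLICATE R1_FASTQ
# ASSUMPTION: the R2 file sits next to R1 and differs only in the
# _R1_ / _R2_ field of the Illumina-style file name.
design_row() {
    group="$1"
    replicate="$2"
    r1="$3"
    r2=$(printf '%s' "$r1" | sed 's/_R1_001\.fastq\.gz$/_R2_001.fastq.gz/')
    printf '%s,%s,%s,%s\n' "$group" "$replicate" "$r1" "$r2"
}

# Example usage (hypothetical paths):
# {
#     printf 'group,replicate,fastq_1,fastq_2\n'
#     design_row H3K27ac  1 raw_data/HK5M2BBXY/sampleA_S1_L001_R1_001.fastq.gz
#     design_row H3K27me3 1 raw_data/HK5M2BBXY/sampleB_S2_L001_R1_001.fastq.gz
# } > raw_data/HK5M2BBXY/design.csv
```

Giving each assay type its own `group` value keeps libraries from different histone marks in separate groups.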

henipipe

CUTTAG_tutorial

CUT&RUNTools

Documentation

Excerpts from the full BRC Genome help page

File name

Illumina uses the following file name convention for the output FASTQ files.

For example: samplename_S1_L001_R1_001.fastq.gz

Please check the Illumina BCL2Fastq documentation for more information.
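
To illustrate how the example name breaks down, here is a small sketch that splits a file name into sample name, sample number (`S1`), lane (`L001`) and read (`R1`). It peels fields from the right, so underscores inside the sample name are preserved. The function name `parse_illumina_fastq_name` is made up for this example, and it assumes the name follows the pattern shown above exactly.

```shell
# Split an Illumina-style FASTQ file name into tab-separated fields:
# sample name, sample number, lane, read.
# NOTE: parse_illumina_fastq_name is an illustrative helper, not a real tool.
parse_illumina_fastq_name() {
    name="${1##*/}"            # drop any leading directory
    name="${name%.fastq.gz}"   # drop the extension
    name="${name%_*}"          # drop the trailing 001 field
    read_id="${name##*_}"; name="${name%_*}"
    lane="${name##*_}";    name="${name%_*}"
    sample_number="${name##*_}"
    sample="${name%_*}"
    printf '%s\t%s\t%s\t%s\n' "$sample" "$sample_number" "$lane" "$read_id"
}
```

Calling `parse_illumina_fastq_name samplename_S1_L001_R1_001.fastq.gz` prints `samplename`, `S1`, `L001` and `R1`, separated by tabs.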