ENCODE-DCC / chip-seq-pipeline2

ENCODE ChIP-seq pipeline
MIT License

Advice on output folders/files #273

gene-drive closed 2 years ago

gene-drive commented 2 years ago

I've successfully run the ChIP-seq pipeline and need advice on interpreting the output folders and files. My output folder is 176 GB, and I'm trying to determine which files to use for downstream analyses (peak analysis, motif analysis, etc.).

There are dozens of folders containing many more subfolders and files, so I'm not sure which of these I should keep and which I can trash.

[Image: idr output]

I was able to find the qc.html file, but I believe I will need, at minimum, the conservative and optimal peak set BED files and the bigWig files.

leepc12 commented 2 years ago

Please pip install croo and then run croo /PATH_OR_URI/TO/YOUR/metadata.json in the directory where you want to organize all outputs. croo will soft-link all outputs into that directory and generate an HTML file with a table describing each output.
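
Once the outputs have been soft-linked into one directory, a short script can list the files of interest together with the real files the links point to, which may help when deciding what to keep. The following is only a minimal sketch and is not part of croo or the pipeline: the function name `find_outputs` and the suffix list are hypothetical, so adjust them to the file names you actually see in your organized directory.

```python
import os
import sys
from pathlib import Path

# Hypothetical suffixes (assumption): edit to match your own output file names.
DEFAULT_SUFFIXES = (".narrowPeak.gz", ".bigwig", ".bw", ".html")


def find_outputs(root, suffixes=DEFAULT_SUFFIXES):
    """Return {suffix: [(link_path, real_path, size_bytes), ...]} for files under root."""
    found = {suffix: [] for suffix in suffixes}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            for suffix in suffixes:
                if name.endswith(suffix):
                    link = Path(dirpath) / name
                    real = link.resolve()  # follow the soft link to the real file
                    # A broken link has no target, so report its size as 0.
                    size = real.stat().st_size if real.exists() else 0
                    found[suffix].append((str(link), str(real), size))
                    break  # count each file under the first matching suffix only
    return found


if __name__ == "__main__":
    # Usage: python find_outputs.py /path/to/organized/directory
    root_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for suffix, entries in find_outputs(root_dir).items():
        print(f"{suffix}: {len(entries)} file(s)")
        for link, real, size in entries:
            print(f"  {link} -> {real} ({size} bytes)")
```

Because the entries in the organized directory are soft links, the real files they point to must stay in place; the `real_path` column shows which files under the original output folder those are.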