nf-core / cutandrun

Analysis pipeline for CUT&RUN and CUT&TAG experiments that includes QC, support for spike-ins, IgG controls, peak calling and downstream analysis.
https://nf-co.re/cutandrun
MIT License

Provide relative paths to resources in IGV session file #247

Open siddharthab opened 2 months ago

siddharthab commented 2 months ago

Description of the bug

Currently, the IGV reporting directory tries to keep all resource files inside it so that the directory is self-contained. It does so through the includeInputs mechanism, but it is not clear whether the files are supposed to be symlinks or copies. If they are copies, the data is duplicated without much added benefit. If they are symlinks, it is not clear that all systems will support them.

For us, when using the Google Cloud Batch executor with Fusion enabled, the resource files are broken: they appear to be small text files that contain a /fusion/... path rather than the data itself. There are other reports of similar behavior that are claimed to have been fixed, but we don't see the fix in the latest version.

Command used and terminal output

$ nextflow run nf-core/cutandrun # input and output parameters, IGV reporting is kept enabled.

# Inspect output directory.
$ ls -l [REDACTED]/04_reporting/igv/
total 6400
-rw-r--r--  1 sid.bagaria  staff    112 Jul 12 07:45 SAMPLE1_R1.bigWig
-rw-r--r--  1 sid.bagaria  staff    131 Jul 12 07:45 SAMPLE1_R1.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    112 Jul 12 07:45 SAMPLE1_R2.bigWig
-rw-r--r--  1 sid.bagaria  staff    131 Jul 12 07:45 SAMPLE1_R2.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    113 Jul 12 07:45 SAMPLE2_R1.bigWig
-rw-r--r--  1 sid.bagaria  staff    132 Jul 12 07:45 SAMPLE2_R1.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    113 Jul 12 07:45 SAMPLE2_R2.bigWig
-rw-r--r--  1 sid.bagaria  staff    132 Jul 12 07:45 SAMPLE2_R2.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    112 Jul 12 07:45 SAMPLE3_R1.bigWig
-rw-r--r--  1 sid.bagaria  staff    131 Jul 12 07:45 SAMPLE3_R1.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    112 Jul 12 07:45 SAMPLE3_R2.bigWig
-rw-r--r--  1 sid.bagaria  staff    131 Jul 12 07:45 SAMPLE3_R2.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    112 Jul 12 07:45 SAMPLE4_R1.bigWig
-rw-r--r--  1 sid.bagaria  staff    131 Jul 12 07:45 SAMPLE4_R1.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    112 Jul 12 07:45 SAMPLE4_R2.bigWig
-rw-r--r--  1 sid.bagaria  staff    131 Jul 12 07:45 SAMPLE4_R2.seacr.peaks.stringent.bed
-rw-r--r--  1 sid.bagaria  staff    108 Jul 12 07:45 IgG_R1.bigWig
-rw-r--r--  1 sid.bagaria  staff    108 Jul 12 07:45 IgG_R2.bigWig
-rw-r--r--  1 sid.bagaria  staff    664 Jul 12 07:45 exp_files.txt
-rw-r--r--  1 sid.bagaria  staff    128 Jul 12 07:45 gencode.v45.annotation.bed.bed.gz
-rw-r--r--  1 sid.bagaria  staff    132 Jul 12 07:45 gencode.v45.annotation.bed.bed.gz.tbi
-rw-r--r--  1 sid.bagaria  staff      0 Jul 12 07:45 gff.igv.txt
-rw-r--r--  1 sid.bagaria  staff      0 Jul 12 07:45 gtf.igv.txt
-rw-r--r--  1 sid.bagaria  staff    102 Jul 12 07:45 hg38.fa
-rw-r--r--  1 sid.bagaria  staff    106 Jul 12 07:45 hg38.fa.fai
-rw-r--r--  1 sid.bagaria  staff    664 Jul 12 07:45 igv_files.txt
-rw-r--r--  1 sid.bagaria  staff  10470 Jul 12 11:33 igv_session.xml
$ cat [REDACTED]/04_reporting/igv/SAMPLE1.bigWig  
/fusion/gs/[REDACTED]-scratch/nextflow-work/20240712T025447/cb/bdfe3386fc9ea4e223935aa9025a1f/SAMPLE1.bigWig
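
The symptom above (a tiny file whose only content is a /fusion/... path) can be checked across a whole output directory. The following is a minimal sketch, not part of the pipeline: the function name `find_pointer_stubs`, the `/fusion/` prefix default, and the 4096-byte size cutoff are all assumptions chosen for illustration.

```python
from pathlib import Path


def find_pointer_stubs(directory, prefix="/fusion/", max_bytes=4096):
    """Return {file name: target path} for files that look like dangling pointers.

    A file counts as a stub when it is small, decodes as UTF-8, and its entire
    content is one line starting with `prefix`. The prefix and the size cutoff
    are assumptions for this sketch, not values taken from Nextflow or Fusion.
    """
    stubs = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size == 0 or size > max_bytes:
            # Empty files and real data files are not stubs.
            continue
        try:
            text = path.read_bytes().decode("utf-8").strip()
        except UnicodeDecodeError:
            # Binary content, so this is real data rather than a path.
            continue
        if text.startswith(prefix) and "\n" not in text:
            stubs[path.name] = text
    return stubs
```

Calling `find_pointer_stubs("<outdir>/04_reporting/igv")` on a local copy of the results would list each affected file together with the work-directory path it points to.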

Relevant files

No response

System information

Nextflow version: 24.04.3 build 5916
Hardware: GCP VM n2-standard-16
Executor: Google Cloud Batch
Container Engine: Docker
OS: Ubuntu Linux 22.04
Version of nf-core/cutandrun: master (v3.2.2)

siddharthab commented 2 months ago

I can confirm that the logic in the atacseq pipeline works well for us because it uses relative paths instead.
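
To illustrate the relative-path idea, here is a minimal sketch of rewriting an IGV session file so that its resources are referenced relative to the session directory. This is not the atacseq pipeline's actual code. The function name `relativize_session` and the example paths are hypothetical, and the sketch assumes that IGV resolves relative resource paths against the location of the session file and that each track's `id` attribute repeats the path of its resource.

```python
import os
import xml.etree.ElementTree as ET


def relativize_session(xml_text, session_dir):
    """Rewrite absolute Resource paths in an IGV session relative to session_dir.

    Hypothetical helper for illustration only. Non-absolute values (for
    example URLs or already relative paths) are left untouched.
    """
    root = ET.fromstring(xml_text)
    mapping = {}
    for resource in root.iter("Resource"):
        old = resource.get("path")
        if old and os.path.isabs(old):
            new = os.path.relpath(old, start=session_dir)
            mapping[old] = new
            resource.set("path", new)
    # Keep track ids in sync with the rewritten resource paths
    # (assumes the id of a track is the path of its resource).
    for track in root.iter("Track"):
        track_id = track.get("id")
        if track_id in mapping:
            track.set("id", mapping[track_id])
    return ET.tostring(root, encoding="unicode")
```

With this approach the reporting directory only needs the session file itself, and the tracks stay in the directories where the pipeline already publishes them, so nothing has to be copied or symlinked.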

chris-cheshire commented 2 weeks ago

Hi, thanks for the update. I will try to implement the ATAC logic in the next version.