shendurelab / MPRAflow

A portable, flexible, parallelized tool for complete processing of massively parallel reporter assay data
Apache License 2.0

Run count.nf: ERROR ~ No such variable: cond #75

Closed: huziouziou closed this issue 1 year ago

huziouziou commented 1 year ago

When replicating the basic count workflow example, I get the following error:

ERROR ~ No such variable: cond

-- Check script 'count.nf' at line: 252 or see '.nextflow.log' file for more details

I copied your example experiment file into experiment.csv:

Condition,Replicate,DNA_BC_F,DNA_UMI,DNA_BC_R,RNA_BC_F,RNA_UMI,RNA_BC_R
HEPG2,1,SRR10800881_1.fastq.gz,SRR10800881_2.fastq.gz,SRR10800881_3.fastq.gz,SRR10800882_1.fastq.gz,SRR10800882_2.fastq.gz,SRR10800882_3.fastq.gz
HEPG2,2,SRR10800883_1.fastq.gz,SRR10800883_2.fastq.gz,SRR10800883_3.fastq.gz,SRR10800884_1.fastq.gz,SRR10800884_2.fastq.gz,SRR10800884_3.fastq.gz
HEPG2,3,SRR10800885_1.fastq.gz,SRR10800885_2.fastq.gz,SRR10800885_3.fastq.gz,SRR10800886_1.fastq.gz,SRR10800886_2.fastq.gz,SRR10800886_3.fastq.gz

I read through count.nf but could not figure out the problem. I created the conda environment from conf/mpraflow_py36.yml. I don't know what is causing this error. Could you help me?
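Before digging into count.nf, it can help to rule out problems with the experiment file itself. The following is a minimal Python sketch, not part of MPRAflow, that checks that the header contains the columns used in the example above and that every referenced FASTQ file exists under the directory passed as `--dir`. The function name `check_experiment_file` and the column list are assumptions taken from the example, not from the pipeline's code.

```python
import csv
import os

# Columns used by the example experiment file in this issue (assumed to be
# what the UMI-based count workflow expects).
EXPECTED_COLUMNS = [
    "Condition", "Replicate",
    "DNA_BC_F", "DNA_UMI", "DNA_BC_R",
    "RNA_BC_F", "RNA_UMI", "RNA_BC_R",
]


def check_experiment_file(csv_path, data_dir):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing_columns = [c for c in EXPECTED_COLUMNS if c not in header]
        if missing_columns:
            problems.append("missing columns: " + ", ".join(missing_columns))
            return problems
        # Line 1 is the header, so data rows start at line 2.
        for line_number, row in enumerate(reader, start=2):
            # Only the six FASTQ columns name files on disk.
            for column in EXPECTED_COLUMNS[2:]:
                name = (row[column] or "").strip()
                if not name:
                    problems.append(
                        "line %d: empty value in %s" % (line_number, column))
                elif not os.path.isfile(os.path.join(data_dir, name)):
                    problems.append(
                        "line %d: file not found: %s" % (line_number, name))
    return problems
```

For the setup described here, calling `check_experiment_file("./data/experiment.csv", "./data/")` and printing each returned entry would show whether any column or file is missing.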

huziouziou commented 1 year ago

Command:

nextflow run ~/tools/MPRAflow/count.nf -w ./work --experiment-file "./data/experiment.csv" --dir "./data/" --outdir "./output" --design "../Assoc_Basic/data/design.fa" --association "../Assoc_Basic/output/assoc_basic/assoc_basic_filtered_coords_to_barcodes.pickle"

Logs:

N E X T F L O W  ~  version 19.01.0
Launching `/home/hongshuai/tools/MPRAflow/count.nf` [stupefied_mandelbrot] - revision: 720e5db844
=======================================================
                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~'
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'
MPRAflow v2.3.5"
=======================================================
Pipeline Name  : shendurelab/MPRAflow
Pipeline Version: 2.3.5
Run Name       : stupefied_mandelbrot
Output dir     : ./output
Working dir    : ***
Current home   : ***
Current user   : ***
Current path   : ***
Script dir     : ***
Config Profile : standard
Experiment File: ***
reads          : DataflowQueue(queue=[])
UMIs           : Reads with UMI
BC length      : 15
BC threshold   : 10
mprAnalyze     : false
=========================================
ERROR ~ ====================================================
  Nextflow version 20.10 required! You are running v19.01.0.
  Pipeline execution will continue, but things may break.
  Please run `nextflow self-update` to update Nextflow.
============================================================

 -- Check '.nextflow.log' file for details
start analysis
ERROR ~ No such variable: cond

 -- Check script 'count.nf' at line: 252 or see '.nextflow.log' file for more details
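
The first error in this log is a version warning: the pipeline asks for Nextflow 20.10 while v19.01.0 is running. As a small illustration of why `19.01.0` does not satisfy `20.10`, the sketch below compares dotted version strings numerically. `parse_version` and `meets_minimum` are hypothetical helpers, not Nextflow functions, and the sketch assumes plain numeric versions without suffixes such as `-edge`.

```python
def parse_version(text):
    """Turn a dotted version such as '19.01.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def meets_minimum(running, required):
    """Return True when the running version is at least the required one."""
    have = parse_version(running)
    need = parse_version(required)
    # Pad the shorter tuple with zeros so '20.10' compares equal to '20.10.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


print(meets_minimum("19.01.0", "20.10"))  # False
print(meets_minimum("20.10.0", "20.10"))  # True
```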

huziouziou commented 1 year ago

Logs with Nextflow 20.10:

N E X T F L O W  ~  version 20.10.0
Launching `/home/hongshuai/tools/MPRAflow/count.nf` [nauseous_fermat] - revision: 720e5db844
=======================================================
                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~'
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'
MPRAflow v2.3.5"
=======================================================
Pipeline Name  : shendurelab/MPRAflow
Pipeline Version: 2.3.5
Run Name       : nauseous_fermat
Output dir     : ***
Working dir    : ***
Current home   : ***
Current user   : ***
Current path   : ***
Script dir     : ***
Config Profile : standard
Experiment File: ***
reads          : DataflowQueue(queue=[])
UMIs           : Reads with UMI
BC length      : 15
BC threshold   : 10
mprAnalyze     : false
=========================================
start analysis
[-        ] process > create_BAM           -
[-        ] process > raw_counts           -
[-        ] process > filter_counts        -
[-        ] process > final_counts         -
[-        ] process > dna_rna_merge_counts -
[-        ] process > dna_rna_merge        -
[-        ] process > calc_correlations    -
[-        ] process > make_master_tables   -
Error executing process > 'create_BAM (make idx)'

Caused by:
  Process `create_BAM` input file name collision -- There are multiple input files for each of the following file names: null

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
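
The message says that several inputs of one `create_BAM` task ended up with the same file name, here the literal name `null`. A rough way to look for this kind of collision outside Nextflow is to group the paths intended for one task by base name and report any name that occurs more than once. The sketch below does that in Python. `find_name_collisions` is a hypothetical helper, and the example paths are made up to mimic a case in which the file-name part resolves to `null`. It does not reproduce Nextflow's own staging logic.

```python
import os
from collections import Counter


def find_name_collisions(paths):
    """Return the base names shared by more than one path, sorted."""
    counts = Counter(os.path.basename(str(path)) for path in paths)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical inputs for one task: three paths whose file-name part is 'null'.
task_inputs = ["./data/null", "./data/null", "./data/null"]
print(find_name_collisions(task_inputs))  # ['null']

# Distinct names, as in the example experiment file, do not collide.
ok_inputs = [
    "./data/SRR10800881_1.fastq.gz",
    "./data/SRR10800881_2.fastq.gz",
    "./data/SRR10800881_3.fastq.gz",
]
print(find_name_collisions(ok_inputs))  # []
```
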

huziouziou commented 1 year ago

I resolved this issue by modifying count.nf as shown below:

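// Create FASTQ channels: each row of the experiment CSV yields one DNA tuple and one RNA tuple
if (params.no_umi) {
  // Without UMIs: forward and reverse barcode reads only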
  reads_noUMI = Channel.fromPath(params.experiment_file).splitCsv(header: true).flatMap{
    row -> [
      tuple(
        cond=row.Condition,
        rep=row.Replicate,
        type="DNA",
        datasetID=[row.Condition,row.Replicate,"DNA"].join("_"),
        fw_fastq=file([params.dir,"/",row.DNA_BC_F].join()),
        rev_fastq=file([params.dir,"/",row.DNA_BC_R].join())
      ),
      tuple(
        cond=row.Condition,
        rep=row.Replicate,
        type="RNA",
        datasetID=[row.Condition,row.Replicate,"RNA"].join("_"),
        fw_fastq=file([params.dir,"/",row.RNA_BC_F].join()),
        rev_fastq=file([params.dir,"/",row.RNA_BC_R].join())
      )
    ]
  }
} else {
  // With UMIs: forward barcode read, UMI read, and reverse barcode read
  reads = Channel.fromPath(params.experiment_file).splitCsv(header: true).flatMap{
    row -> [
      tuple(
        cond=row.Condition,
        rep=row.Replicate,
        type="DNA",
        datasetID=[row.Condition,row.Replicate,"DNA"].join("_"),
        fw_fastq=file([params.dir,"/",row.DNA_BC_F].join()),
        umi_fastq=file([params.dir,"/",row.DNA_UMI].join()),
        rev_fastq=file([params.dir,"/",row.DNA_BC_R].join()),
      ),
      tuple(
        cond=row.Condition,
        rep=row.Replicate,
        type="RNA",
        datasetID=[row.Condition,row.Replicate,"RNA"].join("_"),
        fw_fastq=file([params.dir,"/",row.RNA_BC_F].join()),
        umi_fastq=file([params.dir,"/",row.RNA_UMI].join()),
        rev_fastq=file([params.dir,"/",row.RNA_BC_R].join()),
      ),
    ]
  }
}
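
To make the data flow of the `else` branch easier to follow, here is a rough Python emulation of what `splitCsv(header: true).flatMap{...}` produces: every CSV row becomes two records, one for DNA and one for RNA, each carrying a dataset ID and the three FASTQ paths. This only illustrates the structure and is not Nextflow code. `rows_to_records` is a hypothetical name, and the paths are built by plain string joining, as in the snippet above.

```python
import csv
import io


def rows_to_records(csv_text, data_dir):
    """Emulate flatMap: one (cond, rep, type, datasetID, fw, umi, rev) tuple per library."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Each row describes a DNA library and an RNA library.
        for library in ("DNA", "RNA"):
            dataset_id = "_".join([row["Condition"], row["Replicate"], library])
            records.append((
                row["Condition"],
                row["Replicate"],
                library,
                dataset_id,
                data_dir + "/" + row[library + "_BC_F"],
                data_dir + "/" + row[library + "_UMI"],
                data_dir + "/" + row[library + "_BC_R"],
            ))
    return records


example = (
    "Condition,Replicate,DNA_BC_F,DNA_UMI,DNA_BC_R,RNA_BC_F,RNA_UMI,RNA_BC_R\n"
    "HEPG2,1,SRR10800881_1.fastq.gz,SRR10800881_2.fastq.gz,SRR10800881_3.fastq.gz,"
    "SRR10800882_1.fastq.gz,SRR10800882_2.fastq.gz,SRR10800882_3.fastq.gz\n"
)
for record in rows_to_records(example, "./data"):
    print(record[3], record[4])
# HEPG2_1_DNA ./data/SRR10800881_1.fastq.gz
# HEPG2_1_RNA ./data/SRR10800882_1.fastq.gz
```

With the three-replicate experiment file from this issue, the same function would return six records, two per row.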