CCBR / CRISPIN

CRISPR screen pipeline
https://ccbr.github.io/CRISPIN/
MIT License

error finding library file in nf-TRIM_COUNT_MAGECK_COUNT #26

Closed slsevilla closed 11 months ago

slsevilla commented 11 months ago

Description of the bug

Error in the TRIM_COUNT:MAGECK_COUNT step. Nextflow can't find the library file yusa_library.csv. It looks like the file is being passed through correctly as an input but is not being copied into the Nextflow work directory.

executor >  slurm (4)
[8e/5d02f8] process > INPUT_CHECK:SAMPLESHEET_CHE... [100%] 1 of 1 ✔
[9c/6b4720] process > TRIM_COUNT:CUTADAPT (plasmid)  [100%] 2 of 2 ✔
[49/3ed637] process > TRIM_COUNT:MAGECK_COUNT        [100%] 1 of 1, failed: 1 ✘
[-        ] process > MAGECK:TEST                    -
[-        ] process > DRUGZ                          -
[-        ] process > BAGEL:FOLD_CHANGE              -
[-        ] process > BAGEL:BAYES_FACTOR             -
[-        ] process > BAGEL:PRECISION_RECALL         -
Execution cancelled -- Finishing pending tasks before exit
ERROR ~ Error executing process > 'TRIM_COUNT:MAGECK_COUNT'

Caused by:
  Process `TRIM_COUNT:MAGECK_COUNT` terminated with an error exit status (1)

Command executed:

  mageck count \
    -l yusa_library.csv \
    --fastq ESC1.trim.fastq.gz plasmid.trim.fastq.gz \
    --sample-label ESC1,plasmid \
    -n test

Command exit status:
  1

Command output:
  (empty)

Command error:
  WARNING: While bind mounting '/gpfs:/gpfs': destination is already in the mount point list
  INFO  @ Mon, 06 Nov 2023 06:36:18: Parameters: /usr/local/bin/mageck count -l yusa_library.csv --fastq ESC1.trim.fastq.gz plasmid.trim.fastq.gz --sample-label ESC1,plasmid -n test 
  INFO  @ Mon, 06 Nov 2023 06:36:18: Welcome to MAGeCK v0.5.9.5. Command: count 
  Traceback (most recent call last):
    File "/usr/local/bin/mageck", line 66, in <module>
      main();
    File "/usr/local/bin/mageck", line 47, in main
      mageckcount_main(args);
    File "/usr/local/lib/python3.9/site-packages/mageck/mageckCount.py", line 544, in mageckcount_main
      genedict=mageckcount_checkargs(args) # return: {sgrnaid:(seq,geneid)} 
    File "/usr/local/lib/python3.9/site-packages/mageck/mageckCount.py", line 76, in mageckcount_checkargs
      genedict=mageckcount_checklists(args) # sgid:(seq,gene)
    File "/usr/local/lib/python3.9/site-packages/mageck/mageckCount.py", line 263, in mageckcount_checklists
      for line in open(args.list_seq):
  FileNotFoundError: [Errno 2] No such file or directory: 'yusa_library.csv'

Work dir:
  /gpfs/gsfs12/users/sevillas2/cruise/test_data/work/49/3ed6379d5a88079c5df8bf633ca485

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Command used and terminal output

cruise run --mode slurm -profile test,biowulf

Relevant files

No response

System information

No response

slsevilla commented 11 months ago

Actually I lied - it isn't being copied:

[sevillas2@cn4287 3ed6379d5a88079c5df8bf633ca485]$ ls -la
total 132
drwxrwx--- 2 sevillas2 CCBR  4096 Nov  6 06:36 .
drwxrwx--- 3 sevillas2 CCBR  4096 Nov  6 06:35 ..
-rw-rw---- 1 sevillas2 CCBR     0 Nov  6 06:36 .command.begin
-rw-rw---- 1 sevillas2 CCBR  1088 Nov  6 06:36 .command.err
-rw-rw---- 1 sevillas2 CCBR  1088 Nov  6 06:36 .command.log
-rw-rw---- 1 sevillas2 CCBR     0 Nov  6 06:36 .command.out
-rw-rw---- 1 sevillas2 CCBR 12205 Nov  6 06:35 .command.run
-rw-rw---- 1 sevillas2 CCBR   160 Nov  6 06:35 .command.sh
-rw-rw---- 1 sevillas2 CCBR     0 Nov  6 06:36 .command.trace
-rw-rw---- 1 sevillas2 CCBR     1 Nov  6 06:36 .exitcode
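
For anyone debugging a similar failure, the manual ls check above can be scripted. This is only a minimal sketch: the helper name missing_inputs is hypothetical and not part of CRUISE or Nextflow, and it assumes you already know which file names the task expects to find in its work directory.

```python
from pathlib import Path


def missing_inputs(work_dir, expected):
    """Return the expected input names that are absent from a task work dir.

    Hypothetical helper, not part of CRUISE or Nextflow.
    """
    work = Path(work_dir)
    # Path.exists() follows symlinks, so a dangling symlink also counts as missing.
    return [name for name in expected if not (work / name).exists()]
```

Calling it on the failed task's work dir with the names from the mageck count command (yusa_library.csv and the two trimmed FASTQ files) would return whichever of them were never staged.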

kelly-sovacool commented 11 months ago

Thanks Sam. Can you include the command line call you used to run CRUISE? Is this in slurm mode?

slsevilla commented 11 months ago

> Thanks Sam. Can you include the command line call you used to run CRUISE? Is this in slurm mode?

Yes! Added above.

slsevilla commented 11 months ago

Tried running again in the original dir, and again after initializing in a new working dir. The file it's looking for doesn't exist.

[sevillas2@cn4301 test_data2]$ cruise run --mode local -profile test -preview
[2023:11:06 10:31:08] --------------------
[2023:11:06 10:31:08] | Nextflow command |
[2023:11:06 10:31:08] --------------------

nextflow run /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/main.nf -profile biowulf,test -preview 
[+] Loading java 17.0.3.1  ... 
[+] Loading singularity  4.0.1  on cn4301 
[+] Loading nextflow  23.10.0 
N E X T F L O W  ~  version 23.10.0
* PREVIEW * null [angry_lamarr] DSL2 - revision: d94ca9e266
CRUISE 🛳️
=============
NF version   : 23.10.0
runName      : angry_lamarr
username     : sevillas2
configs      : [/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/nextflow.config, /gpfs/gsfs12/users/sevillas2/cruise/test_data2/nextflow.config]
profile      : biowulf,test
cmd line     : nextflow run /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/main.nf -profile biowulf,test -preview
start time   : 2023-11-06T10:31:15.496466221-05:00
projectDir   : /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise
launchDir    : /gpfs/gsfs12/users/sevillas2/cruise/test_data2
workDir      : /gpfs/gsfs12/users/sevillas2/cruise/test_data2/work
homeDir      : /home/sevillas2
input        : assets/samplesheet_test_biowulf.csv

[-        ] process > INPUT_CHECK:SAMPLESHEET_CHECK -
[-        ] process > TRIM_COUNT:CUTADAPT           -
[-        ] process > TRIM_COUNT:MAGECK_COUNT       -
[-        ] process > MAGECK:TEST                   -
[-        ] process > DRUGZ                         -
[-        ] process > BAGEL:FOLD_CHANGE             -
[-        ] process > BAGEL:BAYES_FACTOR            -
[-        ] process > BAGEL:PRECISION_RECALL        -

[sevillas2@cn4301 test_data2]$ cruise run --mode slurm -profile test,biowulf
[2023:11:06 10:31:20] --------------------
[2023:11:06 10:31:20] | Nextflow command |
[2023:11:06 10:31:20] --------------------

nextflow run /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/main.nf -profile biowulf,slurm,test
Traceback (most recent call last):
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/bin/cruise", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/cruise/src/__main__.py", line 119, in main
    cli()
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/cruise/src/__main__.py", line 92, in run
    run_nextflow(
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/cruise/src/util.py", line 190, in run_nextflow
    with open(nek_base(hpc_options[hpc]["slurm"]), "r") as template:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/assets/slurm_header_biowulf.sh'
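
The traceback above fails inside a bare open() call on the Slurm header template. One way to make this kind of failure easier to diagnose is a pre-flight check that raises with a hint. The sketch below is an assumption for illustration only: require_file is a hypothetical helper and is not how CRUISE's run_nextflow is actually written.

```python
from pathlib import Path


def require_file(path, hint=""):
    """Return path as a Path, or raise FileNotFoundError with an actionable hint.

    Hypothetical helper, not taken from the CRUISE source.
    """
    p = Path(path)
    if not p.is_file():
        msg = f"Required file not found: {p}"
        if hint:
            msg += f" ({hint})"
        raise FileNotFoundError(msg)
    return p
```

A caller could wrap the template path in require_file(..., hint="was the assets directory installed?") before opening it, so the error points at the installation rather than at the open() call.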

kelly-sovacool commented 11 months ago

@slsevilla can you try running cruise init again? It should be copying the library files correctly now.

Edit: you might have to run module purge; module load ccbrpipeliner again first.

slsevilla commented 11 months ago

No, still same issue.

Did a purge:

[sevillas2@cn4301 test_data2]$ module purge; module load ccbrpipeliner
[-] Unloading ccbrpipeliner  5  Thank you for using ccbrpipeliner! 
[+] Loading ccbrpipeliner  5  ... 
###########################################################################
                                CCBR Pipeliner
###########################################################################
    "ccbrpipeliner" is a suite of end-to-end pipelines and tools
    Visit https://github.com/ccbr for more details.
    Pipelines are available on BIOWULF and FRCE.
    Tools are available on BIOWULF, HELIX and FRCE.

    The following pipelines/tools will be loaded in this module:

    RENEE v2.5 https://ccbr.github.io/RENEE/
    XAVIER v3.0 https://ccbr.github.io/XAVIER/
    CARLISLE v2.4 https://ccbr.github.io/CARLISLE/
    CHAMPAGNE v0.2 https://ccbr.github.io/CHAMPAGNE/
    CRUISE v0.1 https://ccbr.github.io/CRUISE/

    spacesavers2 v0.10 https://ccbr.github.io/spacesavers2/
    permfix v0.6 https://github.com/ccbr/permfix
###########################################################################
Thank you for using CCBR Pipeliner
###########################################################################

Removed previous dir

[sevillas2@cn4301 cruise]$ rm -r test_data2/; mkdir test_data2/; cd test_data2/

Initialize and preview work fine

[sevillas2@cn4301 test_data2]$ cruise init
[2023:11:06 10:39:58] Copying default config files to current working directory
[sevillas2@cn4301 test_data2]$ cruise run --mode local -profile test -preview
[2023:11:06 10:40:07] --------------------
[2023:11:06 10:40:07] | Nextflow command |
[2023:11:06 10:40:07] --------------------

nextflow run /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/main.nf -profile biowulf,test -preview 
[+] Loading java 17.0.3.1  ... 
[+] Loading singularity  4.0.1  on cn4301 
[+] Loading nextflow  23.10.0 
N E X T F L O W  ~  version 23.10.0
* PREVIEW * null [jovial_blackwell] DSL2 - revision: d94ca9e266
CRUISE 🛳️
=============
NF version   : 23.10.0
runName      : jovial_blackwell
username     : sevillas2
configs      : [/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/nextflow.config, /gpfs/gsfs12/users/sevillas2/cruise/test_data2/nextflow.config]
profile      : biowulf,test
cmd line     : nextflow run /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/main.nf -profile biowulf,test -preview
start time   : 2023-11-06T10:40:12.547998975-05:00
projectDir   : /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise
launchDir    : /gpfs/gsfs12/users/sevillas2/cruise/test_data2
workDir      : /gpfs/gsfs12/users/sevillas2/cruise/test_data2/work
homeDir      : /home/sevillas2
input        : assets/samplesheet_test_biowulf.csv

[-        ] process > INPUT_CHECK:SAMPLESHEET_CHECK -
[-        ] process > TRIM_COUNT:CUTADAPT           -
[-        ] process > TRIM_COUNT:MAGECK_COUNT       -
[-        ] process > MAGECK:TEST                   -
[-        ] process > DRUGZ                         -
[-        ] process > BAGEL:FOLD_CHANGE             -
[-        ] process > BAGEL:BAYES_FACTOR            -
[-        ] process > BAGEL:PRECISION_RECALL        -

Failing to submit

[sevillas2@cn4301 test_data2]$ cruise run --mode slurm -profile test,biowulf
[2023:11:06 10:40:22] --------------------
[2023:11:06 10:40:22] | Nextflow command |
[2023:11:06 10:40:22] --------------------

nextflow run /gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/main.nf -profile biowulf,slurm,test
Traceback (most recent call last):
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/bin/cruise", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/cruise/src/__main__.py", line 119, in main
    cli()
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CHAMPAGNE/v0.2/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/cruise/src/__main__.py", line 92, in run
    run_nextflow(
  File "/data/CCBR_Pipeliner/Pipelines/CRUISE/v0.1/cruise/src/util.py", line 190, in run_nextflow
    with open(nek_base(hpc_options[hpc]["slurm"]), "r") as template:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/gpfs/gsfs10/users/CCBR_Pipeliner/Pipelines/CRUISE/.v0.1.1-dev.1/cruise/assets/slurm_header_biowulf.sh'

kelly-sovacool commented 11 months ago

@slsevilla OK, sorry, I didn't read your prior comment carefully last time. Now I think I fixed this problem once and for all -- I wasn't recursively copying the Python project data files correctly.
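
To illustrate the kind of bug described here, the following is a minimal sketch of copying a data directory recursively so that nested folders such as assets/ are included. It is not the actual CRUISE fix; copy_data_files is a hypothetical name, and the sketch assumes Python 3.8 or newer for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def copy_data_files(src_dir, dest_dir):
    """Recursively copy src_dir into dest_dir, keeping nested folders.

    Hypothetical helper for illustration; not the CRUISE implementation.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    # copytree walks subdirectories; dirs_exist_ok lets dest already exist.
    shutil.copytree(src, dest, dirs_exist_ok=True)
    # Report what ended up in dest, as paths relative to it.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )
```

A non-recursive copy (for example, looping over only the top-level files) would skip everything under subdirectories, which matches the symptom of a missing assets/slurm_header_biowulf.sh.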

slsevilla commented 11 months ago

Slurm job submitted! Will update soon.