Error in NFCORE_CRISPRSEQ:CRISPRSEQ_SCREENING:MAGECK_FLUTEMLE: file is not in PNG format

bolenala commented 5 months ago

Description of the bug

The process NFCORE_CRISPRSEQ:CRISPRSEQ_SCREENING:MAGECK_FLUTEMLE starts the FluteMLE command using the data of the gene_summary.txt file. It downloads several pathway files until it reaches pathway 'hsa05213', where it stops with the error shown below.

The error did not occur when running only the 2.2.0/templates/template_fluteMLE.R script on my computer using the gene_summary.txt file. I also tried to use the latest image of mageckflute in the 2.2.0/modules/local/mageck/flutemle.nf with the same output.

Command used and terminal output

nextflow run 2.2.0 -params-file params.yaml -profile normal,singularity -resume -with-report -with-trace -with-timeline > log.txt

params.yaml
input: 'sample_sheet.csv'
outdir: 'results'
analysis: 'screening'
library: 'sgRNA_library.txt'
mle_design_matrix: 'design_matrices/T0_vs_T1_or_T2.txt' 
crisprcleanr: 'library_targets.csv'
min_reads: 3

error message:
  'select()' returned 1:1 mapping between keys and columns
  Info: Working in directory /dnext/project/78/a60432643daa4e4bad087f59d46236
  Info: Writing image file hsa05213.pathview.multi.png
  Error in png::readPNG(figure, native = FALSE) : file is not in PNG format
  Calls: FluteMLE ... arrangePathview -> lapply -> FUN -> <Anonymous> -> <Anonymous>
  In addition: There were 24 warnings (use warnings() to see them)
  Execution halted
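
A message like this from png::readPNG means the file it was given does not start with the PNG signature. One possible cause, not confirmed in this thread, is that a downloaded pathway image is truncated or is really an error page saved under a .png name. The sketch below is a minimal base R check that could be run inside the work directory to find such files; `is_png_file` and `find_bad_pngs` are hypothetical helper names, not part of MAGeCKFlute, pathview, or the pipeline.

```r
# PNG files always begin with this fixed 8-byte signature.
png_signature <- as.raw(c(0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a))

# Hypothetical helper (not part of MAGeCKFlute or the pipeline):
# TRUE only if the file exists and starts with the PNG signature.
is_png_file <- function(path) {
  if (!file.exists(path)) return(FALSE)
  header <- readBin(path, what = "raw", n = length(png_signature))
  identical(header, png_signature)
}

# Hypothetical helper: list *.png files in a directory that fail the check.
find_bad_pngs <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.png$", full.names = TRUE)
  files[!vapply(files, is_png_file, logical(1))]
}
```

Calling `find_bad_pngs()` in the directory named in the "Working in directory" line would list any .png files that readPNG is likely to reject, which can help tell a bad download apart from a problem in the plotting code.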

Relevant files

No response

System information

Nextflow version: 23.04.0
Hardware: HPC
Executor: slurm
Container engine: Singularity
Version of nf-core/crisprseq: 2.2.0

LaurenceKuhl commented 5 months ago

Hi @bolenala, would you mind giving me your MLE output and count table so I can try to reproduce this both locally and in the pipeline? I also wonder whether the issue happens if you run it with conda or docker.

bolenala commented 4 months ago

Hi @LaurenceKuhl, thank you for your response. Unfortunately, I cannot share the output with you as this is unpublished data. Do you have example data I can use to run the pipeline? I used the docker container in the pipeline on the cluster. I also ran the R script on my computer after installing the packages, so without conda or docker, and then it worked.

LaurenceKuhl commented 4 months ago

Hi @bolenala, thanks a lot for reporting this bug. We have now fixed it, and running it in the pipeline should now work.

bolenala commented 3 months ago

Hi @LaurenceKuhl, thank you so much! It works now. I had another issue, which I solved, and I wanted to share the solution with you. In the process CRISPRCLEANR_NORMALIZE I use a custom library file, so the code in the else branch is run. The column name in count_file is "Gene" and not "gene", which results in an error. I copied the lines of the if branch into the else branch, resulting in:

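else {
        # Read the custom library file (comma-separated)
        try(library <- read.delim('${library_file}', header = TRUE, sep = ","))
        # Keep only the first occurrence of each ID in the first column
        duplicates <- duplicated(library[, 1])
        unique_rows <- !duplicates
        library <- library[unique_rows, , drop = FALSE]
        # Use the first column as row names, sort by it, then drop it
        rownames(library) <- library[, 1]
        library <- library[order(rownames(library)), ]
        library <- library[, -1]
        # Rename "Gene" to "gene" so the select() below finds the column
        names(count_file)[names(count_file) == 'Gene'] <- 'gene'
        count_file_to_normalize <- count_file %>% dplyr::select(sgRNA, gene, everything())
        }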

and then it worked without errors.
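
The rename above only handles the exact spelling "Gene". A slightly more tolerant variant, shown here as a minimal base R sketch on made-up data, matches the column name case-insensitively. The toy `count_file` and its sample column are assumptions for illustration only and do not reflect the real count table.

```r
# Toy stand-in for the count table; real column names may differ.
count_file <- data.frame(
  sgRNA = c("sg1", "sg2"),
  Gene = c("TP53", "KRAS"),
  sample1 = c(10, 20)
)

# Rename any capitalization of "gene" (Gene, GENE, ...) to "gene".
is_gene_col <- tolower(names(count_file)) == "gene"
names(count_file)[is_gene_col] <- "gene"

# Put sgRNA and gene first, keeping the remaining columns in order (base R).
other_cols <- setdiff(names(count_file), c("sgRNA", "gene"))
count_file_to_normalize <- count_file[, c("sgRNA", "gene", other_cols)]
```

The base R reordering stands in for the dplyr::select(sgRNA, gene, everything()) call, so the sketch runs without extra packages.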

LaurenceKuhl commented 3 months ago

Hey closing this issue as it's now merged in the dev version :)
