MPUSP / nf-core-crispriscreen

Process next generation sequencing data obtained from CRISPRi repression library screenings
MIT License

prepare_counts.R crashed due to wrong order of colnames #17

Closed ute-hoffmann closed 1 year ago

ute-hoffmann commented 1 year ago

Description of the bug

When running the pipeline, the script "prepare_counts.R" crashed at the "stopifnot" statement on line 79. This statement tests whether the sample names in the design table match the column names of the count matrix (df_design$sample == colnames(df_combined)[-c(1,2)]). Because the comparison is element-wise, it fails when both contain the same names in a different order. Since I assumed that the exact order of the sample names is not important, I wrapped both sides in sort(), i.e. changed the code to stopifnot(all(sort(df_design$sample) == sort(colnames(df_combined)[-c(1,2)]))). This fixed the problem for me.
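A minimal sketch of the difference between the two checks, using toy data. The contents of df_design and df_combined, and the column names sgRNA and Gene, are made up for illustration; only the two object names and the [-c(1,2)] indexing come from the report above.

```r
# Toy stand-ins for the objects named in the issue (hypothetical contents).
df_design <- data.frame(sample = c("s1", "s2", "s3"), stringsAsFactors = FALSE)
df_combined <- data.frame(
  sgRNA = c("g1", "g2"),       # assumed name of the first identifier column
  Gene  = c("geneA", "geneB"), # assumed name of the second identifier column
  s3 = c(5, 1), s1 = c(10, 0), s2 = c(7, 3), # same samples, different order
  stringsAsFactors = FALSE
)
count_cols <- colnames(df_combined)[-c(1, 2)]

# Element-wise comparison depends on order, so it is not TRUE for this data.
strict_ok <- all(df_design$sample == count_cols)

# Order-insensitive comparison; the length check avoids silent recycling
# when the two vectors differ in length.
same_samples <- length(df_design$sample) == length(count_cols) &&
  all(sort(df_design$sample) == sort(count_cols))
stopifnot(same_samples)
```

With this data the strict check is FALSE while the sorted check passes, which matches the behaviour described above. Whether ignoring the order is safe for downstream steps is a separate question.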

Command used and terminal output

nextflow run ./ -profile singularity --input "../AminoAcids_CRISPRi/input/samplesheet_CRISPRi_20221202.csv" --fasta "../AminoAcids_CRISPRi/input/Synechocystis_v2_trimmed.fasta" --outdir "../AminoAcids_CRISPRi/results" --three_prime_adapter ^CAGTGATAGAGATACTGGGAGCTA...GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC

Relevant files

No response

System information

Nextflow: 22.10.6
nf-core/crispriscreen: 1.0dev
Hardware: desktop
OS: Ubuntu

m-jahn commented 1 year ago

Should be easy to fix, yes. The df_design table is a new feature for MAGeCK, which needs slightly different input created from the sample sheet. The order might play a role; I will check. The df_design table is only required when running MAGeCK, so DESeq2 analysis is unaffected.
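If the order does turn out to matter, one possible approach would be to reorder the count columns to follow the design table before running the check. This is only a sketch with made-up data, not the fix that was applied in the pipeline; the column names sgRNA and Gene and the table contents are assumptions.

```r
# Hypothetical toy data; only the object names come from this thread.
df_design <- data.frame(sample = c("s1", "s2", "s3"), stringsAsFactors = FALSE)
df_combined <- data.frame(
  sgRNA = c("g1", "g2"),
  Gene  = c("geneA", "geneB"),
  s3 = c(5, 1), s1 = c(10, 0), s2 = c(7, 3),
  stringsAsFactors = FALSE
)

id_cols <- colnames(df_combined)[c(1, 2)]
count_cols <- colnames(df_combined)[-c(1, 2)]

# Require the same set of sample names, with no duplicates on either side.
stopifnot(
  setequal(df_design$sample, count_cols),
  anyDuplicated(df_design$sample) == 0,
  anyDuplicated(count_cols) == 0
)

# Reorder the count columns to follow the design table.
df_combined <- df_combined[, c(id_cols, df_design$sample)]

# The original strict, order-sensitive check now passes.
stopifnot(all(df_design$sample == colnames(df_combined)[-c(1, 2)]))
```

Reordering keeps the strict check in place, so a genuine mismatch in sample names would still stop the script.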