shendurelab / MPRAflow

A portable, flexible, parallelized tool for complete processing of massively parallel reporter assay data
Apache License 2.0

Issue with barcode association in MPRAflow #79

Open tsadlon opened 1 year ago

tsadlon commented 1 year ago

Hello

I'm having issues with the barcode association pipeline in MPRAflow. The key error message in the log is:

46199 Floating point exception(core dumped) bwa index -a bwtsw design_rmIllegalChars.fa.

The full error log is:

(MPRAflow) ubuntu@ip-10-0-1-119:~/repos/MPRAflow$ nextflow run association.nf --fastq-insert "Undetermined_S0_R1_001.fastq.gz" --fastq-insertPE "Undetermined_S0_R2_001.fastq.gz" --design "Tconv_design_edit.fa" --fastq-bc "Undetermined_S0_I1_001.fastq.gz" --name "test"

N E X T F L O W ~ version 20.01.0
Launching association.nf [spontaneous_engelbart] - revision: 087b40d39f

                                      ,--./,-.
      ___     __   __   __   ___     /,-._.--~'
|\ | |__  __ /  ` /  \ |__) |__         }  {
| \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                      `._,._,'

MPRAflow v2.3.5"

Pipeline Name : MPRAflow
Pipeline Version: 2.3.5
Fastq insert : /home/ubuntu/repos/MPRAflow/Undetermined_S0_R1_001.fastq.gz
fastq paired : /home/ubuntu/repos/MPRAflow/Undetermined_S0_R2_001.fastq.gz
Fastq barcode : /home/ubuntu/repos/MPRAflow/Undetermined_S0_I1_001.fastq.gz
design fasta : /home/ubuntu/repos/MPRAflow/Tconv_design_edit.fa
minimum BC cov : 3
map quality : 30
base quality : 30
cigar string : n
min % mapped : 0.5
Output dir : outs
Run name : test
Working dir : /home/ubuntu/repos/MPRAflow/work
Container Engine: null
Current home : /home/ubuntu
Current user : ubuntu
Current path : /home/ubuntu/repos/MPRAflow
base directory : /home/ubuntu/repos/MPRAflow
Script dir : /home/ubuntu/repos/MPRAflow
Config Profile : standard

====================================================
Nextflow version 20.10 required! You are running v20.01.0.
Pipeline execution will continue, but things may break.
Please run nextflow self-update to update Nextflow.

executor > local (1)
executor > local (8)
[19/cd3696] process > count_bc_nolab [100%] 1 of 1 ✔
[19/1f5c3c] process > create_BWA_ref [ 0%] 0 of 1
[da/b1aacc] process > PE_merge [ 17%] 1 of 6
[- ] process > align_BWA_PE -
[- ] process > collect_chunks -
[- ] process > map_element_barcodes -
[- ] process > filter_barcodes -
Error executing process > 'create_BWA_ref (make ref)'

Caused by: Missing output file(s) design_rmIllegalChars.fa.fai expected by process create_BWA_ref (make ref)

Command executed:

#!/bin/bash

bwa index -a bwtsw design_rmIllegalChars.fa
samtools faidx design_rmIllegalChars.fa
picard CreateSequenceDictionary REFERENCE=design_rmIllegalChars.fa OUTPUT=design_rmIllegalChars.fa".dict"

Command exit status: 0

Command output: (empty)

Command error:
[bwa_index] Pack FASTA... 0.01 sec
[bwa_index] Construct BWT for the packed sequence...
executor > local (8)
[19/cd3696] process > count_bc_nolab [100%] 1 of 1 ✔
[19/1f5c3c] process > create_BWA_ref [100%] 1 of 1, failed: 1 ✘
[a3/a91f31] process > PE_merge [100%] 1 of 1
[- ] process > align_BWA_PE -
[- ] process > collect_chunks -
[- ] process > map_element_barcodes -
[- ] process > filter_barcodes -
WARN: Killing pending tasks (5)
Error executing process > 'create_BWA_ref (make ref)'

Caused by: Missing output file(s) design_rmIllegalChars.fa.fai expected by process create_BWA_ref (make ref)

Command executed:

#!/bin/bash

bwa index -a bwtsw design_rmIllegalChars.fa
samtools faidx design_rmIllegalChars.fa
picard CreateSequenceDictionary REFERENCE=design_rmIllegalChars.fa OUTPUT=design_rmIllegalChars.fa".dict"

Command exit status: 0

Command output: (empty)

Command error:
[bwa_index] Pack FASTA... 0.01 sec
[bwa_index] Construct BWT for the packed sequence...
.command.sh: line 2: 46199 Floating point exception(core dumped) bwa index -a bwtsw design_rmIllegalChars.fa
[faidx] Could not build fai index design_rmIllegalChars.fa.fai
INFO 2023-10-09 13:01:32 CreateSequenceDictionary

** NOTE: Picard's command line syntax is changing.
** For more information, please see:
** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
** The command line looks like this in the new syntax:
** CreateSequenceDictionary -REFERENCE design_rmIllegalChars.fa -OUTPUT design_rmIllegalChars.fa.dict

13:01:32.546 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/ubuntu/repos/MPRAflow/work/conda/mpraflow_py36-1978c54da7aacd41df3c7a4cb7639795/share/picard-2.20.8-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Mon Oct 09 13:01:32 UTC 2023] CreateSequenceDictionary OUTPUT=design_rmIllegalChars.fa.dict REFERENCE=design_rmIllegalChars.fa TRUNCATE_NAMES_AT_WHITESPACE=true NUM_SEQUENCES=2147483647 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Mon Oct 09 13:01:32 UTC 2023] Executing as ubuntu@ip-10-0-1-119 on Linux 5.15.0-1022-aws amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.8-SNAPSHOT
[Mon Oct 09 13:01:32 UTC 2023] picard.sam.CreateSequenceDictionary done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=514850816

Work dir: /home/ubuntu/repos/MPRAflow/work/19/1f5c3cabec72c8e26c068c5d00bed5

Any help would be much appreciated.

visze commented 1 year ago

It seems that the workflow fails to index your reference FASTA file. Maybe it is not in the correct format. First try indexing your FASTA file directly with samtools faidx and see whether that runs through.
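If samtools faidx also fails on the file by itself, a quick format check can help narrow down why. The following is only a minimal sketch and not part of MPRAflow: the function name `check_fasta` and the example records are hypothetical, and it assumes the design file should contain only IUPAC nucleotide codes. It does not reproduce every check samtools or bwa performs (for example, it ignores line-length consistency within a record), so treat a clean result as a hint rather than proof that the file is valid.

```python
import re

# Assumption: the design file holds DNA, so only IUPAC nucleotide codes are allowed.
VALID_SEQ = re.compile(r"[ACGTUNRYKMSWBDHV]+", re.IGNORECASE)


def check_fasta(text):
    """Return a list of human-readable problems found in FASTA-formatted text."""
    if not text.strip():
        return ["file is empty"]

    problems = []
    if "\r" in text:
        problems.append("file contains carriage returns (non-Unix line endings)")

    lengths = {}      # record name -> total sequence length seen so far
    current = None    # name of the record currently being read

    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped here, though some tools reject them
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                problems.append(f"line {lineno}: header has no name")
                current = None
                continue
            current = fields[0]  # name is the text up to the first whitespace
            if current in lengths:
                problems.append(f"line {lineno}: duplicate record name {current!r}")
            lengths.setdefault(current, 0)
        elif current is None:
            problems.append(f"line {lineno}: sequence data without a valid '>' header")
        elif not VALID_SEQ.fullmatch(line):
            problems.append(f"line {lineno}: non-nucleotide characters in record {current!r}")
        else:
            lengths[current] += len(line)

    for name, length in lengths.items():
        if length == 0:
            problems.append(f"record {name!r} has no sequence")
    return problems


# Hypothetical example records, not taken from the design file in this issue.
good = ">elem1\nACGTACGT\n>elem2\nGGCCTTAA\n"
bad = ">elem1\nACGT-ACGT\n>elem1\n\nACGT\r\n"

print(check_fasta(good))  # an empty list means no problems were found
for problem in check_fasta(bad):
    print(problem)
```

To check a real file, read it with `open(path, newline="")` so that carriage returns are preserved, and pass the contents to `check_fasta`. It may be worth checking both the original design file and the design_rmIllegalChars.fa in the work directory shown in the log, since the latter is the file bwa and samtools actually received.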