Closed SebastienNin closed 3 years ago
Hi Sebastien
The error appears when a reference index is created for BWA. Here the core dumps. See this error line:
.command.sh: line 2: 27236 Floating point exception(core dumped) bwa index -a bwtsw design_rmIllegalChars.fa
This often happens when the process runs out of memory. Can you check how much memory is available and assign more?
Best,
Max
SebastienNin @.***> wrote on Fri, 9 July 2021, 09:47:
Hi,
I'm trying to perform the basic association workflow using the tutorial from https://mpraflow.readthedocs.io/en/latest/association_example1.html
I downloaded the fastq file using the sra-toolkit command
Here is my data folder
(MPRAflow)$ ll data/
total 13558560
-rw-rw-r-- 1 sebastiennin tgml 619200 Jul 7 10:56 design.fa
-rw-rw-r-- 1 sebastiennin tgml 192857 Jul 7 10:56 GSM4237954_9MPRA_elements.fa.gz
-rw-rw-r-- 1 sebastiennin tgml 6131739252 Jul 7 20:10 SRR10800986_1.fastq.gz
-rw-rw-r-- 1 sebastiennin tgml 1403565859 Jul 7 20:10 SRR10800986_2.fastq.gz
-rw-rw-r-- 1 sebastiennin tgml 6347816751 Jul 7 20:10 SRR10800986_3.fastq.gz
I get errors when I run the following commands:
cd path/to/MPRAflow
nextflow run association.nf -w /path/to/folder/MPRAflow_tests/Assoc_Basic/work --fastq-insert "/path/to/folder/MPRAflow_tests/Assoc_Basic/data/SRR10800986_1.fastq.gz" --fastq-insertPE "/path/to/folder/MPRAflow_tests/Assoc_Basic/data/SRR10800986_3.fastq.gz" --fastq-bc "/path/to/folder/MPRAflow_tests/Assoc_Basic/data/SRR10800986_2.fastq.gz" --design "/path/to/folder/MPRAflow_tests/Assoc_Basic/data/design.fa" --name assoc_basic
Here is the terminal output
N E X T F L O W ~ version 20.01.0
Launching association.nf [elated_engelbart] - revision: 5c7544fcfa
=======================================================
,--./,-. ___ __ __ __ ___ /,-._.--~' |\ | |__ __ / ` / \ |__) |__ } { | \| | \__, \__/ | \ |___ \`-._,-`-, `._,._,'
MPRAflow v2.3"
=======================================================
Pipeline Name : MPRAflow
Pipeline Version: 2.3
Fastq insert : /path/to/folder/MPRAflow_tests/Assoc_Basic/data/SRR10800986_1.fastq.gz
fastq paired : /path/to/folder/MPRAflow_tests/Assoc_Basic/data/SRR10800986_3.fastq.gz
Fastq barcode : /path/to/folder/MPRAflow_tests/Assoc_Basic/data/SRR10800986_2.fastq.gz
design fasta : /path/to/folder/MPRAflow_tests/Assoc_Basic/data/design.fa
minimum BC cov : 3
map quality : 30
base quality : 30
cigar string : n
min % mapped : 0.5
Output dir : outs
Run name : assoc_basic
Working dir : /path/to/folder/MPRAflow_tests/Assoc_Basic/work
Container Engine: null
Current home : /cobelix/sebastiennin
Current user : sebastiennin
Current path : /path/to/MPRAflow/MPRAflow
base directory : /path/to/MPRAflow/MPRAflow
Script dir : /path/to/MPRAflow/MPRAflow
Config Profile : standard
=========================================
executor > pbs (33)
[3e/3256d2] process > count_bc_nolab [100%] 1 of 1 ✔
[9e/ddc5d2] process > create_BWA_ref [100%] 1 of 1, failed: 1 ✘
[- ] process > PE_merge -
[- ] process > align_BWA_PE -
[- ] process > collect_chunks -
[- ] process > map_element_barcodes -
[- ] process > filter_barcodes -
WARN: Killing pending tasks (30)
Error executing process > 'create_BWA_ref (make ref)'
Caused by:
Missing output file(s)
design_rmIllegalChars.fa.fai
expected by process create_BWA_ref (make ref)
Command executed:
#!/bin/bash
bwa index -a bwtsw design_rmIllegalChars.fa
samtools faidx design_rmIllegalChars.fa
picard CreateSequenceDictionary REFERENCE=design_rmIllegalChars.fa OUTPUT=design_rmIllegalChars.fa".dict"
Command exit status:
0
Command output:
(empty)
Command error:
[bwa_index] Pack FASTA... 0.00 sec
[bwa_index] Construct BWT for the packed sequence...
.command.sh: line 2: 27236 Floating point exception(core dumped) bwa index -a bwtsw design_rmIllegalChars.fa
[faidx] Could not build fai index design_rmIllegalChars.fa.fai
INFO 2021-07-09 09:36:46 CreateSequenceDictionary
** NOTE: Picard's command line syntax is changing.
** For more information, please see:
** The command line looks like this in the new syntax:
** CreateSequenceDictionary -REFERENCE design_rmIllegalChars.fa -OUTPUT design_rmIllegalChars.fa.dict
09:36:48.874 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/path/to/folder/MPRAflow_tests/Assoc_Basic/work/conda/mpraflow_py36-1978c54da7aacd41df3c7a4cb7639795/share/picard-2.20.8-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Fri Jul 09 09:36:48 CEST 2021] CreateSequenceDictionary OUTPUT=design_rmIllegalChars.fa.dict REFERENCE=design_rmIllegalChars.fa TRUNCATE_NAMES_AT_WHITESPACE=true NUM_SEQUENCES=2147483647 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Fri Jul 09 09:36:48 CEST 2021] Executing as @.*** on Linux 2.6.32-504.12.2.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_152-release-1056-b12; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.20.8-SNAPSHOT
[Fri Jul 09 09:36:48 CEST 2021] picard.sam.CreateSequenceDictionary done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=514850816
Work dir:
/path/to/folder/MPRAflow_tests/Assoc_Basic/work/9e/ddc5d268d37fdee356a063082b1183
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named
.command.sh
My 'design_rmIllegalChars.fa' file is empty, so the BWA reference can't be built. Can you help me solve this issue? I must be missing something.
Have a nice day, Sebastien
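A note on the "Command exit status: 0" line in the log above: bwa index crashed and samtools faidx failed, yet the later steps still ran and the script reported success. That is consistent with the script running under plain /bin/bash without "set -e", where the exit status is simply that of the last command (picard here). The sketch below illustrates the general bash behavior only; exit_status_of is a hypothetical helper, "false" stands in for the crashing bwa step and "true" for the final picard step. It makes no claim about how MPRAflow configures its processes.

```shell
# Hypothetical helper: run a command string in a fresh bash and print its exit status.
exit_status_of() {
    local rc=0
    bash -c "$1" || rc=$?
    echo "$rc"
}

# 'false' stands in for the failing bwa step, 'true' for the final picard step.
exit_status_of 'false; true'           # prints 0: the last command's status wins
exit_status_of 'set -e; false; true'   # prints 1: the script stops at the first failure
```

With the second form, the failure would surface at the bwa step itself rather than as a missing output file later on.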
Here is my cluster config. Do you see something wrong with it? My HPC is running with Torque scheduler.
process {
    withLabel: longtime {
        executor='pbs'
        queue='lifescope'
        memory='80GB'
        clusterOptions = '-V -S /bin/bash -l nodes=1:ppn=1,walltime=72:00:00'
    }
    withLabel: shorttime {
        executor='pbs'
        queue='lifescope'
        memory='80GB'
        clusterOptions = '-V -S /bin/bash -l nodes=1:ppn=1,walltime=01:00:00'
    }
    withLabel: highmem {
        executor='pbs'
        queue='lifescope'
        memory='80GB'
        clusterOptions = '-V -S /bin/bash -l nodes=1:ppn=1,walltime=20:00:00'
    }
}
Hi Max,
Here are the results of an "ls -l" command on the MPRAflow working folder:
-rw-rw-r-- 1 sebastiennin user 6 Jul 12 15:57 count_fastq.txt
-rw-rw-r-- 1 sebastiennin user 0 Jul 12 15:57 design_rmIllegalChars.fa
-rw-rw-r-- 1 sebastiennin user 459833 Jul 12 15:57 label_rmIllegalChars.txt
-rw-rw-r-- 1 sebastiennin user 459833 Jul 12 15:57 labels.txt
lrwxrwxrwx 1 sebastiennin user 62 Jul 12 15:57 SNP_MPRA_design.fa -> path_to_analysis_folder/SNP_MPRA_design.fa
-rw-rw-r-- 1 sebastiennin user 459833 Jul 12 15:57 t_new_label.txt
As you can see, the design_rmIllegalChars.fa file is empty, so BWA can't index it.
Have you seen this behavior before?
Have a nice day, Regards,
Sébastien
Hi Sebastien,
Sorry for my late response. The memory is definitely fine. Yes, you are right: the error seems to be caused by the empty reference file. Can you send me the reference file that you used, so I can check on my side what happened?
You can find my email address on this website: https://kircherlab.bihealth.org/
Best, Max
Hi Max, I'm following the tutorial provided on this page
If I understand correctly, the reference is built from the design.fa file containing the CRS sequences, am I right?
If yes, I did the following:
mkdir -p Assoc_Basic/data
cd Assoc_Basic/data
wget ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM4237nnn/GSM4237954/suppl/GSM4237954_9MPRA_elements.fa.gz
zcat GSM4237954_9MPRA_elements.fa.gz |awk '{ count+=1; if (count == 1) { print } else { print substr($1,1,171)}; if (count == 2) { count=0 } }' > design.fa
Regards, Sébastien
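A quick way to confirm that the design.fa produced by the awk command above is well-formed is to count header and sequence lines and check the longest sequence, which should not exceed 171 given the substr($1,1,171) truncation. This is only a sketch; fasta_summary is a hypothetical helper, not part of MPRAflow, and it assumes one sequence line per record as the tutorial command produces.

```shell
# Hypothetical helper (not part of MPRAflow): summarize a FASTA file
# that has one header line and one sequence line per record.
fasta_summary() {
    awk '
        NF == 0 { next }                                     # skip blank lines
        /^>/    { headers++; next }                          # count header lines
                { seqs++; if (length($0) > max) max = length($0) }  # track longest sequence
        END     { printf "headers=%d sequences=%d max_len=%d\n", headers, seqs, max }
    ' "$1"
}
```

Running "fasta_summary design.fa" and then the same on design_rmIllegalChars.fa would show at a glance whether records are lost between the two files; an empty file reports zero for all three values.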
Hi Sebastien,
Yes, you are right. I couldn't reproduce the error. Is the initial design.fa file that you created with the tutorial command above empty? Since someone else reported the same issue, there must be something wrong, and I will try to find it. Maybe you can try the v2.2 release and see whether you get the same issue, because if I remember correctly I changed something there.
Best, Max
Hi Max,
Yes, I got the same error as Sebastien posted.
This is the design.fa file I got using the example command line. As for Sebastien, my design_rmIllegalChars.fa is empty.
############ design.fa
Hi Max, my design.fa file is not empty, as mentioned by xunchen85.
the head of my file is
head MPRAflow_tests/Assoc_Basic/data/design.fa
>R:FOXA1-ChMod_chr7:32709843-32710013_[chr7:32709842-32710013]:012
AAGGGATAATTTAAAAGTTCCAGTAAAAGTATTGCATGCGGTACAATAAACCAAAGTCCAAGTAGGCAGCAGTGACTGGGCAGCTATCAGTCAATAATGAGACACTCCACAGGGGCATTGTTCTGTCTGCCCCAGGATGACTCATCAGCCACACTCACTGCCCACTGTTTT
>R:FOXA1-ChMod_chr16:78318895-78319032_[chr16:78318878-78319049]:079
TGATCTTTCTGAAATAGGCATGCATGTAATGATGATGTCATTAATGCTTGGCTAGCTGGTGGACTTAAACCCAGAGGGCACTTCTGAAAAGGGGCAAAGTGCATCTGCTTCTGCTTTGTTTATAGACTGTCAGCCTTGGATCTGTCACTCCCTCAGAAGGGAAGGATTGAG
>R:HNF4A-ChMod_chr5:137828875-137829045_[chr5:137828874-137829045]:081
ACAAACAGGGACTGGATCTCAGCACAGAGGGCTGCCAGCAACAGTTCCCGAGCCCCCTCCCCCCATGTTCCAGCAGGACAGCTGTCACAAAGTCCAGCTTTCTGCTGGGGAGGAGACAAGCAAGTCCCCATGTGGCCAGCTAGACCCGCCTGTGAGCCTGTGATTGTTCTG
>R:HNF4A-NoMod_chrY:18213828-18213963_[chrY:18213810-18213981]:029
TTCTTCCTGACAAAGTGACAGCCTAAAAGATCAGATTGCAGCCTAGTTAAGGAGGCAAAGTCCACTACAAAGAGGCCTTCCTGTGTAACTAGCAAGGGTCATGTATACACAGTAGGCATCAGTGAGCACATTGCTTTTCTTTTTTGGACATACTTAGTTAAGGAAATATGC
>A:HNF4A-ChMod_chr10:72112555-72112707_[chr10:72112545-72112716]:010
CCATTTTTAAATGTACAGTTCAGTAGCTTTAAGTATATTTACATTGTTGTGCAATCAACTAATCTCCAGGACTTTTGCATCTTGCGAAACGGAAACTCTTTACTTGTTAACCCCCTATTTTCCCATCCCCCAGCTGCTGGCAACCACAGAACATTATAAACTTTTTTCCAG
If I look at the head of label_rmIllegalChars.txt, I get the following:
head MPRAflow_tests/Assoc_Basic/work/3e/3256d22213084246f349db64782912/label_rmIllegalChars.txt
R:FOXA1-ChMod_chr7:32709843-32710013__chr7:32709842-32710013_:012 na
R:FOXA1-ChMod_chr16:78318895-78319032__chr16:78318878-78319049_:079 na
R:HNF4A-ChMod_chr5:137828875-137829045__chr5:137828874-137829045_:081 na
R:HNF4A-NoMod_chrY:18213828-18213963__chrY:18213810-18213981_:029 na
A:HNF4A-ChMod_chr10:72112555-72112707__chr10:72112545-72112716_:010 na
R:FOXA1-NoMod_chr8:30358109-30358253__chr8:30358095-30358266_:041 na
A:HNF4A-ChMod_chr5:82272787-82272893__chr5:82272754-82272925_:006 na
R:EP300-NoMod_chr3:57741879-57742050__chr3:57741879-57742050_:078 na
A:HNF4A-ChMod_chr9:111538379-111538538__chr9:111538373-111538544_:087 na
A:HNF4A-NoMod_chr5:125808085-125808153__chr5:125808033-125808204_:098 na
And finally, my design_rmIllegalChars.fa is empty.
Regards, Sebastien
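Comparing the two listings above, the label file has "[" and "]" replaced by "_" in each name, while the matching FASTA never gets written. As a manual cross-check, the same substitution can be applied to the FASTA headers by hand. The sketch below assumes that replacing the square brackets is the only change needed; sanitize_design is a hypothetical helper and is not how MPRAflow itself implements the step.

```shell
# Hypothetical helper (an assumption, not MPRAflow's implementation):
# replace [ and ] with _ on header lines only and refuse to continue
# if the result is empty.
sanitize_design() {
    local src="$1" dst="$2"
    # The bracket expression [][] matches either ] or [; sequence lines are untouched.
    sed '/^>/ s/[][]/_/g' "$src" > "$dst"
    if [ ! -s "$dst" ]; then
        echo "ERROR: $dst is empty, not indexing it" >&2
        return 1
    fi
}
```

For example, "sanitize_design design.fa design_rmIllegalChars.fa" turns the header ">R:FOXA1-ChMod_chr7:32709843-32710013_[chr7:32709842-32710013]:012" into ">R:FOXA1-ChMod_chr7:32709843-32710013__chr7:32709842-32710013_:012", matching the first name in label_rmIllegalChars.txt.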
+1 I'm getting the same problem on my end...an empty design_rmIllegalChars.fa file and then downstream problems as a result.
OK, I found the issue. Thanks a lot for your help. Can you try the new bugfix version 2.3.1?
Hi Max,
It works and thanks for fixing the bug.
Thanks, Xun