arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

Segmentation fault error in a subset of samples using LUMPY Express #129

Open AAvalos82 opened 8 years ago

AAvalos82 commented 8 years ago

Hi,

I have been trying to use lumpyexpress on a set of 90 haploid genomes from honey bee drones. These genomes were aligned with BWA-MEM using the -M flag. While 80 of these samples ran with no issues, 10 encountered a segmentation fault error that led to a core dump. The affected samples do not cluster by population, and the alignments were done in separate parallel batches, so the failures also do not correlate with any obvious processing pipeline error.

The analysis does produce readable vcf files, but they are corrupted, with a break around the second set of scaffolds (Group10.*). The error was consistent across all 10 samples, each producing a similarly corrupted vcf file.
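
Since the VCFs break partway through, it can help to pin down exactly where the output stops being well-formed before digging further. A minimal sketch (not part of LUMPY; the eight-column check is an assumption about how the truncation manifests, based on the VCF fixed fields):

```python
def first_bad_record(lines):
    """Return (line_number, line) of the first malformed data line, or None.

    A data line is considered malformed if it has fewer than the 8
    tab-separated fixed VCF columns (CHROM..INFO).
    """
    for i, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip header lines and blanks
        if len(line.split("\t")) < 8:
            return i, line
    return None

# Usage: with open("sample.vcf") as fh: print(first_bad_record(fh))
```

Running this over one of the corrupted VCFs should report the line at which the record structure breaks down, e.g. near the Group10.* scaffolds described above.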

Any help resolving this issue would be greatly appreciated.

Example output from lumpyexpress run with the -v flag:

Sourcing executables from /home/apps/lumpy-sv/lumpy-sv-0.2.13/scripts/lumpyexpress.config ...

Checking for required python modules (/home/apps/python/python-2.7.3/bin/python)...

create temporary directory

Warning: The index file is older than the data file: /home/a-m/aavalos/2015_12_hb_aggression_popgen/data/2016_5_11_realign_bam/W197_WHAIPI005553-18_EHB.realigned.bai
Calculating insert distributions...
Library read groups: W197_WHAIPI005553-18_EHB
Library read length: 100
Removed 1890 outliers with isize >= 586
done
0
Running LUMPY...

/home/apps/lumpy-sv/lumpy-sv-0.2.13/bin/lumpy -P \
    -t W197_WHAIPI005553-18_EHB.bam.vcf.g71iro19am1r/W197_WHAIPI005553-18_EHB.bam.vcf \
    -msw 4 \
    -tt 0 \
    -x /home/a-m/aavalos/2015_12_hb_aggression_popgen/data/ref/numt.bed \
    -pe bam_file:/home/a-m/aavalos/2015_12_hb_aggression_popgen/data/2016_7_5_discord_split/discordant_reads_bam/filtered_dis/W197_WHAIPI005553-18_EHB.filtered_disc.bam,histo_file:W197_WHAIPI005553-18_EHB.bam.vcf.g71iro19am1r/W197_WHAIPI005553-18_EHB.bam.vcf.sample1.lib1.x4.histo,mean:453.706583919,stdev:78.806904399,read_length:100,min_non_overlap:100,discordant_z:5,back_distance:10,weight:1,id:W197_WHAIPI005553-18_EHB,min_mapping_threshold:20,read_group:W197_WHAIPI005553-18_EHB \
    -sr bam_file:/home/a-m/aavalos/2015_12_hb_aggression_popgen/data/2016_7_5_discord_split/split_reads_bam/filtered_split/W197_WHAIPI005553-18_EHB.filtered_split.bam,back_distance:10,min_mapping_threshold:20,weight:1,id:W197_WHAIPI005553-18_EHB,min_clip:20 \
    > W197_WHAIPI005553-18_EHB.bam.vcf

496     0
Group1.1        1000000
Group1.10       1000000
Group1.11       1000000
...
GroupUn993      1000000
GroupUn994      1000000
GroupUn995      1000000
GroupUn997      1000000
/home/apps/lumpy-sv/lumpy-sv-0.2.13/scripts/lumpyexpress: line 411: 79691 Segmentation fault (core dumped) $LUMPY $PROB_CURVE -t ${TEMP_DIR}/${OUTBASE} -msw $MIN_SAMPLE_WEIGHT -tt $TRIM_THRES $EXCLUDE_BED_FMT $LUMPY_DISC_STRING $LUMPY_SPL_STRING > $OUTPUT

ryanlayer commented 8 years ago

To dig into this we will have to extract the problematic region of the BAM file. Please email me directly ryan dot layer at gmail to work out the particulars.
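
For isolating the problematic region, something along these lines can subset alignments around the suspect scaffolds. In practice `samtools view -b sample.bam Group10.1:1-1000000 > region.bam` does this directly on an indexed BAM; the pure-Python sketch below operates on SAM-format text and is only an illustration (the scaffold name and window coordinates are hypothetical):

```python
def reads_in_window(sam_lines, chrom, start, end):
    """Yield header lines plus SAM records whose leftmost mapping position
    (column 4, POS) falls inside [start, end] on the given reference name."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line          # keep all header lines
            continue
        fields = line.split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname == chrom and start <= pos <= end:
            yield line
```

Note this keeps only reads that *start* inside the window; reads overlapping the window from the left are ignored, which `samtools view` would include.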

amaiacc commented 7 years ago

Hi,

I got the same segmentation fault error as in #129, leading to a core dump. It only happens on a subset of my samples (6/52, human whole genomes, aligned with BWA-MEM, -M).

The output for these samples contains a readable vcf header but no calls.

I would greatly appreciate any help solving this issue.

Thanks in advance,

amaia

lumpyexpress \
>     -B t0_397.reordered.bam \
>     -S t0_397/t0_397.splitters.bam \
>     -D t0_397/t0_397.discordants.bam \
>     -T ./tmp \
>     -v -k \
>     -o BILGIN_t0_397_k_SV_lumpy.vcf
Sourcing executables from /data/corpora/MPI_workspace/lag/shared_spaces/Resource_DB/lumpy-sv/bin/lumpyexpress.config ...

Checking for required python modules (/usr/local/apps/python-2.7.11/bin/python)...

    create temporary directory
Calculating insert distributions...
Library read groups: HKAIPI000472-70,HKAIPI000472-70.1,HKAIPI000472-70.2,HKAIPI000472-70.3
Library read length: 90
Removed 130 outliers with isize >= 557
Library read groups: HKAIPI000471-71
Library read length: 90
Removed 68 outliers with isize >= 558
done
0
0
Running LUMPY...

/data/corpora/MPI_workspace/lag/shared_spaces/Resource_DB/lumpy-sv/bin/lumpy  \
    -t ./tmp/BILGIN_t0_397_k_SV_lumpy.vcf \
    -msw 4 \
    -tt 0 \
     \
     \
     -pe bam_file:/data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.discordants.bam,histo_file:./tmp/BILGIN_t0_397_k_SV_lumpy.vcf.sample1.lib1.x4.histo,mean:457.712174187,stdev:52.7052111789,read_length:90,min_non_overlap:90,discordant_z:5,back_distance:10,weight:1,id:t0_397,min_mapping_threshold:20,read_group:HKAIPI000472-70,read_group:HKAIPI000472-70.1,read_group:HKAIPI000472-70.2,read_group:HKAIPI000472-70.3 -pe bam_file:/data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.discordants.bam,histo_file:./tmp/BILGIN_t0_397_k_SV_lumpy.vcf.sample1.lib2.x4.histo,mean:457.844237813,stdev:53.707274301,read_length:90,min_non_overlap:90,discordant_z:5,back_distance:10,weight:1,id:t0_397,min_mapping_threshold:20,read_group:HKAIPI000471-71 \
     -sr bam_file:/data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.splitters.bam,back_distance:10,min_mapping_threshold:20,weight:1,id:t0_397,min_clip:20 \
    > BILGIN_t0_397_k_SV_lumpy.vcf
474     0
469     0
chrM    1000000
/data/corpora/MPI_workspace/lag/shared_spaces/Resource_DB/lumpy-sv/bin/lumpyexpress: line 480: 56317 Segmentation fault      $LUMPY $PROB_CURVE -t ${TEMP_DIR}/${OUTBASE} -msw $MIN_SAMPLE_WEIGHT -tt $TRIM_THRES $LUMPY_DEPTH_STRING $EXCLUDE_BED_FMT $LUMPY_DISC_STRING $EXCLUDE_BED_FMT $LUMPY_SPL_STRING > $OUTPUT
ryanlayer commented 7 years ago

Can you check how big this file is: /data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.discordants.bam?

amaiacc commented 7 years ago

The file is 756M. Other samples that ran without problems had similarly sized discordant-read bam files (ranging from 373M to 1.1G).

Thanks, amaia

ryanlayer commented 7 years ago

Do you think that you can send over these files:

/data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.discordants.bam
/tmp/BILGIN_t0_397_k_SV_lumpy.vcf.sample1.lib1.x4.histo
/data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.discordants.bam
/tmp/BILGIN_t0_397_k_SV_lumpy.vcf.sample1.lib2.x4.histo
/data/corpora/sge2/lag/projects/lg-hand/working/Bordeaux/SV/lumpy/t0_397/t0_397.splitters.bam

amaiacc commented 7 years ago

Dear Ryan,

I've uploaded the files here: https://owncloud.gwdg.de/index.php/s/kSmRE86MN8cTJb0.

I've noticed that a subset of my samples do not contain any split reads; their splitters.bam is empty. However, some of these samples do produce SV calls despite having no split reads. I've included one such sample, with no split reads but with SV calls in its vcf file (BILGIN_t0_508_SV_lumpy.vcf).

Thanks in advance for your help,
amaia
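
A quick way to flag header-only splitters files across many samples is `samtools view -c sample.splitters.bam`, which prints the alignment record count for a BAM. As an illustrative sketch of the same check over SAM-format text (assuming text input rather than BAM):

```python
def alignment_count(sam_lines):
    """Count alignment records (non-header, non-blank lines) in SAM text."""
    return sum(1 for line in sam_lines
               if line.strip() and not line.startswith("@"))

# A file whose count is 0 contains only headers, i.e. no split reads,
# which is the situation described for some of these samples.
```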

rbatorsky commented 7 years ago

Hello, I'm curious whether you found a resolution for this problem. I'm getting a similar error running LUMPY Express on the synthetic tumor-normal pairs from the ICGC DREAM challenge.

I'm running it like this:

$LUMPY_EXPRESS \
    -B $TUMOR_PATH,$NORMAL_PATH \
    -S ${TUMOR}.splitters.sorted.bam,${NORMAL}.splitters.sorted.bam \
    -D ${TUMOR}.discordants.sorted.bam,${NORMAL}.discordants.sorted.bam \
    -o ${TUMOR}_vs_normal.vcf

Here is my output:

Checking for required python modules (/usr/bin/python)...
Calculating insert distributions...
Library read groups: C09DF.1,C09DF.2,D0EN0.4,D0EN0.7,D0EN0.8
Library read length: 101
Removed 1606 outliers with isize >= 525
done
1
Calculating insert distributions...
Library read groups: C09DF.1,C09DF.2,D0EN0.4,D0EN0.7,D0EN0.8
Library read length: 101
Removed 1186 outliers with isize >= 515
done
1
Running LUMPY...
434 0
424 0
1 1000000 2 1000000 3 1000000 4 1000000 5 1000000 6 1000000 7 1000000 8 1000000 9 1000000 10 1000000 11 1000000 12 1000000 13 1000000 13 2000000 13 4000000 13 8000000 13 16000000 13 32000000 14 1000000 14 2000000 14 4000000 14 8000000 14 16000000 14 32000000 15 1000000 15 2000000 15 4000000 15 8000000 15 16000000 15 32000000 16 1000000 17 1000000 18 1000000 19 1000000 20 1000000 21 1000000 21 2000000 21 4000000 21 8000000 21 16000000 22 1000000 22 2000000 22 4000000 22 8000000 22 16000000 22 32000000 X 1000000 Y 1000000 Y 2000000 Y 4000000 MT 1000000 GL000207.1 1000000 GL000226.1 1000000 GL000229.1 1000000 GL000231.1 1000000 GL000210.1 1000000 GL000239.1 1000000 GL000235.1 1000000 GL000201.1 1000000 GL000247.1 1000000 GL000245.1 1000000 GL000197.1 1000000 GL000203.1 1000000 GL000246.1 1000000 GL000249.1 1000000 GL000196.1 1000000 GL000248.1 1000000 GL000244.1 1000000 GL000238.1 1000000 GL000202.1 1000000 GL000234.1 1000000 GL000232.1 1000000 GL000206.1 1000000 GL000240.1 1000000 GL000236.1 1000000 GL000241.1 1000000 GL000243.1 1000000 GL000242.1 1000000 GL000230.1 1000000 GL000237.1 1000000 GL000233.1 1000000 GL000204.1 1000000 GL000198.1 1000000 GL000208.1 1000000 GL000191.1 1000000 GL000227.1 1000000 GL000228.1 1000000 GL000214.1 1000000 GL000221.1 1000000 GL000209.1 1000000 GL000218.1 1000000 GL000220.1 1000000 GL000213.1 1000000 GL000211.1 1000000 GL000199.1 1000000 GL000217.1 1000000 GL000216.1 1000000 GL000215.1 1000000 GL000205.1 1000000 GL000219.1 1000000 GL000224.1 1000000 GL000223.1 1000000 GL000195.1 1000000 GL000212.1 1000000 GL000222.1 1000000 GL000200.1 1000000 GL000193.1 1000000 GL000194.1 1000000 GL000225.1 1000000 GL000192.1 1000000 NC_007605 1000000
lumpyexpress: line 413: 14007 Segmentation fault $LUMPY $PROB_CURVE -t ${TEMP_DIR}/${OUTBASE} -msw $MIN_SAMPLE_WEIGHT -tt $TRIM_THRES $EXCLUDE_BED_FMT $LUMPY_DISC_STRING $LUMPY_SPL_STRING > $OUTPUT

hepcat72 commented 5 years ago

This is the problem I'm having with LUMPY Express, but only when run via Galaxy; I do not get the segfault on my Mac. I want to migrate my analysis pipeline over to Galaxy, so I'm interested in getting this resolved. My 27 bam files (10 of which lead to a segfault) are fairly small, ranging from 1Mb to 2.7Mb.

Sithara85 commented 4 years ago

Hi,

I am getting a similar error. What's the solution for this? I cannot share data, as these are real patient samples. I would greatly appreciate your help.

Thanks in advance!

hepcat72 commented 4 years ago

I have worked on this issue and implemented a workaround that may get around it in some cases. See issue #276 and my unmerged pull request #277. You might even try installing my fork with those changes to see whether it works around the issue for you.

hepcat72 commented 4 years ago

My fork is likely somewhat behind the current codebase, though...

hepcat72 commented 4 years ago

I have worked on this issue some and implemented a workaround that may get around this issue in some cases. See issue #276 and my unmerged pull request #277. You might even try installing my fork with those changes to see if it works around the issue for you.

hepcat72 commented 4 years ago

My fork is likely somewhat behind the current though...