Closed codeunsolved closed 5 years ago
On Mon, Apr 8, 2019 at 10:51 PM codeunsolved notifications@github.com wrote:
Hi, LUMPY heroes:
Recently, I run LUMPY(v0.2.13) on NA12878 for benchmarking. It seems to output some error, but VCF file was successfully generated. So I am not sure the VCF file is valid.
I followed the workflow from README with LUMPY (traditional) usage. Here is my workflow and log: CLICK ME
[INFO] >> 1 BAMs found!
[INFO] >> Caller : lumpy
[INFO] • No.1 **.bam 94.34MB
[INFO] >> Output : /**/TEST/test_lumpy/**
[INFO] • Run lumpy v0.2.13
[INFO] [RUN_LUMPY] SampleID: **, RLEN: 150, min_non_overlap: 101, discordant_z: 5, back_distance: 10, weight: 1, min_mapping_threshold: 20, mw: 4, tt: 0
[INFO] [RUN_LUMPY] Step 0-1: Generate discordants BAM for PE
[INFO] CMD: samtools view -b -F 1294 /**/Downloads/NGS-Data/**/**.bam > /**/TEST/test_lumpy/**/**.discordants.unsorted.bam
[INFO] CMD: samtools sort /**/TEST/test_lumpy/**/**.discordants.unsorted.bam > /**/TEST/test_lumpy/**/**.discordants.bam
[INFO] CMD: rm /**/TEST/test_lumpy/**/**.discordants.unsorted.bam
[INFO] [RUN_LUMPY] Step 0-2: Generate splitters BAM for SR
[INFO] CMD: samtools view -h /**/Downloads/NGS-Data/**/**.bam | /**/miniconda3/envs/py2.7/bin/python /**/Projects/NGS/lumpy-sv/scripts/extractSplitReads_BwaMem -i stdin | samtools view -Sb - > /**/TEST/test_lumpy/**/**.splitters.unsorted.bam
[INFO] CMD: samtools sort /**/TEST/test_lumpy/**/**.splitters.unsorted.bam > /**/TEST/test_lumpy/**/**.splitters.bam
[INFO] CMD: rm /**/TEST/test_lumpy/**/**.splitters.unsorted.bam
[INFO] [RUN_LUMPY] Step 1: Generate histo
[INFO] CMD: samtools view /**/Downloads/NGS-Data/**/**.bam | tail -n 100000 | /**/miniconda3/envs/py2.7/bin/python /**/Projects/NGS/lumpy-sv/scripts/pairend_distro.py -r 150 -X 4 -N 10000 -o /**/TEST/test_lumpy/**/**.histo
[INFO] mean:218.3957 stdev:85.4206071245
[ERROR] Removed 0 outliers with isize >= 844
[INFO] [RUN_LUMPY] histo mean: 218.3957 stdev: 85.4206071245
[INFO] [RUN_LUMPY] Step 2: Run LUMPY with paired-end and split-reads
[INFO] CMD: /**/Projects/NGS/lumpy-sv/bin/lumpy -mw 4 -tt 0 -pe 'id:**,bam_file:/**/TEST/test_lumpy/**/**.discordants.bam,histo_file:/**/TEST/test_lumpy/**/**.histo,mean:218.3957,stdev:85.4206071245,read_length:150,min_non_overlap:101,discordant_z:5,back_distance:10,weight:1,min_mapping_threshold:20' -sr 'id:**,bam_file:/**/TEST/test_lumpy/**/**.splitters.bam,back_distance:10,weight:1,min_mapping_threshold:20' > /**/TEST/test_lumpy/**/**.vcf
[ERROR] 417 0
1 1000000
2 1000000
3 1000000
4 1000000
5 1000000
6 1000000
7 1000000
8 1000000
9 1000000
10 1000000
11 1000000
12 1000000
13 1000000
13 2000000
13 4000000
13 8000000
13 16000000
13 32000000
14 1000000
14 2000000
14 4000000
14 8000000
14 16000000
14 32000000
15 1000000
15 2000000
15 4000000
15 8000000
15 16000000
15 32000000
16 1000000
17 1000000
18 1000000
19 1000000
20 1000000
20 2000000
21 1000000
21 2000000
21 4000000
21 8000000
21 16000000
22 1000000
22 2000000
22 4000000
22 8000000
22 16000000
22 32000000
X 1000000
Y 1000000
Y 2000000
Y 4000000
MT 1000000
GL000226.1 1000000
GL000229.1 1000000
GL000239.1 1000000
GL000245.1 1000000
GL000197.1 1000000
GL000246.1 1000000
GL000232.1 1000000
GL000237.1 1000000
GL000204.1 1000000
GL000198.1 1000000
GL000208.1 1000000
GL000228.1 1000000
GL000214.1 1000000
GL000221.1 1000000
GL000218.1 1000000
GL000220.1 1000000
GL000199.1 1000000
GL000217.1 1000000
GL000216.1 1000000
GL000205.1 1000000
GL000219.1 1000000
GL000224.1 1000000
GL000195.1 1000000
GL000212.1 1000000
GL000222.1 1000000
GL000193.1 1000000
GL000225.1 1000000
GL000192.1 1000000
[INFO] [TIME] Lumpy: 1min 01sec
At the end of log, LUMPY output something to stderr:
417 0 1 1000000 2 1000000 3 1000000 ......
Questions: (1). Is this output ok? or somethings not expected occurred that I should pay attention to.
Looks good to me.
(2). Then, I found all REF in VCF are 'N'. Is that also the expected output?
Yes. That is instead of putting a huge string.
(3). I want to plot ROC using SU in FORMAT as threshold, Is that reasonable? It seems to be the same thing in paper.
Sure.
Many thanks!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/arq5x/lumpy-sv/issues/299, or mute the thread https://github.com/notifications/unsubscribe-auth/AAlDUXy7RMll_GHU6Meu-l10BU8XHqtJks5vfBxGgaJpZM4cjmbG .
Thank you! thank you for your help! Dr. Layer.
Hi, LUMPY heroes:
Recently, I run LUMPY(v0.2.13) on NA12878 for benchmarking. It seems to output some error, but VCF file was successfully generated. So I am not sure the VCF file is valid.
I followed the workflow from
README
withLUMPY (traditional) usage
. Here is my workflow and log:CLICK ME
``` [INFO] >> 1 BAMs found! [INFO] >> Caller : lumpy [INFO] • No.1 ******.bam 94.34MB [INFO] >> Output : /******/TEST/test_lumpy/****** [INFO] • Run lumpy v0.2.13 [INFO] [RUN_LUMPY] SampleID: ******, RLEN: 150, min_non_overlap: 101, discordant_z: 5, back_distance: 10, weight: 1, min_mapping_threshold: 20, mw: 4, tt: 0 [INFO] [RUN_LUMPY] Step 0-1: Generate discordants BAM for PE [INFO] CMD: samtools view -b -F 1294 /******/Downloads/NGS-Data/******/******.bam > /******/TEST/test_lumpy/******/******.discordants.unsorted.bam [INFO] CMD: samtools sort /******/TEST/test_lumpy/******/******.discordants.unsorted.bam > /******/TEST/test_lumpy/******/******.discordants.bam [INFO] CMD: rm /******/TEST/test_lumpy/******/******.discordants.unsorted.bam [INFO] [RUN_LUMPY] Step 0-2: Generate splitters BAM for SR [INFO] CMD: samtools view -h /******/Downloads/NGS-Data/******/******.bam | /******/miniconda3/envs/py2.7/bin/python /******/Projects/NGS/lumpy-sv/scripts/extractSplitReads_BwaMem -i stdin | samtools view -Sb - > /******/TEST/test_lumpy/******/******.splitters.unsorted.bam [INFO] CMD: samtools sort /******/TEST/test_lumpy/******/******.splitters.unsorted.bam > /******/TEST/test_lumpy/******/******.splitters.bam [INFO] CMD: rm /******/TEST/test_lumpy/******/******.splitters.unsorted.bam [INFO] [RUN_LUMPY] Step 1: Generate histo [INFO] CMD: samtools view /******/Downloads/NGS-Data/******/******.bam | tail -n 100000 | /******/miniconda3/envs/py2.7/bin/python /******/Projects/NGS/lumpy-sv/scripts/pairend_distro.py -r 150 -X 4 -N 10000 -o /******/TEST/test_lumpy/******/******.histo [INFO] mean:218.3957 stdev:85.4206071245 [ERROR] Removed 0 outliers with isize >= 844 [INFO] [RUN_LUMPY] histo mean: 218.3957 stdev: 85.4206071245 [INFO] [RUN_LUMPY] Step 2: Run LUMPY with paired-end and split-reads [INFO] CMD: /******/Projects/NGS/lumpy-sv/bin/lumpy -mw 4 -tt 0 -pe 'id:******,bam_file:/******/TEST/test_lumpy/******/******.discordants.bam,histo_file:/******/TEST/test_lumpy/******/******.histo,mean:218.3957,stdev:85.4206071245,read_length:150,min_non_overlap:101,discordant_z:5,back_distance:10,weight:1,min_mapping_threshold:20' -sr 'id:******,bam_file:/******/TEST/test_lumpy/******/******.splitters.bam,back_distance:10,weight:1,min_mapping_threshold:20' > /******/TEST/test_lumpy/******/******.vcf [ERROR] 417 0 1 1000000 2 1000000 3 1000000 4 1000000 5 1000000 6 1000000 7 1000000 8 1000000 9 1000000 10 1000000 11 1000000 12 1000000 13 1000000 13 2000000 13 4000000 13 8000000 13 16000000 13 32000000 14 1000000 14 2000000 14 4000000 14 8000000 14 16000000 14 32000000 15 1000000 15 2000000 15 4000000 15 8000000 15 16000000 15 32000000 16 1000000 17 1000000 18 1000000 19 1000000 20 1000000 20 2000000 21 1000000 21 2000000 21 4000000 21 8000000 21 16000000 22 1000000 22 2000000 22 4000000 22 8000000 22 16000000 22 32000000 X 1000000 Y 1000000 Y 2000000 Y 4000000 MT 1000000 GL000226.1 1000000 GL000229.1 1000000 GL000239.1 1000000 GL000245.1 1000000 GL000197.1 1000000 GL000246.1 1000000 GL000232.1 1000000 GL000237.1 1000000 GL000204.1 1000000 GL000198.1 1000000 GL000208.1 1000000 GL000228.1 1000000 GL000214.1 1000000 GL000221.1 1000000 GL000218.1 1000000 GL000220.1 1000000 GL000199.1 1000000 GL000217.1 1000000 GL000216.1 1000000 GL000205.1 1000000 GL000219.1 1000000 GL000224.1 1000000 GL000195.1 1000000 GL000212.1 1000000 GL000222.1 1000000 GL000193.1 1000000 GL000225.1 1000000 GL000192.1 1000000 [INFO] [TIME] Lumpy: 1min 01sec ```
At the end of log, LUMPY output something to stderr:
Questions: (1). Is this output ok? or somethings not expected occurred that I should pay attention to. (2). Then, I found all REF in VCF are 'N'. Is that also the expected output? (3). I want to plot ROC using
SU
in FORMAT as threshold, Is that reasonable? It seems to be the same thing in paper.Many thanks!