Original comment by zack...@gmail.com
on 12 Sep 2011 at 6:41
Where can I get a working bin for this, and how do I run it?
Original comment by zack...@gmail.com
on 19 Sep 2011 at 9:52
Hi, zack, you can download the binary jar file and User Manual from this link:
http://epigenome.usc.edu/publicationdata/bissnp2011/
and the source code is available here:
https://uecgatk.svn.sourceforge.net/svnroot/uecgatk/trunk/src/edu/usc/epigenome/uecgatk/bisulfitesnpmodel
Original comment by lyping1...@gmail.com
on 19 Sep 2011 at 10:41
Great!
I was going to download it and work with you on getting a good default set of
arguments, but hpcc is down, so none of the files are available.
Original comment by zack...@gmail.com
on 19 Sep 2011 at 10:47
I have it in the pipeline, but I have hardcoded PE ("-pem") into the call, so it
fails on SR BAMs. Is it possible to make it determine PE/SR by itself?
Original comment by zack...@gmail.com
on 16 Feb 2012 at 12:12
Yes, I think so. I will change it to recognize PE/SE. Thanks.
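In the meantime, a pipeline wrapper could make the PE/SR decision itself from the SAM FLAG field, where bit 0x1 marks a read that belongs to a paired template. The following is only a sketch of that idea, not how Bis-SNP does it: `looks_paired_end` and `max_reads` are hypothetical names, and the input is assumed to be SAM text lines such as those printed by `samtools view -h`.

```python
PAIRED_FLAG = 0x1  # SAM FLAG bit: template has multiple segments (paired-end)


def looks_paired_end(sam_lines, max_reads=1000):
    """Return True if any of the first max_reads alignment records is paired.

    sam_lines is any iterable of SAM text lines (header lines are skipped).
    """
    checked = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line, not an alignment record
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:  # malformed or empty line
            continue
        if int(fields[1]) & PAIRED_FLAG:
            return True
        checked += 1
        if checked >= max_reads:
            break
    return False
```

The wrapper would then add "-pem" to the call only when this returns True.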
Original comment by lyping1...@gmail.com
on 16 Feb 2012 at 12:20
Added to the production pipeline; let's see how it does over the next few runs.
Original comment by zack...@gmail.com
on 4 Apr 2012 at 7:12
Can we run this on merged lanes? The most important are the TCGA merges. But
NOMe-seq libraries will sometimes be multiple lanes as well.
The approved TCGA output for Bis-SNP is still being determined. But I think we
should generate the following "modified wiggle" format for the time being (CpG
only).
track type=wiggle_0 name=sample alwaysZero=off
variableStep chrom=chr1
469 25 4
480 80 10
800 88 8
Where the first two fields are standard wiggle (field1=genomic
coordinate, field2=percent methylated), and field3 is the total number
of (C or T) reads covering the position.
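To make the three-field layout concrete, here is a minimal sketch that writes it. It is an illustration only, not an approved TCGA writer: the function name, the input tuples of (chromosome, position, methylated C reads, total C or T reads), and the rounding of the percentage to a whole number are all assumptions.

```python
def write_modified_wiggle(sites, name="sample"):
    """Render CpG sites as "modified wiggle" text.

    sites: iterable of (chrom, pos, methylated_reads, total_reads),
    assumed to be sorted by chromosome and position.
    """
    lines = [f"track type=wiggle_0 name={name} alwaysZero=off"]
    current_chrom = None
    for chrom, pos, meth, total in sites:
        if total == 0:  # no C or T reads, so no percentage to report
            continue
        if chrom != current_chrom:  # start a new variableStep block
            lines.append(f"variableStep chrom={chrom}")
            current_chrom = chrom
        percent = round(100 * meth / total)  # assumption: whole-number percent
        lines.append(f"{pos} {percent} {total}")
    return "\n".join(lines) + "\n"
```

For instance, a site at position 469 with 1 methylated read out of 4 would be written as "469 25 4".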
Original comment by benb...@gmail.com
on 4 Apr 2012 at 7:21
For merged lanes, you can pass each lane's BAM with its own -I option, e.g.
-I lane_1.bam -I lane_2.bam ... But if the BAM files for different lanes carry
different read group names, BisSNP will currently call them separately, as
different samples. How is the read group name assigned to the BAM file: by lane
name or by sample name?
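One way to check this before merging is to compare the read group headers of the lane BAMs. The sketch below assumes that, as in GATK generally, the sample is taken from the SM tag of the @RG header lines; whether Bis-SNP keys on SM or on another read group field is not confirmed here. The function names are hypothetical, and the input is header text such as the output of `samtools view -H`.

```python
def read_group_samples(header_lines):
    """Map read-group ID -> SM (sample) value from @RG header lines."""
    samples = {}
    for line in header_lines:
        if not line.startswith("@RG"):
            continue
        fields = line.rstrip("\n").split("\t")[1:]
        # Each field looks like "ID:lane1" or "SM:tumorA"
        tags = dict(f.split(":", 1) for f in fields if ":" in f)
        samples[tags.get("ID")] = tags.get("SM")
    return samples


def share_one_sample(headers):
    """headers: one list of header lines per BAM. True if all SM values agree."""
    names = set()
    for header in headers:
        names.update(read_group_samples(header).values())
    return len(names) == 1
```

If this returns False for a set of lanes, they would presumably be called as separate samples until the read groups are rewritten.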
For the "modified wiggle" format, I have already written a Perl script that
converts the VCF file generated by BisSNP.
Original comment by lyping1...@gmail.com
on 4 Apr 2012 at 8:58
Here are my Perl scripts to convert from VCF to a wiggle or modified wiggle file:
http://epigenome.usc.edu/publicationdata/bissnp2011/vcf2cpg_wig_file.pl
http://epigenome.usc.edu/publicationdata/bissnp2011/vcf2cpg_wig_plus_file.pl
Original comment by lyping1...@gmail.com
on 4 Apr 2012 at 10:19
Our latest attempt didn't seem to work. Here are the errors:
/home/uec-00/shared/production/software/java/default/bin/java -Xmx20G -jar
/home/uec-00/shared/production/software/bissnp/bissnp-default.jar -aecm -R
/home/uec-00/shared/production/genomes/hg19_rCRSchrm/hg19_rCRSchrm.fa -T
BisulfiteGenotyper -I ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam
-D /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf -vfn1
ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam.cpg.raw.vcf -vfn2
ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam.snp.raw.vcf
-stand_call_conf 30 -stand_emit_conf 0 -L
/home/uec-00/shared/production/software/bissnp/wholegenome_interval_list.hg19.bed
-out_modes DEFAULT_FOR_TCGA -single_sample normal_test -nt 5 -rgv hg19 -mbq 0
-mmq 30
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
org.broadinstitute.sting.utils.exceptions.ReviewedStingException: An error occurred during the traversal.
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.getTraversalError(HierarchicalMicroScheduler.java:356)
at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:105)
at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:246)
at edu.usc.epigenome.uecgatk.BisSNP.BisSNP.execute(BisSNP.java:275)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:236)
at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:146)
at edu.usc.epigenome.uecgatk.BisSNP.BisSNP.main(BisSNP.java:134)
Caused by: java.lang.IllegalArgumentException: Duplicate allele added to VariantContext: A
at org.broadinstitute.sting.utils.variantcontext.VariantContext.makeAlleles(VariantContext.java:1182)
at org.broadinstitute.sting.utils.variantcontext.VariantContext.<init>(VariantContext.java:277)
at org.broadinstitute.sting.utils.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:392)
at edu.usc.epigenome.uecgatk.BisSNP.BisulfiteGenotyperEngine.createVariantContextFromLikelihoods(BisulfiteGenotyperEngine.java:574)
at edu.usc.epigenome.uecgatk.BisSNP.BisulfiteGenotyperEngine.calculateLikelihoods(BisulfiteGenotyperEngine.java:175)
at edu.usc.epigenome.uecgatk.BisSNP.BisulfiteGenotyperEngine.calculateLikelihoodsAndGenotypes(BisulfiteGenotyperEngine.java:138)
at edu.usc.epigenome.uecgatk.BisSNP.BisulfiteGenotyperEngine.<init>(BisulfiteGenotyperEngine.java:104)
at edu.usc.epigenome.uecgatk.BisSNP.BisulfiteGenotyper.map(BisulfiteGenotyper.java:328)
at edu.usc.epigenome.uecgatk.BisSNP.BisulfiteGenotyper.map(BisulfiteGenotyper.java:1)
at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:78)
at org.broadinstitute.sting.gatk.traversals.TraverseLoci.traverse(TraverseLoci.java:18)
at org.broadinstitute.sting.gatk.executive.ShardTraverser.call(ShardTraverser.java:72)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.5-3-gbb2c10b):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: An error occurred during the traversal.
##### ERROR ------------------------------------------------------------------------------------------
mv: cannot stat `ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam.cpg.raw.vcf': No such file or directory
mv: cannot stat `ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam.snp.raw.vcf': No such file or directory
----------------------------------------
Begin PBS Prologue Fri Apr 6 22:46:02 PDT 2012
Job ID: 1383711.hpc-pbs.usc.edu
Username: ramjan
Group: hsc-ar
Name: uec_C02APACXX_C02APACXX_5_NIC1254A13_uscec_bissnp663669832097810418.sh
Queue: laird_exe
Shared Access: yes
Nodes: hpc2701
TMPDIR: /tmp/1383711.hpc-pbs.usc.edu
End PBS Prologue Fri Apr 6 22:46:02 PDT 2012
----------------------------------------
INFO 22:46:06,397 RodBindingArgumentTypeDescriptor - Dynamically determined type of /home/uec-00/shared/production/software/bissnp/wholegenome_interval_list.hg19.bed to be BED
INFO 22:46:06,437 HelpFormatter - ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
INFO 22:46:06,437 HelpFormatter - The Bis-SNP-0.54, Compiled 2012/03/14 15:25:14
INFO 22:46:06,438 HelpFormatter - Based on The Genome Analysis Toolkit (GATK) v1.5-3-gbb2c10b (prebuild GATK package could be download here: ftp://ftp.broadinstitute.org/pub/gsa/GenomeAnalysisTK/GenomeAnalysisTK-1.5-3-gbb2c10b.tar.bz2)
INFO 22:46:06,438 HelpFormatter - Copyright (c) 2011 USC Epigenome Center
INFO 22:46:06,438 HelpFormatter - Please view our documentation at http://epigenome.usc.edu/publicationdata/bissnp2011/
INFO 22:46:06,438 HelpFormatter - For support, please send email to lyping1986@gmail.com or benbfly@gmail.com
INFO 22:46:06,439 HelpFormatter - Program Args: -aecm -R /home/uec-00/shared/production/genomes/hg19_rCRSchrm/hg19_rCRSchrm.fa -T BisulfiteGenotyper -I ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam -D /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf -vfn1 ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam.cpg.raw.vcf -vfn2 ResultCount_C02APACXX_5_NIC1254A13.hg19_rCRSchrm.fa.bam.snp.raw.vcf -stand_call_conf 30 -stand_emit_conf 0 -L /home/uec-00/shared/production/software/bissnp/wholegenome_interval_list.hg19.bed -out_modes DEFAULT_FOR_TCGA -single_sample normal_test -nt 5 -rgv hg19 -mbq 0 -mmq 30
INFO 22:46:06,439 HelpFormatter - Date/Time: 2012/04/06 22:46:06
INFO 22:46:06,439 HelpFormatter - ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
INFO 22:46:06,439 HelpFormatter - ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
INFO 22:46:06,494 RodBindingArgumentTypeDescriptor - Dynamically determined type of /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf to be VCF
INFO 22:46:06,540 GenomeAnalysisEngine - Strictness is SILENT
INFO 22:46:06,637 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 22:46:06,693 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.06
INFO 22:46:06,745 RMDTrackBuilder - Loading Tribble index from disk for file /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf
INFO 22:46:06,994 MicroScheduler - Running the GATK in parallel mode with 5 concurrent threads
sample name provided was masked by bam file header
INFO 22:46:07,738 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 22:46:07,748 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 22:46:07,748 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 22:46:07,756 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 22:46:07,756 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 22:46:07,762 RMDTrackBuilder - Loading Tribble index from disk for file /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf
INFO 22:46:07,766 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 22:46:07,768 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 22:46:07,775 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.01
INFO 22:46:07,900 RMDTrackBuilder - Loading Tribble index from disk for file /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf
INFO 22:46:08,042 RMDTrackBuilder - Loading Tribble index from disk for file /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf
INFO 22:46:08,080 TraversalEngine - [INITIALIZATION COMPLETE; TRAVERSAL STARTING]
INFO 22:46:08,081 TraversalEngine - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 22:46:08,216 RMDTrackBuilder - Loading Tribble index from disk for file /home/uec-00/shared/production/software/bissnp/dbsnp_135.hg19.sort.vcf
INFO 22:46:37,743 TraversalEngine - chr1:2710500 2.64e+06 30.0 s 11.3 s 0.1% 9.5 h 9.5 h
INFO 22:47:07,750 TraversalEngine - chr1:5585622 5.59e+06 60.0 s 10.7 s 0.2% 9.2 h 9.2 h
INFO 22:47:37,755 TraversalEngine - chr1:8247405 8.18e+06 90.0 s 11.0 s 0.3% 9.4 h 9.4 h
INFO 22:48:07,769 TraversalEngine - chr1:10621431 1.06e+07 2.0 m 11.3 s 0.3% 9.7 h 9.7 h
INFO 22:48:39,245 TraversalEngine - chr1:13150138 1.32e+07 2.5 m 11.5 s 0.4% 9.9 h 9.9 h
INFO 22:49:09,249 TraversalEngine - chr1:15975977 1.59e+07 3.0 m 11.4 s 0.5% 9.8 h 9.7 h
INFO 22:49:39,250 TraversalEngine - chr1:18643493 1.86e+07 3.5 m 11.3 s 0.6% 9.8 h 9.7 h
INFO 22:50:09,314 TraversalEngine - chr1:21432550 2.14e+07 4.0 m 11.3 s 0.7% 9.7 h 9.6 h
INFO 22:50:39,319 TraversalEngine - chr1:24130765 2.41e+07 4.5 m 11.3 s 0.8% 9.7 h 9.6 h
INFO 22:51:09,329 TraversalEngine - chr1:26944835 2.69e+07 5.0 m 11.2 s 0.9% 9.6 h 9.5 h
INFO 22:51:39,338 TraversalEngine - chr1:29504333 2.95e+07 5.5 m 11.3 s 1.0% 9.7 h 9.6 h
INFO 22:52:09,344 TraversalEngine - chr1:32328057 3.23e+07 6.0 m 11.2 s 1.0% 9.6 h 9.5 h
INFO 22:52:39,506 TraversalEngine - chr1:34237024 3.42e+07 6.5 m 11.5 s 1.1% 9.8 h 9.7 h
INFO 22:53:09,514 TraversalEngine - chr1:36027788 3.60e+07 7.0 m 11.7 s 1.2% 10.1 h 9.9 h
INFO 22:53:39,519 TraversalEngine - chr1:38001595 3.80e+07 7.5 m 11.9 s 1.2% 10.2 h 10.1 h
INFO 22:54:09,529 TraversalEngine - chr1:39818841 3.98e+07 8.0 m 12.1 s 1.3% 10.4 h 10.3 h
INFO 22:54:39,531 TraversalEngine - chr1:41606632 4.16e+07 8.5 m 12.3 s 1.3% 10.6 h 10.4 h
INFO 22:55:09,534 TraversalEngine - chr1:43542858 4.35e+07 9.0 m 12.5 s 1.4% 10.7 h 10.5 h
INFO 22:55:39,539 TraversalEngine - chr1:45354667 4.53e+07 9.5 m 12.6 s 1.5% 10.8 h 10.7 h
INFO 22:56:09,546 TraversalEngine - chr1:47229869 4.72e+07 10.0 m 12.8 s 1.5% 11.0 h 10.8 h
INFO 22:56:39,551 TraversalEngine - chr1:49870264 4.99e+07 10.5 m 12.7 s 1.6% 10.9 h 10.7 h
INFO 22:57:09,558 TraversalEngine - chr1:52484622 5.25e+07 11.0 m 12.6 s 1.7% 10.8 h 10.7 h
INFO 22:57:39,559 TraversalEngine - chr1:55214423 5.51e+07 11.5 m 12.5 s 1.8% 10.8 h 10.6 h
INFO 22:58:09,565 TraversalEngine - chr1:57816667 5.78e+07 12.0 m 12.5 s 1.9% 10.7 h 10.5 h
INFO 22:58:39,573 TraversalEngine - chr1:60454929 6.04e+07 12.5 m 12.4 s 2.0% 10.7 h 10.5 h
INFO 22:59:09,585 TraversalEngine - chr1:62907695 6.29e+07 13.0 m 12.4 s 2.0% 10.7 h 10.5 h
INFO 22:59:39,592 TraversalEngine - chr1:65542888 6.55e+07 13.5 m 12.4 s 2.1% 10.7 h 10.4 h
INFO 23:00:09,596 TraversalEngine - chr1:68135480 6.81e+07 14.0 m 12.4 s 2.2% 10.6 h 10.4 h
INFO 23:00:39,605 TraversalEngine - chr1:70721697 7.07e+07 14.5 m 12.3 s 2.3% 10.6 h 10.4 h
INFO 23:01:09,612 TraversalEngine - chr1:73358824 7.33e+07 15.0 m 12.3 s 2.4% 10.6 h 10.3 h
INFO 23:01:39,620 TraversalEngine - chr1:75927316 7.59e+07 15.5 m 12.3 s 2.5% 10.6 h 10.3 h
INFO 23:02:09,622 TraversalEngine - chr1:78185855 7.81e+07 16.0 m 12.3 s 2.5% 10.6 h 10.3 h
INFO 23:02:39,633 TraversalEngine - chr1:80810012 8.07e+07 16.5 m 12.3 s 2.6% 10.6 h 10.3 h
INFO 23:03:09,639 TraversalEngine - chr1:83403010 8.34e+07 17.0 m 12.3 s 2.7% 10.5 h 10.3 h
INFO 23:03:39,641 TraversalEngine - chr1:86048398 8.60e+07 17.5 m 12.2 s 2.8% 10.5 h 10.2 h
INFO 23:04:09,649 TraversalEngine - chr1:88633678 8.86e+07 18.0 m 12.2 s 2.9% 10.5 h 10.2 h
INFO 23:04:39,657 TraversalEngine - chr1:91278075 9.12e+07 18.5 m 12.2 s 2.9% 10.5 h 10.2 h
INFO 23:05:09,660 TraversalEngine - chr1:93901066 9.38e+07 19.0 m 12.2 s 3.0% 10.5 h 10.1 h
INFO 23:05:39,668 TraversalEngine - chr1:96466450 9.64e+07 19.5 m 12.2 s 3.1% 10.4 h 10.1 h
INFO 23:06:09,675 TraversalEngine - chr1:98925683 9.89e+07 20.0 m 12.1 s 3.2% 10.4 h 10.1 h
INFO 23:06:39,685 TraversalEngine - chr1:101521303 1.01e+08 20.5 m 12.1 s 3.3% 10.4 h 10.1 h
INFO 23:07:09,687 TraversalEngine - chr1:104369450 1.04e+08 21.0 m 12.1 s 3.4% 10.4 h 10.0 h
INFO 23:07:39,692 TraversalEngine - chr1:107070337 1.07e+08 21.5 m 12.1 s 3.5% 10.4 h 10.0 h
INFO 23:08:09,695 TraversalEngine - chr1:109751064 1.10e+08 22.0 m 12.0 s 3.5% 10.4 h 10.0 h
INFO 23:08:39,702 TraversalEngine - chr1:112503552 1.13e+08 22.5 m 12.0 s 3.6% 10.3 h 10.0 h
INFO 23:09:09,711 TraversalEngine - chr1:114993168 1.15e+08 23.0 m 12.0 s 3.7% 10.3 h 10.0 h
INFO 23:09:39,716 TraversalEngine - chr1:117709031 1.18e+08 23.5 m 12.0 s 3.8% 10.3 h 9.9 h
INFO 23:10:09,723 TraversalEngine - chr1:120322387 1.20e+08 24.0 m 12.0 s 3.9% 10.3 h 9.9 h
----------------------------------------
Begin PBS Epilogue Fri Apr 6 23:10:28 PDT 2012
Job ID: 1383711.hpc-pbs.usc.edu
Username: ramjan
Group: hsc-ar
Job Name: uec_C02APACXX_C02APACXX_5_NIC1254A13_uscec_bissnp663669832097810418.sh
Session: 15789
Limits: mem=15000mb,neednodes=1:ppn=12:hexcore,nodes=1:ppn=12:hexcore,walltime=200:00:00
Resources: cput=02:06:05,mem=8719356kb,vmem=21805572kb,walltime=00:24:26
Queue: laird_exe
End PBS Epilogue Fri Apr 6 23:10:28 PDT 2012
----------------------------------------
Original comment by zack...@gmail.com
on 8 Apr 2012 at 5:37
I did not run into this error in my previous tests. Could you give me the location of the BAM file so that I can test it again myself? Thanks.
Original comment by lyping1...@gmail.com
on 10 Apr 2012 at 12:09
And just a reminder: "-single_sample normal_test" should not be specified in
the latest version. You could use -nt 12 (8g-10g in total), rather than 3g per
thread.
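As a rough sketch of what the adjusted call could look like when assembled by the pipeline, the helper below builds the argument list using only flags that appear in the command above, leaves out "-single_sample", and repeats -I once per BAM. The function name and parameters are hypothetical, and the default heap of 10G is an assumption read from the "8g-10g in total" figure, not a tested setting.

```python
def build_bissnp_command(jar, reference, bams, dbsnp, cpg_vcf, snp_vcf,
                         intervals, threads=12, heap="10G"):
    """Assemble a BisulfiteGenotyper call as an argument list (illustrative only)."""
    cmd = ["java", f"-Xmx{heap}", "-jar", jar,
           "-R", reference, "-T", "BisulfiteGenotyper"]
    for bam in bams:  # one -I per input BAM, e.g. one per lane
        cmd += ["-I", bam]
    # -single_sample is deliberately not added here
    cmd += ["-D", dbsnp, "-vfn1", cpg_vcf, "-vfn2", snp_vcf,
            "-L", intervals, "-nt", str(threads)]
    return cmd
```

The resulting list can be passed to `subprocess.run` without shell quoting; the remaining options from the original call (such as -stand_call_conf or -out_modes) would be appended in the same way.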
Original comment by lyping1...@gmail.com
on 10 Apr 2012 at 4:27
bissnp has been a working part of our bis pipeline for a while now. Closing
this ticket.
Original comment by zack...@gmail.com
on 30 Jul 2012 at 8:30
I am encountering this same error using the latest version of bissnp. It looks
like this bug was fixed in one of the later versions of GATK, but bissnp is
still built upon an older one.
Original comment by zynd...@gmail.com
on 3 Dec 2012 at 8:17
Original comment by zack...@gmail.com
on 3 Dec 2012 at 8:23
"Incorporate Bis-SNP into bisulfite-sequencing pipeline" has been accomplished
for some time; closing.
Original comment by zack...@gmail.com
on 3 Jan 2014 at 10:14
Original issue reported on code.google.com by
benb...@gmail.com
on 12 Sep 2011 at 4:05