epifluidlab / FinaleMe

MIT License

Exception in thread "main" java.lang.Exception: Sequence [11] was not found in 2bit file #2

liujilei156231 closed this issue 6 months ago

liujilei156231 commented 7 months ago

Hello, when running the first step of the demo, the following error occurred:

(base) liu@DESKTOP-7Q0S5N0:~/FinaleMe-main$ java -Xmx20G -cp "target/FinaleMe-0.58-jar-with-dependencies.jar:lib/gatk-package-distribution-3.3.jar:lib/sis-jhdf5-batteries_included.jar:lib/java-genomics-io.jar:lib/igv.jar" org.cchmc.epifluidlab.finaleme.utils.CpgMultiMetricsStats ./data/hg19.2bit ./data/CG_motif.hg19.common_chr.pos_only.bedgraph ./data/CG_motif.hg19.common_chr.pos_only.bedgraph ./data/bam/BH01.chr22.bam  CpgMultiMetricsStats.hg19.details.bed.gz -stringentPaired -excludeRegions ./data/wgEncodeDukeMapabilityRegionsExcludable_wgEncodeDacMapabilityConsensusExcludable.hg19.bed -valueWigs methyPrior:0:./data/wgbs_buffyCoat_jensen2015GB.methy.hg19.bw -wgsMode
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/liu/FinaleMe-main/target/FinaleMe-0.58-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/liu/FinaleMe-main/lib/gatk-package-distribution-3.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
INFO [2024-02-01 16:57:54,646]  [CpgMultiMetricsStats.java:192] [main]  Processing interval file ...
INFO [2024-02-01 16:57:54,678]  [CpgMultiMetricsStats.java:199] [main]  Excluding intervals ...
INFO [2024-02-01 16:57:54,698]  [CpgMultiMetricsStats.java:407] [main]  Loading value interval big wig file ...
INFO [2024-02-01 16:57:54,770]  [CpgMultiMetricsStats.java:437] [main]  Automate generate all k-mer until length 0
INFO [2024-02-01 16:57:54,770]  [CpgMultiMetricsStats.java:482] [main]  Loading CpG interval file ...
INFO [2024-02-01 16:57:54,845]  [CpgMultiMetricsStats.java:536] [main]  Get total reads number used for scaling from bam file...
INFO [2024-02-01 16:58:23,118]  [CpgMultiMetricsStats.java:554] [main]  27006990 reads in total ...
INFO [2024-02-01 16:58:23,118]  [CpgMultiMetricsStats.java:556] [main]  Output value for each CpG in each DNA fragment ...
Exception in thread "main" java.lang.Exception: Sequence [11] was not found in 2bit file
        at org.biojava.nbio.genome.parsers.twobit.TwoBitParser.setCurrentSequence(TwoBitParser.java:131)
        at org.cchmc.epifluidlab.finaleme.utils.CpgMultiMetricsStats.doMain(CpgMultiMetricsStats.java:580)
        at org.cchmc.epifluidlab.finaleme.utils.CpgMultiMetricsStats.main(CpgMultiMetricsStats.java:144)

Following the tutorial, I downloaded wgbs_buffyCoat_jensen2015GB.methy.hg19.bw, wgEncodeDukeMapabilityRegionsExcludable_wgEncodeDacMapabilityConsensusExcludable.hg19.bed, hg19.2bit, hg19.chrom.size, and hg19.cpgIslandExt.txt.gz from the UCSC database and Zenodo. I then made CG_motif.hg19.common_chr.pos_only.bedgraph with the following R code:

x = read.table("./hg19.cpgIslandExt.txt.gz",sep="\t")
x = x[x$V2 %in% paste0("chr",1:22),]
x$V2 = gsub("chr","",x$V2)
write.table(x[,c(2,3,4)],"cpg_coordinates.bed",row.names=F,col.names=F,quote=F,sep="\t")

system("bedtools makewindows -b cpg_coordinates.bed -w 1 > CG_motif.hg19.common_chr.bed")
system("bedtools genomecov -i CG_motif.hg19.common_chr.bed -g hg19.chrom.size -bg > CG_motif.hg19.common_chr.pos_only.bedgraph")

Is there anything wrong with my process? Could you please provide a CG_motif.bedgraph for the first step in your experiment? Thanks!

dnaase commented 7 months ago

Please keep the "chr" prefix in your CG_motif.hg19.common_chr.pos_only.bedgraph file.

Yaping
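
For readers hitting the same error: the exception names sequence [11], which is what a chromosome looks like after the "chr" prefix has been stripped, while the UCSC hg19.2bit file names its sequences chr1, chr2, and so on. The following is only a minimal sketch of one way to put the prefix back on an already generated bedgraph. It is not part of FinaleMe; the function name `restore_chr_prefix` is hypothetical, and it assumes a tab-separated file whose first column is the chromosome name.

```python
def restore_chr_prefix(lines):
    """Yield bedgraph lines whose first column carries the "chr" prefix.

    Hypothetical helper, not part of FinaleMe. Assumes tab-separated input
    with the chromosome name in the first column.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        # Leave blank lines and track/browser/comment headers untouched.
        if not stripped or stripped.startswith(("track", "browser", "#")):
            yield stripped
            continue
        fields = stripped.split("\t")
        # Only add the prefix when it is missing, so the fix is idempotent.
        if not fields[0].startswith("chr"):
            fields[0] = "chr" + fields[0]
        yield "\t".join(fields)


# Small in-memory example: one line without the prefix, one already correct.
example = ["11\t100\t101\t1\n", "chr22\t200\t201\t1\n"]
fixed = list(restore_chr_prefix(example))
for row in fixed:
    print(row)
```

To apply this to a real file, the same generator can be fed an open file handle and its output written line by line to a new bedgraph. Regenerating the file without the `gsub("chr","",x$V2)` step in the R code above should avoid the problem at its source.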

liujilei156231 commented 7 months ago

Firstly, Happy New Year (Xin nian kuai le) to Prof. Liu. 😃 😃 🎁 Thanks for your advice; I will try it later.