Closed Elizabeth-mqz-gmz closed 7 months ago
Hello Elizabeth,
We appreciate your use of OKseqHMM.
It appears that there might be a discrepancy between your data sizes and the corresponding chromosome sizes. To address this, we recommend re-indexing your BAM file and rerunning the program, since you have a warning about the index file being older than the BAM file. If this does not resolve the issue, you may also consider re-aligning your data.
Please let us know if these steps resolve the issue or if you require further assistance.
Hello!
Thank you for your quick response, and I apologize for my delayed reply.
After several attempts, I was able to identify the root cause of the issue. It turns out that I was using the wrong genome version for the chrom.sizes file. I have corrected this now, and the software is running smoothly for the first stage.
Additionally, I followed your suggestion to re-index the BAM file, which resolved the warning message.
To provide context, the data I am working with is from a public source, and I was initially confused by the genome version they were using.
Thank you again for your assistance!
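For anyone hitting the same mismatch, here is a minimal sketch of how the genome version could be checked before running the first stage. It is not part of OKseqHMM: `check_chrom_sizes` is a hypothetical helper, and it assumes the chrom sizes file has two whitespace-separated columns (name, length) with no header. It compares the `@SQ` lines of the BAM header against that file.

```r
# Hypothetical helper (not part of OKseqHMM): compare the @SQ lines of a BAM
# header with a two-column chrom sizes file (name, length; no header line).
check_chrom_sizes <- function(header_lines, sizes_file) {
  # Keep only the sequence dictionary lines, e.g. "@SQ\tSN:chr1\tLN:249250621"
  sq <- grep("^@SQ", header_lines, value = TRUE)
  bam <- data.frame(
    chrom   = sub(".*\tSN:([^\t]+).*", "\\1", sq),
    bam_len = as.numeric(sub(".*\tLN:([0-9]+).*", "\\1", sq)),
    stringsAsFactors = FALSE
  )
  sizes <- read.table(sizes_file, header = FALSE,
                      col.names = c("chrom", "size"),
                      stringsAsFactors = FALSE)
  both <- merge(bam, sizes, by = "chrom")
  list(
    # Chromosomes present in both but with different lengths
    # (typical of a genome version mix-up)
    mismatched = both[both$bam_len != both$size, ],
    # Chromosomes listed in the sizes file but absent from the BAM header
    missing_in_bam = setdiff(sizes$chrom, bam$chrom)
  )
}

# Usage, assuming samtools is on the PATH (file names are placeholders):
# hdr <- system2("samtools", c("view", "-H", "my_sample.bam"), stdout = TRUE)
# check_chrom_sizes(hdr, "hg19.chr.size.txt")
```

A non-empty `mismatched` table suggests the BAM and the chrom sizes file come from different genome builds.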
Hello again! I just executed the second stage as follows:
source('../OKseqOEM.R')
OKseqOEM(bamInF = "../hmm_OK-seq_K562_BR1_fwd.bam", bamInR = "../hmm_OK-seq_K562_BR1_rev.bam", chrsizes = "../hg19.chr.size.txt", fileOut ="hmm_OK-seq_K562_BR1_final", binSize=1000, binList=c(1,10,20,50,100,250,500,1000))
Unfortunately, it is reporting errors for the alternative chromosome reference chr6_ssto_hap7:
[1] "chr6_ssto_hap7"
[1] 4928567
[1] "It's single-end. Calculating 1000bp binsize coverage for forward strand."
[main_samview] region "chr6_ssto_hap7" specifies an invalid region or unknown reference. Continue anyway.
[1] "Calculating 1000bp binsize coverage for reverse strand."
[main_samview] region "chr6_ssto_hap7" specifies an invalid region or unknown reference. Continue anyway.
Error in read.table(fileInF, header = F, comment.char = "", colClasses = c("integer", : no lines available in input
In addition: There were 50 or more warnings (use warnings() to see the first 50)
The regular chromosomes are not presenting any issue, but I would like to ask whether I should take any action on this.
This also happened while testing the data as indicated in the readme file from your templates folder. I suppose this happened because the demo data only provides chromosomes 21 and 22, but I wanted to point it out in case it is important.
[1] "chr1"
[1] 249250621
[1] "It's pair-end. Calculating 1000bp binsize coverage for forward strand."
[1] "Calculating 1000bp binsize coverage for reverse strand."
Error in read.table(fileInF, header = F, comment.char = "", colClasses = c("integer", : no lines available in input
Thanks!
Hi Elizabeth,
Glad you figured it out. For OKseqOEM.R, could you please manually remove the alternative chromosomes (or any chromosome that is unmapped) from "hg19.chr.size.txt"? The script can't handle them automatically for now.
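As a rough sketch of that manual step (not part of OKseqHMM), the filtering could be scripted in R. The helper name `filter_chrom_sizes` and the output file name are hypothetical, and the sketch assumes a two-column, headerless chrom sizes file. It keeps only chr1 to chr22, chrX and chrY; adjust the pattern if you also need chrM or use a different naming scheme.

```r
# Hypothetical helper (not part of OKseqHMM): write a copy of a chrom sizes
# file that contains only the primary chromosomes.
filter_chrom_sizes <- function(sizes_in, sizes_out) {
  # Read both columns as text so the lengths are written back unchanged
  sizes <- read.table(sizes_in, header = FALSE,
                      col.names = c("chrom", "size"),
                      colClasses = c("character", "character"))
  # Assumption: primary chromosomes are named chr1..chr22, chrX, chrY
  keep <- grepl("^chr([0-9]+|X|Y)$", sizes$chrom)
  write.table(sizes[keep, ], sizes_out, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  # Return the dropped names invisibly so they can be inspected
  invisible(sizes$chrom[!keep])
}

# Usage (file names are placeholders):
# dropped <- filter_chrom_sizes("hg19.chr.size.txt", "hg19.chr.size.primary.txt")
# print(dropped)
```

The filtered file can then be passed as `chrsizes` to `OKseqOEM`.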
Best
Hello!
This worked perfectly, thank you very much! :)
Best wishes
Hello! I am running the program on single-end OK-seq data with hg38. I obtained the hg38 chromosome sizes reference from UCSC, and I am using the default parameters as listed in the running example.
Unfortunately, I am getting an error from the 'hist.default' function, and it comes up repeatedly. I would greatly appreciate any advice you could give me; I am probably setting one of the parameters incorrectly.
Thanks in advance!