Open El-Castor opened 4 years ago
Hi,
Can you help me? I have test with the data test and it doesn't work...
Hi, thanks for the extensive debug log. The seqlevels seem to be different between the objects "bismarck.data" and "cytosine.positions". Can you also supply printed output for the "cytosine.positions" object? How are you generating this? Best, Aaron
Hi, I'm using MethylStar pipeline that used bismark to mapped the reads and produce ce cs_report.txt file too. I have check the length of the seqnames of the two R object and you right they are differents. but i dontunderstand why.
> length(bismark.data@seqnames)
[1] 121817270
> length(cytosine.positions@seqnames)
[1] 121817216
I have no warning when I read the cs_report.txt file.
here you have a print of the cs_report.txt from bismark:
CMiso1.1chr03 1 + 0 0 CHH CCC
CMiso1.1chr03 2 + 0 0 CHH CCA
CMiso1.1chr03 3 + 0 0 CHH CAA
CMiso1.1chr03 7 + 0 0 CHH CCC
CMiso1.1chr03 8 + 0 0 CHH CCT
CMiso1.1chr03 9 + 0 0 CHH CTA
CMiso1.1chr03 14 + 0 0 CHH CCC
CMiso1.1chr03 15 + 0 0 CHG CCG
CMiso1.1chr03 16 + 0 0 CG CGA
CMiso1.1chr03 17 - 0 0 CG CGG
CMiso1.1chr03 21 + 0 0 CHH CCC
CMiso1.1chr03 22 + 0 0 CHH CCT
CMiso1.1chr03 23 + 0 0 CHH CTA
CMiso1.1chr03 28 + 0 0 CHH CCC
CMiso1.1chr03 29 + 0 0 CHG CCG
CMiso1.1chr03 30 + 0 0 CG CGA
CMiso1.1chr03 31 - 0 0 CG CGG
CMiso1.1chr03 35 + 0 0 CHH CCC
CMiso1.1chr03 36 + 0 0 CHH CCT
CMiso1.1chr03 37 + 0 0 CHH CTA
CMiso1.1chr03 42 + 0 0 CHH CCC
CMiso1.1chr03 43 + 0 0 CHH CCC
CMiso1.1chr03 44 + 0 0 CHH CCA
CMiso1.1chr03 45 + 0 0 CHH CAA
CMiso1.1chr03 49 + 0 0 CHH CCC
CMiso1.1chr03 50 + 0 0 CHH CCT
CMiso1.1chr03 51 + 0 0 CHH CTA
CMiso1.1chr03 56 + 0 0 CHH CCC
CMiso1.1chr03 57 + 0 0 CHH CCC
CMiso1.1chr03 58 + 0 0 CHH CCA
CMiso1.1chr03 59 + 0 0 CHH CAA
CMiso1.1chr03 63 + 0 0 CHH CCC
CMiso1.1chr03 64 + 0 0 CHH CCT
CMiso1.1chr03 65 + 0 0 CHH CTA
CMiso1.1chr03 70 + 0 0 CG CGC
CMiso1.1chr03 71 - 0 0 CG CGT
CMiso1.1chr03 72 + 0 0 CHH CTA
CMiso1.1chr03 77 + 0 1 CHH CCC
CMiso1.1chr03 78 + 1 0 CHG CCG
CMiso1.1chr03 79 + 1 0 CG CGA
CMiso1.1chr03 80 - 0 0 CG CGG
CMiso1.1chr03 83 + 0 1 CHH CCC
CMiso1.1chr03 84 + 0 1 CHH CCC
CMiso1.1chr03 85 + 1 0 CHG CCG
CMiso1.1chr03 86 + 1 0 CG CGA
CMiso1.1chr03 87 - 0 0 CG CGG
CMiso1.1chr03 90 + 0 1 CHH CCC
CMiso1.1chr03 91 + 0 1 CHH CCC
CMiso1.1chr03 92 + 1 0 CHG CCG
CMiso1.1chr03 93 + 1 0 CG CGA
CMiso1.1chr03 94 - 0 0 CG CGG
CMiso1.1chr03 97 + 0 1 CHH CCC
CMiso1.1chr03 98 + 0 1 CHH CCT
CMiso1.1chr03 99 + 1 0 CHH CTA
CMiso1.1chr03 104 + 0 1 CHH CCC
CMiso1.1chr03 105 + 0 1 CHH CCT
CMiso1.1chr03 106 + 1 0 CHH CTA
CMiso1.1chr03 111 + 0 1 CHH CCC
CMiso1.1chr03 112 + 1 0 CHG CCG
CMiso1.1chr03 113 + 1 0 CG CGA
CMiso1.1chr03 114 - 0 0 CG CGG
CMiso1.1chr03 118 + 0 1 CHH CCC
CMiso1.1chr03 119 + 0 1 CHH CCT
CMiso1.1chr03 120 + 1 0 CHH CTA
CMiso1.1chr03 125 + 0 1 CHH CCC
CMiso1.1chr03 126 + 1 0 CHG CCG
CMiso1.1chr03 127 + 1 0 CG CGA
CMiso1.1chr03 128 - 0 0 CG CGG
CMiso1.1chr03 132 + 0 0 CHH CCC
CMiso1.1chr03 133 + 0 0 CHG CCG
CMiso1.1chr03 134 + 0 0 CG CGA
CMiso1.1chr03 135 - 0 0 CG CGG
CMiso1.1chr03 138 + 0 0 CHH CCC
CMiso1.1chr03 139 + 0 0 CHH CCC
CMiso1.1chr03 140 + 0 0 CHG CCG
CMiso1.1chr03 141 + 0 0 CG CGA
CMiso1.1chr03 142 - 0 0 CG CGG
CMiso1.1chr03 145 + 0 0 CHH CCC
CMiso1.1chr03 146 + 0 0 CHG CCG
CMiso1.1chr03 147 + 0 0 CG CGA
CMiso1.1chr03 148 - 0 0 CG CGG
CMiso1.1chr03 151 + 0 0 CHH CCC
CMiso1.1chr03 152 + 0 0 CHH CCC
CMiso1.1chr03 153 + 0 0 CHG CCG
CMiso1.1chr03 154 + 0 0 CG CGA
CMiso1.1chr03 155 - 0 0 CG CGG
CMiso1.1chr03 158 + 0 0 CHH CCC
CMiso1.1chr03 159 + 0 0 CHG CCG
CMiso1.1chr03 160 + 0 0 CG CGG
CMiso1.1chr03 161 - 0 0 CG CGG
CMiso1.1chr03 162 - 0 0 CHG CCG
CMiso1.1chr03 164 + 0 0 CHH CCC
CMiso1.1chr03 165 + 0 0 CHH CCC
CMiso1.1chr03 166 + 0 0 CHG CCG
CMiso1.1chr03 167 + 0 0 CG CGA
CMiso1.1chr03 168 - 0 0 CG CGG
CMiso1.1chr03 171 + 0 0 CHH CCC
CMiso1.1chr03 172 + 0 0 CHH CCC
CMiso1.1chr03 173 + 0 0 CHG CCG
CMiso1.1chr03 174 + 0 0 CG CGA
$ tail -100 Mock_FDLM202341331-1a.CX_report.txt
CMiso1.1chr12 26619936 + 2 0 CG CGG
CMiso1.1chr12 26619937 - 1 0 CG CGA
CMiso1.1chr12 26619938 - 1 0 CHG CCG
CMiso1.1chr12 26619939 - 0 1 CHH CCC
CMiso1.1chr12 26619943 + 2 0 CG CGG
CMiso1.1chr12 26619944 - 1 0 CG CGA
CMiso1.1chr12 26619945 - 1 0 CHG CCG
CMiso1.1chr12 26619946 - 0 1 CHH CCC
CMiso1.1chr12 26619950 + 2 0 CG CGG
CMiso1.1chr12 26619951 - 1 0 CG CGA
CMiso1.1chr12 26619952 - 1 0 CHG CCG
CMiso1.1chr12 26619953 - 0 1 CHH CCC
CMiso1.1chr12 26619957 + 2 0 CG CGG
CMiso1.1chr12 26619958 - 1 0 CG CGA
CMiso1.1chr12 26619959 - 1 0 CHG CCG
CMiso1.1chr12 26619960 - 0 1 CHH CCC
CMiso1.1chr12 26619964 + 2 0 CG CGG
CMiso1.1chr12 26619965 - 1 0 CG CGA
CMiso1.1chr12 26619966 - 1 0 CHG CCG
CMiso1.1chr12 26619967 - 0 1 CHH CCC
CMiso1.1chr12 26619971 + 2 0 CG CGG
CMiso1.1chr12 26619972 - 1 0 CG CGA
CMiso1.1chr12 26619973 - 1 0 CHG CCG
CMiso1.1chr12 26619974 - 0 1 CHH CCC
CMiso1.1chr12 26619978 + 2 0 CG CGG
CMiso1.1chr12 26619979 - 1 0 CG CGA
CMiso1.1chr12 26619980 - 1 0 CHG CCG
CMiso1.1chr12 26619981 - 0 1 CHH CCC
CMiso1.1chr12 26619985 + 1 1 CG CGG
CMiso1.1chr12 26619986 - 1 0 CG CGA
CMiso1.1chr12 26619987 - 1 0 CHG CCG
CMiso1.1chr12 26619988 - 0 1 CHH CCC
CMiso1.1chr12 26619992 + 1 1 CG CGG
CMiso1.1chr12 26619993 - 1 0 CG CGA
CMiso1.1chr12 26619994 - 1 0 CHG CCG
CMiso1.1chr12 26619995 - 0 1 CHH CCC
CMiso1.1chr12 26619999 + 1 1 CG CGG
CMiso1.1chr12 26620000 - 1 0 CG CGA
CMiso1.1chr12 26620001 - 1 0 CHG CCG
CMiso1.1chr12 26620002 - 0 1 CHH CCC
CMiso1.1chr12 26620006 + 2 0 CG CGG
CMiso1.1chr12 26620007 - 1 0 CG CGA
CMiso1.1chr12 26620008 - 1 0 CHG CCG
CMiso1.1chr12 26620009 - 0 1 CHH CCC
CMiso1.1chr12 26620013 + 2 0 CG CGG
CMiso1.1chr12 26620014 - 1 0 CG CGA
CMiso1.1chr12 26620015 - 1 0 CHG CCG
CMiso1.1chr12 26620016 - 0 1 CHH CCC
CMiso1.1chr12 26620020 + 2 0 CG CGG
CMiso1.1chr12 26620021 - 1 0 CG CGA
CMiso1.1chr12 26620022 - 1 0 CHG CCG
CMiso1.1chr12 26620023 - 0 1 CHH CCC
CMiso1.1chr12 26620027 + 2 0 CG CGG
CMiso1.1chr12 26620028 - 1 0 CG CGA
CMiso1.1chr12 26620029 - 1 0 CHG CCG
CMiso1.1chr12 26620030 - 0 1 CHH CCC
CMiso1.1chr12 26620034 + 2 0 CG CGG
CMiso1.1chr12 26620035 - 1 0 CG CGA
CMiso1.1chr12 26620036 - 1 0 CHG CCG
CMiso1.1chr12 26620037 - 0 1 CHH CCC
CMiso1.1chr12 26620041 + 2 0 CG CGG
CMiso1.1chr12 26620042 - 1 0 CG CGA
CMiso1.1chr12 26620043 - 1 0 CHG CCG
CMiso1.1chr12 26620044 - 0 1 CHH CCC
CMiso1.1chr12 26620048 + 2 0 CG CGG
CMiso1.1chr12 26620049 - 1 0 CG CGA
CMiso1.1chr12 26620050 - 1 0 CHG CCG
CMiso1.1chr12 26620051 - 0 1 CHH CCC
CMiso1.1chr12 26620055 + 2 0 CG CGG
CMiso1.1chr12 26620056 - 1 0 CG CGA
CMiso1.1chr12 26620057 - 1 0 CHG CCG
CMiso1.1chr12 26620058 - 0 1 CHH CCC
CMiso1.1chr12 26620062 + 2 0 CG CGG
CMiso1.1chr12 26620063 - 1 0 CG CGA
CMiso1.1chr12 26620064 - 0 1 CHG CCG
CMiso1.1chr12 26620065 - 0 1 CHH CCC
CMiso1.1chr12 26620069 + 2 0 CG CGG
CMiso1.1chr12 26620070 - 0 0 CG CGA
CMiso1.1chr12 26620071 - 0 0 CHG CCG
CMiso1.1chr12 26620072 - 0 0 CHH CCC
CMiso1.1chr12 26620077 + 2 0 CG CGG
CMiso1.1chr12 26620078 - 0 0 CG CGA
CMiso1.1chr12 26620079 - 0 0 CHG CCG
CMiso1.1chr12 26620080 - 0 0 CHH CCC
CMiso1.1chr12 26620084 + 0 0 CG CGG
CMiso1.1chr12 26620085 - 0 0 CG CGA
CMiso1.1chr12 26620086 - 0 0 CHG CCG
CMiso1.1chr12 26620087 - 0 0 CHH CCC
CMiso1.1chr12 26620091 + 0 0 CG CGG
CMiso1.1chr12 26620092 - 0 0 CG CGA
CMiso1.1chr12 26620093 - 0 0 CHG CCG
CMiso1.1chr12 26620094 - 0 0 CHH CCC
CMiso1.1chr12 26620098 + 0 0 CG CGG
CMiso1.1chr12 26620099 - 0 0 CG CGA
CMiso1.1chr12 26620100 - 0 0 CHG CCG
CMiso1.1chr12 26620101 - 0 0 CHH CCC
CMiso1.1chr12 26620105 + 0 0 CG CGG
CMiso1.1chr12 26620106 - 0 0 CG CGA
CMiso1.1chr12 26620107 - 0 0 CHG CCG
CMiso1.1chr12 26620108 - 0 0 CHH CCC
tell me if you want more information,
Thanks in advance
Can you print the bismarck.data@seqinfo and cytosine.positions@seqinfo? You need to find out where the difference comes from. Both objects should contain the same chromosomes with the same lengths. Cheers, Aaron
Hi @ataudt,
thanks to have a look on my data, below you have the print of the seqInfo. bismarck.data@seqinfo :
> bismark.data@seqinfo
Seqinfo object with 13 sequences from an unspecified genome:
seqnames seqlengths isCircular genome
CMiso1.1chr03 29901275 <NA> <NA>
CMiso1.1chr10 24822432 <NA> <NA>
CMiso1.1chr05 29172381 <NA> <NA>
CMiso1.1chr06 36200319 <NA> <NA>
CMiso1.1chr04 36545425 <NA> <NA>
... ... ... ...
CMiso1.1chr09 23958596 <NA> <NA>
CMiso1.1chr02 24717502 <NA> <NA>
CMiso1.1chr11 33524158 <NA> <NA>
CMiso1.1chr00 3219671 <NA> <NA>
CMiso1.1chr12 26620111 <NA> <NA>
cytosine.positions@seqinfo
> cytosine.positions@seqinfo
Seqinfo object with 13 sequences from an unspecified genome:
seqnames seqlengths isCircular genome
CMiso1.1chr00 len=3219671 3219671 <NA> <NA>
CMiso1.1chr01 len=35819275 35819275 <NA> <NA>
CMiso1.1chr02 len=24717502 24717502 <NA> <NA>
CMiso1.1chr03 len=29901275 29901275 <NA> <NA>
CMiso1.1chr04 len=36545425 36545425 <NA> <NA>
... ... ... ...
CMiso1.1chr08 len=33813569 33813569 <NA> <NA>
CMiso1.1chr09 len=23958596 23958596 <NA> <NA>
CMiso1.1chr10 len=24822432 24822432 <NA> <NA>
CMiso1.1chr11 len=33524158 33524158 <NA> <NA>
CMiso1.1chr12 len=26620111 26620111 <NA> <NA>
>
If I compare the length of the to seqInfo vector of two R object :
> length(bismark.data@seqinfo)
[1] 13
length(cytosine.positions@seqinfo)
[1] 13
Then I have check the chrom length one by when of the two obejct and I don't find any differences, as you can see below:
note that the only differences is the sorting of the seqnames comparing the two R GRanges objects.
seqlengths isCircular genome
CMiso1.1chr03 29901275 NA <NA>
CMiso1.1chr10 24822432 NA <NA>
CMiso1.1chr05 29172381 NA <NA>
CMiso1.1chr06 36200319 NA <NA>
CMiso1.1chr04 36545425 NA <NA>
CMiso1.1chr01 35819275 NA <NA>
CMiso1.1chr07 26767028 NA <NA>
CMiso1.1chr08 33813569 NA <NA>
CMiso1.1chr09 23958596 NA <NA>
CMiso1.1chr02 24717502 NA <NA>
CMiso1.1chr11 33524158 NA <NA>
CMiso1.1chr00 3219671 NA <NA>
CMiso1.1chr12 26620111 NA <NA>
> t2
seqlengths isCircular genome
CMiso1.1chr00 len=3219671 3219671 NA <NA>
CMiso1.1chr01 len=35819275 35819275 NA <NA>
CMiso1.1chr02 len=24717502 24717502 NA <NA>
CMiso1.1chr03 len=29901275 29901275 NA <NA>
CMiso1.1chr04 len=36545425 36545425 NA <NA>
CMiso1.1chr05 len=29172381 29172381 NA <NA>
CMiso1.1chr06 len=36200319 36200319 NA <NA>
CMiso1.1chr07 len=26767028 26767028 NA <NA>
CMiso1.1chr08 len=33813569 33813569 NA <NA>
CMiso1.1chr09 len=23958596 23958596 NA <NA>
CMiso1.1chr10 len=24822432 24822432 NA <NA>
CMiso1.1chr11 len=33524158 33524158 NA <NA>
CMiso1.1chr12 len=26620111 26620111 NA <NA>
Do you have any suggestions ?
best,
Hi, Do you have any suggestion please?
I have change the seqnames of cytosine.positions and I still can do the analysis :
> cytosine.positions
GRanges object with 121817216 ranges and 1 metadata column:
seqnames ranges strand | context
<Rle> <IRanges> <Rle> | <factor>
[1] CMiso1.1chr00 3 + | CHH
[2] CMiso1.1chr00 6 - | CHH
[3] CMiso1.1chr00 10 - | CHH
[4] CMiso1.1chr00 17 - | CHH
[5] CMiso1.1chr00 18 - | CHH
... ... ... ... . ...
[121817212] CMiso1.1chr12 26620101 - | CHH
[121817213] CMiso1.1chr12 26620105 + | CG
[121817214] CMiso1.1chr12 26620106 - | CG
[121817215] CMiso1.1chr12 26620107 - | CHG
[121817216] CMiso1.1chr12 26620108 - | CHH
-------
seqinfo: 13 sequences from an unspecified genome
> head(methylome)
GRanges object with 6 ranges and 2 metadata columns:
seqnames ranges strand | context counts
<Rle> <IRanges> <Rle> | <factor> <matrix>
[1] CMiso1.1chr00 3 + | CHH 0:0
[2] CMiso1.1chr00 6 - | CHH 0:0
[3] CMiso1.1chr00 10 - | CHH 0:0
[4] CMiso1.1chr00 17 - | CHH 0:0
[5] CMiso1.1chr00 18 - | CHH 0:0
[6] CMiso1.1chr00 21 + | CHG 0:0
-------
seqinfo: 13 sequences from an unspecified genome
the error building the model:
> if (intermediate==TRUE){
+ model <- callMethylation(data = methylome, transDist = fit$transDist, include.intermediate=intermediate , update=intermediate_mode)
+ }else{
+ model <- callMethylation(data = methylome, transDist = fit$transDist, include.intermediate=intermediate)
+ }
Adding distance ... 5.09s
Adding transition context ... 37.76s
Baum-Welch: Fitting HMM parameters
Iteration log(P) dlog(P) Time in sec
0 -inf - 0
1 0.000000 inf 43
HMM: Error in Baum-Welch: nan detected
Time spent in Baum-Welch: 77.49s
Compiling results ... 0s
Warning message:
In callMethylation(data = methylome, transDist = fit$transDist, :
Baum-Welch aborted: nan detected
Could you send me the "methylome" and "fit" objects as RData file? I will need to reproduce the bug in order to know where it originates.
Hi,
first of all, thank you for this work it is very usefull.
I want to analyse WGBS data with your package. I have create cx_report et all necessary files. But when I go further in the analysis I cannot have result at the end.
In fact after use importBismark() function I have no warning and the output is like this :
below you have the second necessary files (chromosome size):
At this part of the code I don't see issue, but then when I try to launch the inflateMethylome() function I have some warning ( I don't know if they cause trooble).
I don't understand this warning because my seq level are the same, I mean I have the same chromosome pattern and the same number of chromosome as you can see below :
So when I try to use inflateMethylome() function I have in output a Granges with the columns count at 0:0 that it is strange if I have understand your code. See below the Grange:
This resulting in troobleshooting in the estimateTransDist after distance correlation computing:
Finally when I whant to call the methylation I obtain in result a warning message ( as you can see below) with nan-detected:
Due to this multiple warning I cannot have results because the model has an empty counts columns empty:
Do you have any idea of what I'm doing wrong please ? Thanks in advance