Closed kewei2019 closed 1 year ago
Here is a brief description of each column in this output.
gt_mod_idx: 0 is canonical. Other indices follow from the labels called by the model. For a 6mA-only model the only other value would be 1, indicating a ground truth modified position.
Hi, more quick questions:
For the mod_probs, each value actually contains two floats, like "0.341796875,0.658203125" and "0.990234375,0.009765625". Which float represents the probability at a given position?
For the gt_mod_idx, you said "0 is canonical, 1 is ground truth modified position", but I found a lot of positions that are "none". What does "none" mean?
As noted in my previous answer ("mod_probs: Probability of each label output by the model at this position"), these represent the probabilities of each output label. The labels are noted in the log file. For example, with canonical cytosine (C), 5hmC (h) and 5mC (m) you might see the labels as Chm. This indicates the order of the probabilities produced by the model.
None indicates that there is a modified base call, but no ground truth label for that position.
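To make the label ordering concrete, here is a minimal Python sketch that pairs a mod_probs value with a label string. The helper names (label_probs, top_label) are made up for illustration and are not part of Remora, and the label string "Aa" (canonical A followed by 6mA) is only an assumption; please check the log file for the labels your model actually reports.

```python
from typing import Dict


def label_probs(mod_probs: str, labels: str) -> Dict[str, float]:
    """Pair each comma-separated probability with its label, in order.

    `labels` is the label string from the log file, one character per
    label (e.g. "Chm"). Hypothetical helper, not a Remora function.
    """
    probs = [float(p) for p in mod_probs.split(",")]
    if len(probs) != len(labels):
        raise ValueError(
            f"expected {len(labels)} probabilities, got {len(probs)}"
        )
    return dict(zip(labels, probs))


def top_label(mod_probs: str, labels: str) -> str:
    """Return the label with the highest probability."""
    by_label = label_probs(mod_probs, labels)
    return max(by_label, key=by_label.get)


# Assumed label order "Aa": first float is canonical A, second is 6mA.
print(label_probs("0.341796875,0.658203125", "Aa"))
print(top_label("0.990234375,0.009765625", "Aa"))
```

Under that assumed ordering, the first float would be the canonical probability and the second the 6mA probability. Taking the highest-probability label is just one simple way to call a site; you may prefer a stricter threshold.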
Hi, I have successfully used Remora to analyze my trained bacteria genome for m6A in GATC sites. However, I am unsure about the meaning of each column in the output and how to determine whether a site is methylated or not.
And correct me if wrong:
query_name: reads' name
ref_name: chromosome name
ref_pos: position on chromosome
Thanks very much! Kewei