Open Sepideh89 opened 1 month ago
Substitutions are reported simply with respect to the reference sequence, and do not take into consideration coding sequences that could be present on either DNA strand. For this reason it was felt relevant to sum across paired bases. In hindsight this has raised many questions such as yours, and it is sufficiently non-standard that we should change the graphic to be a simple 4 x 4 matrix.
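To illustrate what summing across paired bases means, here is a minimal Python sketch. It is not the workflow's actual code: the names `fold` and `symmetrise` and the example counts are made up for illustration, and it assumes the pairing is by reverse complement (so a G>A change on the reference strand is counted together with C>T).

```python
from collections import Counter

# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def fold(ref, alt):
    """Map a substitution onto its strand-equivalent form with ref in {A, C}.

    Hypothetical helper: a change from G or T is replaced by the
    complementary change as read on the opposite strand.
    """
    if ref in ("G", "T"):
        return COMPLEMENT[ref], COMPLEMENT[alt]
    return ref, alt


def symmetrise(counts):
    """Sum raw (ref, alt) counts across complementary pairs."""
    folded = Counter()
    for (ref, alt), n in counts.items():
        folded[fold(ref, alt)] += n
    return folded


# Made-up counts relative to the reference strand.
raw = {
    ("C", "T"): 10,
    ("G", "A"): 7,   # complementary to C>T
    ("A", "G"): 4,
    ("T", "C"): 5,   # complementary to A>G
    ("T", "A"): 2,   # complementary to A>T
}

paired = symmetrise(raw)
for (ref, alt), n in sorted(paired.items()):
    print(f"{ref}>{alt}: {n}")
```

Under this assumption the twelve possible substitutions collapse into six classes, which is why only rows for A and C appear in the graphic; changes from G or T are still counted, but within their paired class.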
Ask away!
I ran the wf-human-variation workflow for SNP calling. In the SNP report HTML file, the Substitution types section only reports changes from A or C to A, C, G, or T. It seems odd that there are no changes from G or T reported. The small print says that the data was ‘symmetrised by pairing’. Is that the reason?