Closed penuts7644 closed 4 years ago
Hi Wout, This combination of three edges in the graph allows the model to learn the joint probability P(V,D,J). Because IGoR treats all alleles as different genes, this joint probability is useful for capturing the fact that some alleles of the same gene might not be able to recombine together, as they lie on different chromosomes. In a nutshell, this is needed to fine-tune a model to an individual of interest.
The version used by OLGA learns the factorized form P(V)P(D,J) of the gene usage probability. It therefore misses all information about the chromosomal organisation/partition of the alleles. Learning at least the joint probability P(D,J) is essential to capture the biological impossibility of recombining D2 with a J1 gene. As I am not responsible for OLGA support/development, I cannot tell why they chose to keep only this conditional dependence; it might be due to some algorithmic issue...
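To make the difference between the factorizations concrete, here is a minimal toy sketch in plain NumPy. It does not use the IGoR or OLGA code or file formats, and everything in it is an assumption for illustration: the gene names (`V_a`, `J1_b`, ...), the two hypothetical chromosomes `A` and `B`, and the uniform probabilities. It builds a joint P(V,D,J) in which alleles only recombine within a chromosome and D2 never pairs with a J1 gene, then measures how much probability each factorization puts on combinations that are impossible under the joint.

```python
import numpy as np

# Hypothetical toy genes: one V allele and two J alleles per chromosome.
V = ["V_a", "V_b"]
D = ["D1", "D2"]
J = ["J1_a", "J2_a", "J1_b", "J2_b"]

# Assumed chromosome assignment of each allele (illustration only).
chromosome_of_v = {"V_a": "A", "V_b": "B"}
chromosome_of_j = {"J1_a": "A", "J2_a": "A", "J1_b": "B", "J2_b": "B"}


def allowed(v, d, j):
    """True if the toy rules permit this (V, D, J) combination."""
    same_chromosome = chromosome_of_v[v] == chromosome_of_j[j]
    d2_with_j1 = d == "D2" and j.startswith("J1")
    return same_chromosome and not d2_with_j1


# Joint P(V,D,J): uniform over the allowed combinations, zero elsewhere.
joint = np.array(
    [[[float(allowed(v, d, j)) for j in J] for d in D] for v in V]
)
joint /= joint.sum()

# Marginals needed by the two factorizations.
p_v = joint.sum(axis=(1, 2))   # P(V)
p_dj = joint.sum(axis=0)       # P(D,J)
p_d = joint.sum(axis=(0, 2))   # P(D)
p_j = joint.sum(axis=(0, 1))   # P(J)

# P(V)P(D,J): keeps the D-J dependence, drops the V dependence.
factorized = p_v[:, None, None] * p_dj[None, :, :]

# P(V)P(D)P(J): no dependence between gene choices at all.
independent = p_v[:, None, None] * p_d[None, :, None] * p_j[None, None, :]

# Probability mass assigned to combinations the joint forbids.
impossible = joint == 0
leak_factorized = factorized[impossible].sum()
leak_independent = independent[impossible].sum()

print("P(D2, J1_a) under P(D,J):   ", p_dj[1, 0])
print("P(D2)P(J1_a) if independent:", p_d[1] * p_j[0])
print("mass on impossible combos, P(V)P(D,J):  ", leak_factorized)
print("mass on impossible combos, P(V)P(D)P(J):", leak_independent)
```

In this toy setup P(V)P(D,J) still assigns zero probability to D2 with J1, but it leaks probability onto V and J alleles from different chromosomes; only the full joint avoids both.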
Hi Quentin,
I was recently looking into some model comparisons and noticed that the default IGoR human TCRB model has edges between the V-gene choice and the D-gene and J-gene choices, as well as between the J-gene choice and the D-gene choice:
However, when I compare this to the human TCRB model that OLGA supplies by default, or to the ones I'm constructing locally, these only have the edge between the J-gene choice and the D-gene choice:
Do you have any idea why this is and how it is possible to make a model with the additional gene choice edges?
Cheers, Wout