In the create_conditional_likelihood_of_base_dict function, it seems that only a subset of all possible genotype combinations is being defined with conditional likelihood functions. Given that we have two bases (A and B), theoretically, there should be 3×3×2=18 conditional likelihood functions to account for all possible genotype combinations.
However, combinations such as AAAB_A, AAAB_B, ABBB_A, and ABBB_B are missing in the current implementation.
For instance, in the function we have:
However, functions for genotypes like AAAB and ABBB are not defined, although these combinations could be relevant under certain conditions.
Questions:
Is there a specific reason why some genotype combinations were excluded?
Would adding these missing combinations (such as AAAB_A, AAAB_B, ABBB_A, and ABBB_B) improve the robustness of the likelihood calculations, especially in cases where these genotypes might occur?
Proposed Solution: If there are no specific constraints, I suggest defining conditional likelihood functions for all 18 combinations to ensure comprehensive genotype likelihood coverage.
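To make the gap easy to check, here is a minimal sketch that enumerates all 18 expected names and reports which ones are absent. It assumes that the function returns a dict keyed by names of the form `<genotype><genotype>_<base>`, which I inferred from names such as AAAB_A; the helper names `all_combination_keys` and `missing_keys` are hypothetical and not part of the existing code.

```python
from itertools import product

# Assumed building blocks: three genotypes per position and two observed bases.
GENOTYPES = ("AA", "AB", "BB")
BASES = ("A", "B")


def all_combination_keys():
    """Return every expected key, e.g. 'AAAB_A' (3 x 3 x 2 = 18 in total)."""
    return [
        f"{first}{second}_{base}"
        for first, second, base in product(GENOTYPES, GENOTYPES, BASES)
    ]


def missing_keys(likelihood_dict):
    """List the expected keys that are not present in the given dict."""
    return [key for key in all_combination_keys() if key not in likelihood_dict]


if __name__ == "__main__":
    # Hypothetical stand-in for the dict built by the real function.
    example = {"AAAA_A": None, "AAAA_B": None}
    print(len(all_combination_keys()))
    print(missing_keys(example))
```

Running `missing_keys` on the dict returned by create_conditional_likelihood_of_base_dict would show exactly which combinations are not covered, if the key format assumption holds.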
Thank you for your time. I look forward to your insights on this!