GAMBLR already has a function that generates the feature matrix for simple mutations (get_coding_ssm_status()) and another that returns the copy number state of the gene of interest (get_cn_states()). What is missing is a wrapper function that runs both and generates a single binary matrix (0 for absence of the feature, 1 for its presence) in which either a mutation or a CNV counts as the feature.
In this example, three samples do not have an SSM (the ssm_matrix has 0 for mutation presence), but they do have a CNV (the cn_matrix has a copy number higher than 2). The new function will aggregate these events, so all samples in the example will have 1 for the combined feature MYC_Mut_or_AMP.
Since the function will handle CN data, it should take a parameter that sets the cutoff on absolute CN above which an event counts as a feature (for example, so that we can disregard one-copy gains with a CN of 3, or two-copy gains with a CN of 4, and so on).
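To make the request concrete, here is a minimal sketch of the aggregation in R. It does not call GAMBLR; it assumes both inputs are data frames with a sample_id column and one column per gene, which may not match what get_coding_ssm_status() and get_cn_states() actually return. The function name combine_ssm_and_cn(), the cn_cutoff and suffix parameters, and the toy values are all hypothetical.

```r
# Hypothetical helper, not part of GAMBLR. Assumes both inputs have a
# `sample_id` column plus one numeric column per gene.
combine_ssm_and_cn <- function(ssm_matrix, cn_matrix, genes,
                               cn_cutoff = 3, suffix = "_Mut_or_AMP") {
  # Align the CN rows to the sample order of the SSM matrix
  row_index <- match(ssm_matrix$sample_id, cn_matrix$sample_id)
  cn_matrix <- cn_matrix[row_index, , drop = FALSE]

  combined <- data.frame(sample_id = ssm_matrix$sample_id,
                         stringsAsFactors = FALSE)
  for (gene in genes) {
    mutated <- ssm_matrix[[gene]] == 1
    # An event counts only if the absolute CN reaches the cutoff
    amplified <- cn_matrix[[gene]] >= cn_cutoff
    # Samples with no CN data produce NA; treat them as not amplified
    amplified[is.na(amplified)] <- FALSE
    combined[[paste0(gene, suffix)]] <- as.integer(mutated | amplified)
  }
  combined
}

# Toy data with made-up values: three samples without an SSM but with CN > 2
ssm_matrix <- data.frame(sample_id = c("S1", "S2", "S3", "S4"),
                         MYC = c(0, 0, 0, 1))
cn_matrix <- data.frame(sample_id = c("S3", "S1", "S2", "S4"),
                        MYC = c(4, 3, 6, 2))

# Default cutoff: any gain (CN of 3 or more) counts as a feature
default_result <- combine_ssm_and_cn(ssm_matrix, cn_matrix, genes = "MYC")

# Stricter cutoff: gains with a CN of 3 or 4 are disregarded
strict_result <- combine_ssm_and_cn(ssm_matrix, cn_matrix, genes = "MYC",
                                    cn_cutoff = 5)
```

With the default cutoff, every toy sample gets 1 for MYC_Mut_or_AMP. With cn_cutoff = 5, only the mutated sample and the sample with a CN of 6 keep the feature. Whether the cutoff should be inclusive, and what its default should be, are open design choices.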