Closed by rdmorin 2 years ago
get_gene_expression was updated to take two new parameters, all_genes and expression_data.
If all_genes is set to TRUE, the full expression data frame is returned, with no subsetting on the genes specified in either hugo_symbols or ensemble_gene_ids. The argument check that used to raise an error when neither hugo_symbols nor ensemble_gene_ids was supplied has been updated so that no error is raised when all_genes is TRUE and no genes are specified.
The additional optional parameter, expression_data, accepts an already-loaded expression data frame and uses it directly, so the data does not have to be read into R again (from flat file or database).
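The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: get_gene_expression_sketch and load_expression are hypothetical names, the toy data frame and its column names (sample_id, Hugo_Symbol, ensembl_gene_id, expression) are assumptions, and only the parameter names all_genes, expression_data, hugo_symbols and ensemble_gene_ids are taken from this thread.

```r
# Hypothetical stand-in for the real loader (flat file or database).
# The columns and values here are toy data, assumed for illustration.
load_expression <- function() {
  data.frame(
    sample_id = c("S1", "S1", "S2", "S2"),
    Hugo_Symbol = c("MYC", "BCL2", "MYC", "BCL2"),
    ensembl_gene_id = c("ENSG00000136997", "ENSG00000171791",
                        "ENSG00000136997", "ENSG00000171791"),
    expression = c(10.2, 7.5, 11.1, 6.3),
    stringsAsFactors = FALSE
  )
}

# Hypothetical sketch of the argument handling described in this issue.
get_gene_expression_sketch <- function(hugo_symbols = NULL,
                                       ensemble_gene_ids = NULL,
                                       all_genes = FALSE,
                                       expression_data = NULL) {
  # Only complain about missing genes when the full table was not requested.
  if (is.null(hugo_symbols) && is.null(ensemble_gene_ids) && !all_genes) {
    stop("Provide hugo_symbols or ensemble_gene_ids, or set all_genes = TRUE")
  }

  # Reuse a data frame that is already in memory; otherwise load it.
  expr <- if (is.null(expression_data)) load_expression() else expression_data

  # all_genes = TRUE skips subsetting entirely.
  if (all_genes) {
    return(expr)
  }

  # Subset on whichever identifier type was supplied.
  if (!is.null(hugo_symbols)) {
    expr[expr$Hugo_Symbol %in% hugo_symbols, , drop = FALSE]
  } else {
    expr[expr$ensembl_gene_id %in% ensemble_gene_ids, , drop = FALSE]
  }
}

# Load once, then query several gene sets without re-reading the data.
full_df <- get_gene_expression_sketch(all_genes = TRUE)
myc_df <- get_gene_expression_sketch(hugo_symbols = "MYC",
                                     expression_data = full_df)
bcl2_df <- get_gene_expression_sketch(ensemble_gene_ids = "ENSG00000171791",
                                      expression_data = full_df)
```

In an interactive session the first call pays the loading cost once, and every later call passes full_df back in through expression_data. Calling the sketch with no genes and all_genes left at FALSE still stops with an error, which guards against returning a massive data frame by accident.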
Examples for the function have also been updated to reflect the above-described update.
These changes have been pushed in this commit
This issue has been resolved in the commit described above.
Currently the get_gene_expression function requires the user to specify a set of gene IDs (ENSG or HGNC), and it subsets the tidy data frame based on that information. We should add functionality that lets the user ask for the full matrix back. An empty gene list is probably not the right approach, since it could unintentionally give an unsuspecting user a massive data frame. If we add another parameter that defaults to FALSE (all_genes=FALSE), then check for that OR a gene list, we should be able to return the full data frame. To make this functionality helpful, we also need the function to accept the same data frame as input and, when provided, use it directly and skip the step of loading it from disk. The purpose is to avoid users having to re-load the data from disk multiple times if they plan on running this function on different gene sets in an interactive session. Hence, the function will need a second new, optional argument, full_expression_df or something similarly named.