morinlab / GAMBLR

Set of standardized functions to operate with genomic data
MIT License

get_gene_expression doesn't work if genes are provided as factors #237

Open rdmorin opened 1 year ago

rdmorin commented 1 year ago

When get_gene_expression is run with a vector of Hugo_Symbol or Ensembl_ID values supplied as a factor (instead of a character vector), it doesn't work properly. Behind the scenes, the factor's underlying integer codes are used in the grep instead of the actual gene names. This should be an easy fix: I suggest we explicitly cast the genes vector to a character vector inside the function using as.character().

Here's the output I see if I provide Hugo_Symbol directly from the bundled wright_genes data frame.

grep -w -F -e Hugo_Symbol -e 103 -e 161 -e 108 -e 102 -e 137 -e 82 -e 97 -e 91 -e 98 -e 184 -e 100 -e 171 -e 146 -e 147 -e 157 -e 11 -e 53 -e 18 -e 92 -e 99 -e 175
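
The numbers in the command above are consistent with a factor's integer codes leaking into the pattern list. A minimal sketch of one way this can happen in base R, and of the proposed as.character() cast, is below. The three gene names are a made-up stand-in for the wright_genes column, and combining with c() is only an assumption about how the patterns might be assembled, not a description of the actual get_gene_expression code.

```r
# Hypothetical stand-in for wright_genes$Hugo_Symbol stored as a factor.
genes <- factor(c("MYC", "BCL2", "BCL6"))

# c() dispatches on its first argument. With a character string first, the
# default method is used, the factor's attributes are dropped, and its
# integer codes (positions in the sorted levels) are coerced to strings.
bad_patterns <- c("Hugo_Symbol", genes)
print(bad_patterns)   # "Hugo_Symbol" "3" "1" "2"

# Proposed fix: cast to character before doing anything else with the vector.
good_patterns <- c("Hugo_Symbol", as.character(genes))
print(good_patterns)  # "Hugo_Symbol" "MYC" "BCL2" "BCL6"
```

Casting at the top of the function would be harmless for callers who already pass a character vector, since as.character() returns such input unchanged.
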
rdmorin commented 11 months ago

Still an open issue? Maybe we need to drop this grep-based support entirely.