LundBladderCancerGroup / LundTaxonomy2023Classifier

GNU General Public License v2.0

Plotting error - data imputation does not extend to signature genes #1

Open heathergeigerbrion opened 2 months ago

heathergeigerbrion commented 2 months ago

I am getting an error in the plot_signatures function:

Error in quantile.default(late_early, 0.05) : missing values and NaN's not allowed if 'na.rm' is FALSE

Looking at the function in more detail, the likely cause is that some signature gene names/IDs are missing from my expression matrix. I ran the classifier with impute=TRUE, but the imputation appears to apply only to the classification step: the imputed data is not stored in the results object, so plot_signatures does not pick it up.

Let me know if you need any more information. I have tried running with both Ensembl gene IDs and gene symbols (converted myself using the symbols from the GTF file), and I get a similar error either way.
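
For anyone hitting the same message, here is a minimal sketch in base R of how missing signature genes can be detected and how an NA in a score vector triggers this class of error. The gene names, the `signature_genes` vector and the `score` values are placeholders made up for illustration; they are not the package's actual signature genes or internals.

```r
# Toy expression matrix (genes x samples); all names are placeholders.
expr <- matrix(
  c(5.1, 4.8, 6.0,
    2.2, 2.9, 3.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GENE_A", "GENE_B"), c("S1", "S2", "S3"))
)

# Hypothetical signature gene set; GENE_C is absent from the matrix.
signature_genes <- c("GENE_A", "GENE_B", "GENE_C")

# Genes the signature needs but the matrix does not contain.
missing_genes <- setdiff(signature_genes, rownames(expr))
print(missing_genes)

# A score vector containing NA makes quantile() fail with its default na.rm = FALSE.
score <- c(0.2, NA, 0.7)
err <- tryCatch(
  quantile(score, 0.05),
  error = function(e) conditionMessage(e)
)
print(err)

# With na.rm = TRUE the quantile is computed on the non-missing values only.
q05 <- quantile(score, 0.05, na.rm = TRUE)
```

Running the `setdiff()` check against your own matrix row names before plotting shows which identifiers are absent. Whether dropping NAs is appropriate depends on how the score is meant to be interpreted.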

LundBladderCancerGroup commented 2 months ago

Hi, it does seem that your matrix is missing genes involved in the cell cycle signatures. The imputation function is meant only for the classification step: it imputes the result of the gene pair rules (TRUE or FALSE), not the gene expression values themselves. I have updated the plotting function so that it accounts for the possibility of missing genes, but it will simply skip plotting any genes that are missing. If this affects any of the scores that are calculated from different sets of genes, it will give you a warning. I hope it works now; let me know if you find any more problems. Thank you for using the package!
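
The skip-and-warn behaviour described above can be sketched roughly as follows. This is an illustration only, not the package's code: `score_signature` is a hypothetical helper, the gene names are placeholders, and a plain column mean is assumed as the score.

```r
# Hypothetical helper, NOT part of LundTaxonomy2023Classifier.
# Scores each sample as the mean expression of the signature genes that
# are present, and warns about the ones that are missing.
score_signature <- function(expr, genes) {
  present <- intersect(genes, rownames(expr))
  absent <- setdiff(genes, rownames(expr))
  if (length(absent) > 0) {
    warning(
      "Signature genes missing from the matrix and skipped: ",
      paste(absent, collapse = ", "),
      call. = FALSE
    )
  }
  if (length(present) == 0) {
    # Nothing to score: return NA for every sample.
    return(setNames(rep(NA_real_, ncol(expr)), colnames(expr)))
  }
  colMeans(expr[present, , drop = FALSE])
}

# Toy data; GENE_C is deliberately absent.
expr <- matrix(
  c(5.1, 4.8, 6.0,
    2.2, 2.9, 3.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("GENE_A", "GENE_B"), c("S1", "S2", "S3"))
)

# Emits a warning naming GENE_C, then scores from the two genes present.
scores <- suppressWarnings(score_signature(expr, c("GENE_A", "GENE_B", "GENE_C")))
```

In this sketch a score built from an incomplete gene set is still returned, so the warning is the only signal that it may not be comparable to scores computed from the full set.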