This is an SVM-based model to predict transcriptional AML subtypes.
To use this R package, first align your reads and quantify counts with STAR (we used version 2.7.5c--0). As index and GTF, use the following files: GDC.h38.d1.vd1 STAR2 Index Files (v36) and GDC.h38 GENCODE v36 GTF. An example Snakemake file is included to show the options used and get you started more quickly.
Merge the ReadsPerGene.out.tab files into a matrix with samples as rows and genes as columns. Make sure you select the correct count column for the strandedness of your library. For your R matrix, set sample IDs as rownames and Ensembl IDs (column 1 of ReadsPerGene.out.tab) as colnames. If you already have counts, the colnames should be ordered as in the example file (AMLmapR::example_matrix).
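The merge step could look roughly like the sketch below. It is not part of AMLmapR: `read_star_counts` and `merge_star_counts` are hypothetical helper names, and the toy files and gene IDs exist only to make the example runnable. It relies on the standard STAR layout of ReadsPerGene.out.tab: four summary rows starting with `N_`, then one row per gene with the gene ID in column 1 and counts in column 2 (unstranded), column 3 (read 1 on the same strand as the RNA) or column 4 (reverse-stranded).

```r
# Hypothetical helper (not part of AMLmapR): read one ReadsPerGene.out.tab file
# and return a count vector named by gene ID.
# count_col: 2 = unstranded, 3 = forward-stranded, 4 = reverse-stranded.
read_star_counts <- function(path, count_col = 2) {
  tab <- read.delim(path, header = FALSE, stringsAsFactors = FALSE)
  # Drop the summary rows (N_unmapped, N_multimapping, N_noFeature, N_ambiguous)
  tab <- tab[!grepl("^N_", tab[[1]]), ]
  stats::setNames(tab[[count_col]], tab[[1]])
}

# Hypothetical helper: stack per-sample vectors into a samples x genes matrix.
merge_star_counts <- function(paths, sample_ids, count_col = 2) {
  counts <- lapply(paths, read_star_counts, count_col = count_col)
  gene_ids <- names(counts[[1]])
  # All files must list the same genes in the same order
  same_genes <- vapply(counts, function(x) identical(names(x), gene_ids), logical(1))
  stopifnot(all(same_genes))
  mat <- do.call(rbind, counts)
  rownames(mat) <- sample_ids
  mat
}

# Toy demonstration with two made-up samples and two made-up gene IDs
demo_dir <- tempfile("star_demo_")
dir.create(demo_dir)
write_toy <- function(file, gene_counts) {
  toy <- data.frame(
    id = c("N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous",
           "ENSG_TOY_1", "ENSG_TOY_2"),
    unstranded = c(10, 5, 3, 1, gene_counts),
    forward = c(10, 5, 3, 1, gene_counts),
    reverse = c(10, 5, 3, 1, 0, 0)
  )
  write.table(toy, file, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}
paths <- file.path(demo_dir, c("sample_1_ReadsPerGene.out.tab",
                               "sample_2_ReadsPerGene.out.tab"))
write_toy(paths[1], c(120, 0))
write_toy(paths[2], c(45, 7))

count_matrix <- merge_star_counts(paths, c("sample_1", "sample_2"), count_col = 2)
print(count_matrix)
```

On real data, the resulting matrix would still need its columns reordered to match `colnames(AMLmapR::example_matrix)` before prediction.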
First install the package.
library(devtools)
install_github("jeppeseverens/AMLmapR")
Then you can predict the transcriptional subtypes for your AML cases. Important: do not normalise or log transform counts.
library(AMLmapR)
# Input should be a matrix with samples as rows and genes as columns.
# Use as.matrix(your_counts[, colnames(AMLmapR::example_matrix)]) on your own data if needed.
example_matrix <- AMLmapR::example_matrix
# Predict classes
predictions <- predict_AML_clusters(example_matrix)
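Because the input must not be normalised or log transformed, a quick sanity check before calling the prediction function can catch mistakes early. The following is a minimal sketch, not part of AMLmapR: `looks_like_raw_counts` is a hypothetical helper that only tests whether every value is a non-negative whole number. Raw counts always pass this test, while most normalised or log-transformed matrices fail it.

```r
# Hypothetical helper (not part of AMLmapR): TRUE if the matrix contains only
# non-negative whole numbers, as expected for raw STAR counts.
looks_like_raw_counts <- function(mat) {
  is.matrix(mat) && is.numeric(mat) && !anyNA(mat) &&
    all(mat >= 0) && all(mat == round(mat))
}

# Toy matrix with made-up sample and gene names
raw <- matrix(c(10, 0, 250, 31), nrow = 2,
              dimnames = list(c("s1", "s2"), c("g1", "g2")))

raw_ok <- looks_like_raw_counts(raw)            # TRUE
log_ok <- looks_like_raw_counts(log2(raw + 1))  # FALSE: values are no longer whole numbers
print(c(raw = raw_ok, log = log_ok))
```

Passing the check does not prove the counts are raw (rounded normalised values would pass too), so treat it as a guard against obvious mistakes only.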
If you use this work in a publication, please cite my publication:
Mapping AML heterogeneity - multi-cohort transcriptomic analysis identifies novel clusters and divergent ex-vivo drug responses https://www.nature.com/articles/s41375-024-02137-6
This work is available under the CC BY-NC-SA 4.0 license. You can share and adapt this work, as long as you give appropriate credit. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. Publication of this work is meant to contribute to open scientific research. You may not use the material for commercial purposes.
Jeppe Severens - the Netherlands Mail: jeppe.severens@gmail.com