UMCUGenetics / MutationalPatterns

R package for extracting and visualizing mutational patterns in base substitution catalogues
MIT License

Cosine similarity to COSMIC DBS and Indel signatures #62

Closed (maanmi closed this issue 3 years ago)

maanmi commented 3 years ago

Dear authors,

I was very excited to find out that v3 supports DBS and indels. However, my ultimate goal is to calculate the cosine similarity between a mutational profile and all types of COSMIC signatures. The latest vignette only shows how to do this for SBS signatures, and I've tried different approaches for DBS and indels without success. Is there a way to do this with the current release? If not, could it be added as a feature in the next release?

Thank you very much in advance.

FreekManders commented 3 years ago

Hi, this should work for DBS and indels as well. For indels, you can pass the indel counts and indel signatures to the cos_sim_matrix function. For DBS, you would use the DBS counts and DBS signatures instead. Below is an example of how to do this for indels.

library(MutationalPatterns)

# Get indel counts from example data
indel_counts <- readRDS(system.file("states/blood_indel_counts.rds",
  package = "MutationalPatterns"
))

# Get signatures
indel_signatures <- get_known_signatures("indel")

# Calculate cosine similarity
cos_sim_matrix(indel_counts, indel_signatures)
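
For DBS, the same pattern might look like the sketch below. The file name `states/blood_dbs_counts.rds` is an assumption made by analogy with the indel example data, and the variable names are only illustrative. With your own data you would pass your own DBS count matrix instead.

```r
library(MutationalPatterns)

# Get DBS counts from the package's example data
# (assumed file name, by analogy with the indel example)
dbs_counts <- readRDS(system.file("states/blood_dbs_counts.rds",
  package = "MutationalPatterns"
))

# Get the known DBS signatures
dbs_signatures <- get_known_signatures("dbs")

# Calculate cosine similarity:
# one row per sample, one column per signature
dbs_cos_sim <- cos_sim_matrix(dbs_counts, dbs_signatures)
```

Both inputs need the same mutation contexts in the same row order, since the comparison is made column against column.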

maanmi commented 3 years ago

Thanks Freek, that worked quite well actually.