caravagnalab / rRACES

R wrapper for the RACES package
GNU General Public License v3.0

Unknown chromosome 6 #79

Closed riccardobergamin closed 9 months ago

riccardobergamin commented 9 months ago

I tried to build a mutation engine with a different chromosome, in this example chromosome 6:

reference_url <- paste0("https://ftp.ensembl.org/pub/grch37/current/",
                        "fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.",
                        "dna.chromosome.6.fa.gz")

SBS_url <- paste0("https://cancer.sanger.ac.uk/signatures/documents/2123/",
                  "COSMIC_v3.4_SBS_GRCh37.txt")

drivers_url <- paste0("https://raw.githubusercontent.com/",
                      "caravagnalab/rRACES/main/inst/extdata/",
                      "driver_mutations_hg19.csv")

passenger_CNAs_url <- paste0("https://raw.githubusercontent.com/",
                             "caravagnalab/rRACES/main/inst/extdata/",
                             "passenger_CNAs_hg19.csv")

germline_url <- paste0("https://www.dropbox.com/scl/fi/g9oloxkip18tr1r",
                       "m6wjve/germline_data_demo.tar.gz?rlkey=15jshul",
                       "d3bqgyfcs7fa0bzqeo&dl=1")

m_engine <- build_mutation_engine(directory = "Chr6",
                                  reference_src = reference_url,
                                  SBS_src = SBS_url,
                                  drivers_src = drivers_url,
                                  passenger_CNAs_src = passenger_CNAs_url,
                                  germline_src = germline_url)

m_engine$add_mutant(mutant_name = "AML",
                    passenger_rates = c(SNV = 1e-8, CNA = 1e-11),
                    driver_SNVs = c(),
                    driver_CNAs = c())

m_engine$add_exposure(time = 0,
                      coefficients = c(SBS1 = 0.2, SBS5 = 0.8))

phylo_forest <- m_engine$place_mutations(forest, 1000)

but I get "Unknown chromosome 6" when I run the place_mutations command.

albertocasagrande commented 9 months ago

Your script downloads the file germline_data_demo.tar.gz to build the mutation engine. This file is for demo purposes only and contains data for chromosome 22 alone, which is why chromosome 6 is reported as unknown.

To work with any other chromosome, users must download the complete germline data set. Please use this URL for GRCh37 or this URL for GRCh38.
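As a rough illustration of the mismatch described above, the following sketch compares the chromosome named in the reference URL with the chromosomes the germline data covers. The helpers reference_chromosome and germline_covers_reference are hypothetical and not part of rRACES, and the chromosome list for the demo archive is hard-coded from the statement above rather than read from the archive itself.

```r
# Hypothetical helpers, not part of rRACES: compare the chromosome named in
# an Ensembl reference URL with the chromosomes a germline archive covers.

# Extract the chromosome label from an Ensembl-style file name such as
# "Homo_sapiens.GRCh37.dna.chromosome.6.fa.gz"; NA if the pattern is absent.
reference_chromosome <- function(reference_src) {
  pattern <- "dna\\.chromosome\\.([^.]+)\\.fa(\\.gz)?$"
  file_name <- basename(reference_src)
  if (!grepl(pattern, file_name)) {
    return(NA_character_)
  }
  sub(paste0("^.*", pattern), "\\1", file_name)
}

# TRUE when the germline data covers the chromosome of the reference.
germline_covers_reference <- function(reference_src, germline_chromosomes) {
  reference_chromosome(reference_src) %in% germline_chromosomes
}

reference_url <- paste0("https://ftp.ensembl.org/pub/grch37/current/",
                        "fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.",
                        "dna.chromosome.6.fa.gz")

# Assumption taken from the explanation above: the demo archive holds
# chromosome 22 only.
demo_germline_chromosomes <- "22"

covered <- germline_covers_reference(reference_url, demo_germline_chromosomes)
if (!covered) {
  message("Chromosome ", reference_chromosome(reference_url),
          " is missing from the germline data; use the complete data set.")
}
```

A check of this kind could run before build_mutation_engine is called, so that the mismatch surfaces before place_mutations fails.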

If a full genome analysis is needed, I suggest using the appropriate set-up code.

# download everything needed for a full genome analysis
# and set up the mutation engine directory
m_engine <- build_mutation_engine(setup_code = "GRCh38")

m_engine$add_mutant(mutant_name = "AML",
                    passenger_rates = c(SNV = 1e-8, CNA = 1e-11),
                    driver_SNVs = c(),
                    driver_CNAs = c())

m_engine$add_exposure(time = 0,
                      coefficients = c(SBS1 = 0.2, SBS5 = 0.8))

# `forest` is assumed to come from a previous simulation (not shown here)
phylo_forest <- m_engine$place_mutations(forest, 1000)

riccardobergamin commented 9 months ago

OK, I see, I didn't think about it. Thank you!