caravagnalab / rRACES

R wrapper for the RACES package
GNU General Public License v3.0

Simulate sequencing with Illumina error #111

Closed: giorgiagandolfi closed this issue 1 month ago

giorgiagandolfi commented 1 month ago

Hi, I am trying to simulate sequencing data from my rRACES phylogenetic forest using the basic Illumina sequencer with errors (BasicIlluminaSequencer). This is my script:

rm(list = ls())
library(rRACES)
library(dplyr)
library(ggplot2)
library(patchwork)

seed <- 12345
set.seed(seed)

phylo_forest <- load_phylogenetic_forest("data/phylo_forest.sff")

# Simulate sequencing ####

basic_seq <- new(BasicIlluminaSequencer, 4e-3)

simulate_seq(phylo_forest, coverage = 80, write_SAM = TRUE, update_SAM = TRUE,
             sequencer = basic_seq,
             output_dir = "/orfeo/LTS/CDSLab/LT_storage/ggandolfi/races_simulations/sequencing_80X_with_error",
             rnd_seed = seed)

But I got this error:

 *** caught segfault ***
address 0x7a930b4e0, cause 'memory not mapped'

Traceback:
 1: .External(list(name = "InternalFunction_invoke", address = <pointer: 0x19dffb0>,     dll = list(name = "Rcpp", path = "/orfeo/cephfs/opt/programs/intel/fedora37/R/4.2.3/lib64/R/library/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0xc9b190>, info = <pointer: 0x1437d40>),     numParameters = -1L), <pointer: 0x2ef2990>, phylo_forest,     sequencer, chromosomes, coverage, read_size, insert_size,     output_dir, write_SAM, update_SAM, cell_labelling, purity,     with_normal_sample, rnd_seed)
 2: simulate_seq(phylo_forest, coverage = 80, write_SAM = TRUE, update_SAM = TRUE,     sequencer = basic_seq, output_dir = "/orfeo/LTS/CDSLab/LT_storage/ggandolfi/races_simulations/sequencing_80X_with_error")
An irrecoverable exception occurred. R is aborting now ...
/var/spool/slurm/d/job434972/slurm_script: line 18: 1814660 Segmentation fault      (core dumped) Rscript 03_sequencing.R

albertocasagrande commented 1 month ago

@giorgiagandolfi, can you try to execute the following R lines?

library(rRACES)

phylo_forest <- load_phylogenetic_forest("data/phylo_forest.sff")

phylo_forest$get_nodes()

giorgiagandolfi commented 1 month ago

Here are the results:

> phylo_forest$get_nodes() %>% head()
  cell_id ancestor  mutant epistate sample birth_time
1       0       NA Clone 1            <NA>    0.00000
2       2        0 Clone 1            <NA>   10.28257
3       4        2 Clone 1            <NA>   13.18403
4       5        4 Clone 1            <NA>   25.02455
5      11        5 Clone 1            <NA>   32.17671
6      20       11 Clone 1            <NA>   42.06884

Also, here is a summary of the sample column:

> phylo_forest$get_nodes() %>% count(sample)
          sample    n
1 SPN01_Sample_1  224
2 SPN01_Sample_2  225
3 SPN01_Sample_3  225
4           <NA> 3009

albertocasagrande commented 1 month ago

Can you post the data/phylo_forest.sff file?

albertocasagrande commented 1 month ago

Do you have the same problem when you build the phylogenetic forest from scratch? Can you post a complete example that raises the issue?

albertocasagrande commented 1 month ago

Everything seems to work correctly. The segfault could be related to a library upgrade: when a library is upgraded, the R packages built against it should be rebuilt.
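
One way to look for packages that may need rebuilding is sketched below. This is only a rough check, not something taken from the rRACES documentation: it compares the R version each installed package was built under (the Built field reported by installed.packages()) with the running R version. It will not detect an upgrade of a system library, so an empty result does not rule out the problem. The helper rebuild_rraces() is a hypothetical name introduced here for illustration; it assumes the devtools package is installed and that rRACES is installed from the caravagnalab/rRACES GitHub repository.

```r
# Sketch: flag installed packages built under a different R version
# than the one currently running.
running_r <- as.character(getRversion())

# installed.packages() returns a character matrix; the "Built" column
# holds the R version each package was built under.
pkgs <- as.data.frame(
  installed.packages()[, c("Package", "Version", "Built"), drop = FALSE],
  stringsAsFactors = FALSE
)

# Keep only the packages whose build version differs from the running R.
stale <- pkgs[pkgs$Built != running_r, , drop = FALSE]
print(stale, row.names = FALSE)

# Hypothetical helper (not part of rRACES): rebuild Rcpp and rRACES from source.
# It is defined but not called here, because it needs network access,
# a configured CRAN mirror, and a compiler toolchain.
rebuild_rraces <- function() {
  install.packages("Rcpp", type = "source")
  devtools::install_github("caravagnalab/rRACES", force = TRUE)
}
```

If stale lists Rcpp or rRACES, or if a system library was upgraded recently, calling rebuild_rraces() in a fresh R session and rerunning the sequencing script would be a reasonable next step.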