Simulate with existing data

Neurergus / MOSim

Multi-Omics Simulation in R

Simulate with existing data #1

Open llrs opened 4 years ago

llrs commented 4 years ago

Hi, Nice package for multi-omics! Many thanks for developing and maintaining it.

I have a multi-omic dataset and wanted to create a simulated dataset based on it. That dataset has a block of data coming from 16S sequencing. Is there any way to simulate it with MOSim? I expected that I could provide my own data as the basis for the synthetic dataset, but then I read that the data must be of one of the accepted data types, and I only see these:

RNA-seq (compulsory)
DNase-seq
ChIP-seq
Methyl-seq
miRNA-seq

Also, I'm not familiar enough with the code base: how are the provided datasets used for the simulation?

Many thanks

carolinamonzo commented 1 year ago

Hi llrs,

Unfortunately, MOSim doesn't simulate data coming from 16S sequencing. It simulates a set of data types using either the provided dataset from the STATegra project or input data supplied by the user (corresponding to the data types you listed). It uses a negative binomial distribution to simulate RNA-seq, DNase-seq, ChIP-seq and miRNA-seq data, and a binomial distribution for Methyl-seq. If your 16S dataset follows a negative binomial distribution, you could try the MOSim method on it, but I cannot be sure it would give you the results you are looking for.
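As a rough sanity check of the negative binomial assumption, one option is to compare the mean and variance of each feature across samples. The sketch below is not part of MOSim and does not reflect how MOSim fits its distributions. It is a generic method-of-moments check in plain Python, and the function name `nb_moment_check` and the toy counts are made up for illustration. It assumes the negative binomial parameterisation var = mu + mu^2 / size, so a variance well above the mean gives a finite positive `size`.

```python
from statistics import mean, variance


def nb_moment_check(counts):
    """Return (mean, variance, size) for one feature's counts across samples.

    size is the method-of-moments estimate of the negative binomial size
    parameter, obtained from var = mu + mu**2 / size. It is None when the
    variance does not exceed the mean, i.e. there is no overdispersion
    for a negative binomial to capture.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    if var <= mu:
        return mu, var, None
    return mu, var, mu ** 2 / (var - mu)


# Hypothetical toy data: one entry per taxon, one count per sample.
toy_counts = {
    "taxon_a": [0, 3, 12, 1, 40, 7],
    "taxon_b": [5, 6, 4, 5, 6, 4],
}

for taxon, counts in toy_counts.items():
    mu, var, size = nb_moment_check(counts)
    label = "overdispersed" if size is not None else "not overdispersed"
    print(f"{taxon}: mean={mu:.2f} var={var:.2f} size={size} ({label})")
```

A feature whose variance is far above its mean is at least compatible with a negative binomial model. This check says nothing about excess zeros or the compositional nature of 16S counts, so it is only a first filter.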

Best wishes, Carolina

Fred-White94 commented 1 year ago

Hello,

I'd like to reopen this discussion, as I have a similar question that was not fully answered: "how are the provided datasets used for the simulation?" I can see in the manual/vignette that they are used to extract IDs and to approximate the distribution. I haven't delved fully into the code, but is there anywhere else where the input/seed data is used to adapt the simulation?

Thank you for your time.