mortonjt / mds-approximations

Repository to hold new MDS approximation code. The goal is to test all the ones we have access to and move only the best to skbio.

Simulation code #3

Open mortonjt opened 7 years ago

mortonjt commented 7 years ago

We'll want rigorous benchmarking datasets to test the integrity of these metrics.

Now, I do have a separate benchmarking repository so we can probably port over some of the code there to fit our needs.

I'm thinking that this should be a two-tier simulation:

  1. Obtain real datasets from well-characterized studies
  2. Subsample reads from these datasets via multinomial sampling
  3. Perform the matrix completion and determine how well it fits the original data
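For step 3, one possible way to score the fit is the relative Frobenius error between the original table and the completed one. This is only a sketch: the metric choice and the function name `relative_frobenius_error` are assumptions, not something settled in this issue.

```python
import numpy as np


def relative_frobenius_error(original, reconstructed):
    """Return ||original - reconstructed||_F / ||original||_F.

    Hypothetical helper: the choice of metric is an assumption,
    not something decided in this issue.
    """
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    if original.shape != reconstructed.shape:
        raise ValueError("matrices must have the same shape")
    denom = np.linalg.norm(original)  # Frobenius norm for 2-D input
    if denom == 0:
        raise ValueError("original matrix is all zeros")
    return np.linalg.norm(original - reconstructed) / denom
```

A value of 0 means the completed matrix matches the original exactly; larger values mean a worse fit.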

We'll need to figure out how we want to subsample reads. We'll probably want to assign a non-zero probability to every feature, but give a feature an extremely small probability of being observed if it wasn't observed in the original study.
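A minimal sketch of that subsampling idea, assuming a single sample's counts are held in a 1-D array. The function name `subsample_reads`, the `pseudocount` parameter, and its default value are all hypothetical and would need tuning:

```python
import numpy as np


def subsample_reads(counts, depth, pseudocount=1e-6, seed=None):
    """Draw `depth` reads from one sample via multinomial sampling.

    Features with a zero count in the original sample get a small
    weight (`pseudocount`, an assumed value) so that every feature
    has a non-zero probability of being observed.
    """
    counts = np.asarray(counts, dtype=float)
    if counts.ndim != 1:
        raise ValueError("counts must be a 1-D array for a single sample")
    if counts.sum() <= 0:
        raise ValueError("sample has no reads to subsample from")
    rng = np.random.default_rng(seed)
    # Observed features keep their counts; unobserved ones get the pseudocount.
    weights = np.where(counts > 0, counts, pseudocount)
    probs = weights / weights.sum()
    return rng.multinomial(depth, probs)


# Example: subsample 1000 reads from a sample with one unobserved feature.
sample = [50, 0, 30, 20]
subsampled = subsample_reads(sample, depth=1000, seed=0)
```

The subsampled vector always sums to `depth`. With a pseudocount this small, the unobserved feature will almost never be drawn, which matches the intent described above.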