Simulation code

We'll want rigorous benchmarking datasets to test the integrity of these metrics.

I do have a separate benchmarking repository, so we can probably port some of that code over to fit our needs.

I'm thinking that this should be a two-tier simulation:

1. Obtain real datasets from well-characterized studies.
2. Subsample reads from these datasets via multinomial sampling.
3. Perform the matrix completion and determine how well it fits the original data.
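As a rough sketch of the subsampling and scoring steps, something like the following could work. The names `subsample_counts` and `relative_error` are hypothetical, the samples-by-features layout is an assumption, and the relative Frobenius error is only one possible way to measure fit. The matrix completion itself is not shown; `estimate` stands in for whatever that step returns.

```python
import numpy as np


def subsample_counts(counts, depth, rng):
    """Draw `depth` reads per sample from a (samples x features) count table.

    Assumes every sample (row) has at least one observed read.
    """
    counts = np.asarray(counts, dtype=float)
    # Per-sample proportions serve as the multinomial probabilities.
    probs = counts / counts.sum(axis=1, keepdims=True)
    return np.vstack([rng.multinomial(depth, p) for p in probs])


def relative_error(original, estimate):
    """Relative Frobenius error between the original and an estimate."""
    original = np.asarray(original, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return np.linalg.norm(original - estimate) / np.linalg.norm(original)


rng = np.random.default_rng(0)
# Toy table for illustration only, not real study data.
table = np.array([[50, 30, 20, 0],
                  [10, 0, 60, 30]])
sub = subsample_counts(table, depth=40, rng=rng)
```

Note that with plain proportions, anything unobserved in the original table can never be drawn in the subsample.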

We'll need to figure out how we want to subsample reads. We'll probably want to assign non-zero probabilities to everything, with an extremely small probability of being observed for anything that wasn't observed in the original study.
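One way to do this, sketched here under assumptions, is to add a small pseudocount to every entry before normalizing. The function name `smoothed_probabilities` and the default pseudocount of `1e-6` are illustrative choices, not something we've settled on.

```python
import numpy as np


def smoothed_probabilities(counts, pseudocount=1e-6):
    """Turn one sample's counts into probabilities that are all non-zero.

    The pseudocount value is an assumption; it controls how unlikely an
    originally unobserved entry is to appear in a subsample.
    """
    counts = np.asarray(counts, dtype=float)
    smoothed = counts + pseudocount
    return smoothed / smoothed.sum()


rng = np.random.default_rng(42)
# Toy sample for illustration only; entries 1 and 3 were unobserved.
sample = np.array([120, 0, 45, 0, 35])
probs = smoothed_probabilities(sample)
draw = rng.multinomial(100, probs)
```

A larger pseudocount makes unobserved entries show up more often, so the value would need to be tuned against the sequencing depths we simulate.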

mortonjt / mds-approximations

Simulation code #3