microbiome / benchmarking

Benchmarking miaverse performance
Artistic License 2.0

Introduce fixed number of samples and split experiments #2

Closed: RiboRings closed this 2 years ago

RiboRings commented 3 years ago

Hey!

I modified both the long speed comparison and its split version so that the number of samples N can be fixed. In addition, both the long and the split versions now write their output as md files in their corresponding folders.

The plots (example) mostly hint at linearity, but unfortunately I couldn't find any data set with between 3000 and 8000 features, which leaves a gap in the central part of the plots. Any suggestions for how to fill it?

Cheers, RiboRings

antagomir commented 3 years ago

Thanks a lot!

Some suggestions:

1) It is not sustainable to maintain two parallel versions (long and split). I propose removing the long version and keeping only the split versions, because they are easier to debug and extend one by one.

2) Can we have links to each output file (md) in the README, so that it is easy to navigate from the landing page?

3) Add a master script (e.g. main.R) that reproduces the full analysis, i.e. runs the render commands for all Rmd files; then one can reproduce all analyses at once just by running that script.

4) Let us work through melt_benchmark.md as an example; then we can apply a similar strategy to the others. I will open a separate PR on this shortly.
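
As a rough illustration of suggestions 2) and 3), a master script could look something like the sketch below. It is not the script from the repository. The function names (find_rmd_files, render_all, readme_links) are hypothetical, and the sketch assumes that each Rmd file renders to an md file with the same base name in the same folder, and that the rmarkdown package is installed.

```r
# main.R (hypothetical sketch): render every Rmd file and list the outputs.

# Find all Rmd files below `root`, in a stable (sorted) order.
find_rmd_files <- function(root = ".") {
  sort(list.files(root, pattern = "\\.Rmd$", recursive = TRUE,
                  full.names = TRUE))
}

# Render each file to GitHub-flavoured markdown next to its source.
# Assumption: "github_document" is the wanted output format.
render_all <- function(files) {
  for (f in files) {
    message("Rendering ", f)
    # A fresh environment per file keeps the experiments independent.
    rmarkdown::render(f, output_format = "github_document",
                      envir = new.env(parent = globalenv()), quiet = TRUE)
  }
  invisible(files)
}

# Build one markdown link line per expected md output, for pasting
# into the README. Assumption: the md file sits next to its Rmd.
readme_links <- function(files) {
  md <- sub("\\.Rmd$", ".md", files)
  sprintf("- [%s](%s)", basename(md), md)
}

# Run everything only when the script is executed directly
# (e.g. `Rscript main.R`), not when it is sourced from another script.
if (sys.nframe() == 0L) {
  rmd_files <- find_rmd_files(".")
  render_all(rmd_files)
  writeLines(readme_links(rmd_files))
}
```

Running `Rscript main.R` from the repository root would then re-render all analyses and print the link lines for the README. Keeping one Rmd per experiment also means a single failing benchmark can be rendered and debugged on its own.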