recommend adding example data files for documentation

LebeerLab / tidytacos

Functions to manipulate and visualize microbial community data

https://lebeerlab.github.io/tidytacos/

GNU General Public License v3.0

recommend adding example data files for documentation #50

Closed kelly-sovacool closed 1 month ago

kelly-sovacool commented 3 months ago

The quickstart explains how to load files into tidytacos:

taco <- read_tidytacos("/path/to/my_data")

but then proceeds to use the urt dataset, which is already a tidytacos object, for the rest of the tutorial.

It would be helpful to have an example file in your package in inst/extdata and use it in the tutorial, so users will have a better understanding of the expected file format.

Additionally, it would be best practice to have code in data-raw to show how urt and leaf were created. https://r-pkgs.org/data.html#sec-data-data-raw

kelly-sovacool commented 3 months ago

Moreover, the quickstart guide directs users to the source code to learn about more options for importing data:

More options to import and convert your data can be found here.

I strongly discourage this. Novice users should not be expected to read the source code to learn how to use the package. This information should be distilled into a vignette for users to read.

kelly-sovacool commented 1 month ago

I see you added raw data to data-raw. The R package convention is to place raw data files in inst/extdata/ and include R scripts in data-raw/ showing how datasets in data/ were generated. data-raw/ is not installed with the package, so users will not be able to access the raw files unless they are in inst/extdata/.

This means the CSV files currently in data-raw/ should be moved to inst/extdata/ (see https://r-pkgs.org/data.html#sec-data-extdata), and you should create R scripts in data-raw/ that show how the binary datasets in data/ were created (see https://r-pkgs.org/data.html#sec-data-data-raw).
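A minimal sketch of what such a data-raw/ script could look like, assuming the raw files for leaf end up in inst/extdata/tidytacos/leaf and that the dataset can be rebuilt with read_tidytacos(). The script name and the path are hypothetical, and the actual steps used to create leaf and urt may differ.

```r
# data-raw/leaf.R (hypothetical script name; run from the package root)
# Assumption: the raw example files live in inst/extdata/tidytacos/leaf.
raw_dir <- file.path("inst", "extdata", "tidytacos", "leaf")

if (dir.exists(raw_dir)) {
  # Rebuild the tidytacos object from the raw files.
  leaf <- tidytacos::read_tidytacos(raw_dir)
  # Save it to data/leaf.rda so it ships with the package as `leaf`.
  usethis::use_data(leaf, overwrite = TRUE)
} else {
  message("Raw files not found at ", raw_dir, "; run this from the package root.")
}
```

Keeping the script under version control documents where the dataset came from, even though data-raw/ itself is not installed with the package.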

kelly-sovacool commented 1 month ago

On a related note, this snippet in the README does not work for end users:

https://github.com/LebeerLab/tidytacos/blob/5b585c2c140ee08474729d2dc52bec8ea9c88cc3/README.md?plain=1#L60

Once you move the example files to their proper locations as described above, you will be able to do something like this instead:

taco <- read_tidytacos(system.file("extdata", "tidytacos", "leaf", package = "tidytacos"))

Similarly, it would be nice (but not strictly required) if your dada and phyloseq examples worked: https://github.com/LebeerLab/tidytacos/blob/5b585c2c140ee08474729d2dc52bec8ea9c88cc3/README.md?plain=1#L37-L44
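A hedged sketch of how the system.file() call suggested above behaves. By default system.file() returns an empty string when the package or the file cannot be found, so a README snippet can check for that before reading. The extdata/tidytacos/leaf location is the one proposed in this thread, not necessarily where the files currently live.

```r
# Look up the example directory inside the installed package.
# system.file() returns "" if the package or the path is missing.
example_dir <- system.file("extdata", "tidytacos", "leaf", package = "tidytacos")

if (nzchar(example_dir)) {
  # The example files are installed, so they can be read directly.
  taco <- tidytacos::read_tidytacos(example_dir)
} else {
  message("Example files not found; is tidytacos installed with inst/extdata?")
}
```

Passing mustWork = TRUE to system.file() would instead raise an error when the files are missing, which may be preferable in a vignette that should fail loudly.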