openwashdata / data

The issue tracker on this repository collects ideas for data to be donated, cleaned, and published. Check out current ideas and add your own.
https://github.com/openwashdata/data/issues

[data] Addendum of data related to drying of faecal sludge from on-site sanitation facilities and fresh faeces #1

Open larnsce opened 1 year ago

larnsce commented 1 year ago

One-line summary

The FS Methods book was published with a PDF on Gates Open Research that contains metadata and Dropbox links to approximately 100 MS Excel sheets. These could be a great resource if they were published in a machine-readable and structured format.

Background information

The present document consists of an addendum of data to the Handbook of Methods for Faecal Sludge Analysis. It is part of a project funded by the Bill & Melinda Gates Foundation (BMGF) through grant OPP1164143, entitled “Characterization of faecal material during drying”. Data was shared by partners of the PRG over 5 years. It's commendable that this document exists, but it's no use to anyone in this format.

https://gatesopenresearch.org/documents/4-188

Concrete proposal

Create a data package that contains all the information from the different MS Excel files, together with metadata. Make it searchable, machine-readable, and structured. Steps:

Pros and cons

Pros

Cons

Alternatives

mianzg commented 1 year ago

Explore how to get at the raw data hidden in the data source of each table in the PDF. R package needed: pdftools
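As a rough illustration of the link-collection step, here is a minimal sketch in Python (the thread itself names the R package pdftools; Python is used here only for illustration). It assumes the page text has already been extracted from the PDF and that the Dropbox links appear as plain URLs in that text. If they are embedded as hyperlink annotations instead, a text search like this will not find them. The function name `find_dropbox_links` and the sample URLs are made up for the example.

```python
import re

# Matches Dropbox URLs up to the next whitespace or closing bracket.
DROPBOX_URL = re.compile(r"https?://(?:www\.)?dropbox\.com/[^\s)>\]]+")


def find_dropbox_links(page_text):
    """Return the unique Dropbox URLs in page_text, in order of appearance."""
    seen = []
    for url in DROPBOX_URL.findall(page_text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


# Made-up sample text; these are not real links from the addendum.
sample = (
    "Data source: https://www.dropbox.com/s/EXAMPLE1/file_a.xlsx?dl=0.\n"
    "See also (https://www.dropbox.com/s/EXAMPLE2/file_b.xlsx) and again "
    "https://www.dropbox.com/s/EXAMPLE1/file_a.xlsx?dl=0\n"
)
links = find_dropbox_links(sample)
```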

mianzg commented 1 year ago

@larnsce For this data, each table has the following sections:

  • General information
  • Feedstock
  • Experiment procedure
  • Publications
  • Data source links
  • Additional notes
  • Description of data (including some figures)

I have already extracted general information, publications, and data source links. Shall we set up a package for this?

larnsce commented 1 year ago

Thanks, @mianzg. Yes, please set up a package. For the data sources, please establish a Zotero library in our GHE group library. We can then use the .bib file to reference each data source.

mianzg commented 1 year ago

Great! How about naming it as dryingfaecal? @larnsce

I forgot to mention that I want to exclude additional notes and description of data for the meta-data table. The feedstock and experiment procedure sections need more investigation before they can be extracted.
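The selection described above could be sketched roughly as follows, in Python for illustration only (the actual package is an R package). It assumes each table has already been parsed into a dictionary keyed by section; the key names, the `table_id` column, and the placeholder record are hypothetical and not taken from the addendum.

```python
import csv
import io

# Sections that go into the metadata table (already extracted, per the thread).
KEPT = ["general_information", "publications", "data_source_links"]
# Sections deliberately left out of the metadata table.
EXCLUDED = ["additional_notes", "description_of_data"]
# Sections that still need investigation before they can be extracted.
PENDING = ["feedstock", "experiment_procedure"]


def metadata_row(table_id, sections):
    """Build one metadata row, keeping only the KEPT sections."""
    row = {"table_id": table_id}
    for name in KEPT:
        row[name] = sections.get(name, "")
    return row


def to_csv(rows):
    """Serialise metadata rows to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["table_id"] + KEPT, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder record with made-up values.
example = {
    "general_information": "placeholder general information",
    "publications": "placeholder citation key",
    "data_source_links": "placeholder link",
    "additional_notes": "placeholder notes",
    "description_of_data": "placeholder description",
}
rows = [metadata_row("T001", example)]
csv_text = to_csv(rows)
```

Keeping the three lists explicit makes it easy to move feedstock and experiment procedure into `KEPT` once their extraction is worked out.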

larnsce commented 1 year ago

I forgot to mention that I want to exclude additional notes and description of data for the meta-data table.

Agreed. Not necessary.

Great! How about naming it as dryingfaecal?

I like it.

mianzg commented 1 year ago

The metadata table and the package setup are ready: https://openwashdata.github.io/dryingfaecal/