openwashdata / washr

An R package to make publishing openwashdata resources easier
https://openwashdata.github.io/washr/
GNU General Public License v3.0

Increase FAIRness of openwashdata by adding functions that export to different metadata schemas / enrich metadata #25

Open larnsce opened 5 days ago

larnsce commented 5 days ago

This can be part of openwashdata phase 2 WP4: Increase FAIRness: https://openwashdata.org/pages/gallery/proposal-02/#wp4-increase-fairness

The idea is to take the metadata we already have, enrich it, and then export it to other metadata schemas. One example is the dataspice R package, which helps prepare data publications.

I am thinking particularly of the write_spice() function, which writes metadata from a set of CSVs into JSON-LD:

https://docs.ropensci.org/dataspice/reference/write_spice.html

Package: https://docs.ropensci.org/dataspice/

We should review this workflow and adapt some of it to our own needs.
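To make the target format concrete, here is a minimal sketch of turning a CSV of variable metadata into a schema.org Dataset description serialised as JSON-LD. It is written in Python only for illustration and is not how dataspice implements write_spice(). The CSV column names (`variable_name`, `description`, `unit`), the sample rows, and the `build_jsonld()` helper are all assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical attribute metadata; the column names and rows are invented
# for illustration and do not follow the dataspice CSV layout.
ATTRIBUTES_CSV = """variable_name,description,unit
date,Date of the site visit,
capacity_l,Capacity of the waste skip,litres
"""


def build_jsonld(name, description, attributes_csv):
    """Return a schema.org Dataset description as a JSON-LD-ready dict."""
    variables = []
    for row in csv.DictReader(io.StringIO(attributes_csv)):
        variable = {
            "@type": "PropertyValue",
            "name": row["variable_name"],
            "description": row["description"],
        }
        # Only record a unit when the metadata table provides one.
        if row["unit"]:
            variable["unitText"] = row["unit"]
        variables.append(variable)
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "variableMeasured": variables,
    }


doc = build_jsonld(
    "wasteskipsblantyre",
    "Placeholder description for illustration.",
    ATTRIBUTES_CSV,
)
jsonld_text = json.dumps(doc, indent=2)
print(jsonld_text)
```

Keeping the variable-level metadata in a plain CSV means the same table could feed several exporters, not just the JSON-LD one.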

Another one is the Frictionless Data Table Schema: https://specs.frictionlessdata.io//table-schema/
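A Table Schema descriptor is essentially a JSON object with a `fields` list giving each column's `name` and `type`. As a rough sketch of what an export could look like, the Python snippet below infers such a descriptor from CSV text using only the standard library. The `infer_type()` and `table_schema()` helpers and the sample data are hypothetical, and the type inference is deliberately naive (integer, number, or string only); a real exporter would more likely rely on the Frictionless tooling.

```python
import csv
import io
import json


def infer_type(values):
    """Guess a Table Schema type from string values: integer, number, or string."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in non_empty:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def table_schema(csv_text):
    """Build a minimal Table Schema descriptor ({"fields": [...]}) from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fields = [
        {"name": name, "type": infer_type([row[name] for row in rows])}
        for name in (reader.fieldnames or [])
    ]
    return {"fields": fields}


# Invented sample data, for illustration only.
SAMPLE_CSV = "id,capacity,name\n1,7.5,A\n2,8,B\n"

schema = table_schema(SAMPLE_CSV)
print(json.dumps(schema, indent=2))
```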

Lastly, I think we should consider building a proper data catalogue using a data management system like CKAN: https://ckan.org/

@bonschorno @yashdubey132: we can split these up into different issues, but I would like you two to work on this.

larnsce commented 5 days ago

Here is our current metadata table, which is all created manually. We could extract more information from the existing packages (e.g. nrow / ncol) but also add additional manual fields. https://docs.google.com/spreadsheets/d/1vtw16vpvJbioDirGTQcy0Ubz01Cz7lcwFVvbxsNPSVM/edit?gid=0#gid=0
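Fields such as the number of rows and columns could be derived automatically instead of typed in by hand. In R this would simply be nrow() and ncol() on the packaged data frame; the sketch below shows the same idea in Python against a CSV export. The `table_dimensions()` helper and the sample data are hypothetical.

```python
import csv
import io


def table_dimensions(csv_text):
    """Count data rows (excluding the header) and columns in CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Skip completely empty lines so trailing newlines are not counted.
    nrow = sum(1 for row in reader if row)
    return {"nrow": nrow, "ncol": len(header)}


# Invented sample data, for illustration only.
SAMPLE = "id,value\n1,a\n2,b\n3,c\n"

dims = table_dimensions(SAMPLE)
print(dims)
```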

larnsce commented 4 days ago

I am adding information here that comes from Asana and concerns the feasibility of also sharing our data as Python packages.


- understand the need for extending GHE functionality to Python
- discuss the necessity of building Python packages

Some resources:

- Instruction:
  - https://towardsdatascience.com/step-by-step-guide-to-creating-r-and-python-libraries-e81bbea87911
  - https://docs.python-guide.org/writing/structure/
- Opinion:
  - https://www.ethanrosenthal.com/2022/02/01/everything-gets-a-package/
  - https://packaging.python.org/en/latest/tutorials/packaging-projects/
- Example:
  - http://www.data8.org/zero-to-data-8/datascience.html
  - https://mintcanary.com/frictionlessdata/tools/

Take-away from 24/01 discussion:

- Nic has a proof-of-concept Python package for wasteskipsblantyre
- The current design is inside an owd R data package
  - pro: less redundant
  - con: can be confusing for most users and complica


Here is also the feedback from @n-raspi: https://docs.google.com/document/d/1DkzGRQTGXS0IuT_hFEFSUSLJIYczr5LXng10ELTwp78/edit#heading=h.6gmwdlbbkrxh