Open larnsce opened 5 days ago
Here is our current metadata table, which is all created manually. We could extract more information from the existing packages (e.g. nrow / ncol) but also add additional manual fields. https://docs.google.com/spreadsheets/d/1vtw16vpvJbioDirGTQcy0Ubz01Cz7lcwFVvbxsNPSVM/edit?gid=0#gid=0
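To make the nrow / ncol idea concrete, here is a minimal Python sketch that derives both numbers from CSV content instead of typing them in by hand. The function name `table_dimensions` and the sample data are made up for illustration; our packages may need a different entry point (e.g. reading the CSV files shipped with each package).

```python
import csv
import io


def table_dimensions(csv_text: str) -> dict:
    """Count data rows and columns of a CSV that has a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Skip completely blank lines so they are not counted as observations.
    nrow = sum(1 for row in reader if row)
    return {"nrow": nrow, "ncol": len(header)}


# Hypothetical sample data, not taken from any of our packages.
sample = "id,site,weight_kg\n1,A,12.5\n2,B,7.0\n"
dims = table_dimensions(sample)
print(dims)
```

The same numbers could then be written into the metadata table automatically, leaving only the descriptive fields to be filled in manually.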
I am adding information here that comes from Asana and is about the feasibility of sharing our data also as packages for Python.
- Understand the need for extending GHE functionality to Python
- Discuss the necessity of building Python packages
Some resources:
- Instruction:
  - https://towardsdatascience.com/step-by-step-guide-to-creating-r-and-python-libraries-e81bbea87911
  - https://docs.python-guide.org/writing/structure/
- Opinion:
  - https://www.ethanrosenthal.com/2022/02/01/everything-gets-a-package/
  - https://packaging.python.org/en/latest/tutorials/packaging-projects/
- Example:
  - http://www.data8.org/zero-to-data-8/datascience.html
  - https://mintcanary.com/frictionlessdata/tools/
Take-away from 24/01 discussion:
- Nic has a proof-of-concept Python pkg on wasteskipsblantyre
- Current design is inside an owd R data package
  - pro: less redundant
  - con: can be confusing for most users and complica
Here also the feedback from @n-raspi: https://docs.google.com/document/d/1DkzGRQTGXS0IuT_hFEFSUSLJIYczr5LXng10ELTwp78/edit#heading=h.6gmwdlbbkrxh
This can be part of openwashdata phase 2 WP4: Increase FAIRness: https://openwashdata.org/pages/gallery/proposal-02/#wp4-increase-fairness
The idea is to use the existing metadata we have, enrich it and then export to other metadata schemas. An example comes from the `dataspice` R package to prepare data publications. I am thinking particularly of the `write_spice()` function, which writes metadata from a set of CSVs into JSON-LD: https://docs.ropensci.org/dataspice/reference/write_spice.html
Package: https://docs.ropensci.org/dataspice/
We should review this workflow and adapt some of it to our own needs.
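As a starting point for that review, here is a rough Python sketch of the general idea of turning tabular metadata into JSON-LD. It is not how `write_spice()` is implemented and does not reproduce its output; the function `to_json_ld`, the column names (`variableName`, `description`, `unitText`) and the sample values are assumptions chosen for illustration. The only fixed parts are the schema.org terms `Dataset`, `variableMeasured` and `PropertyValue`.

```python
import json


def to_json_ld(title: str, description: str, attributes: list) -> dict:
    """Build a schema.org Dataset description from per-variable metadata rows."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        "description": description,
        "variableMeasured": [
            {
                "@type": "PropertyValue",
                "name": row["variableName"],
                "description": row["description"],
                "unitText": row["unitText"],
            }
            for row in attributes
        ],
    }


# Hypothetical rows, as they might be read from a metadata CSV.
attributes = [
    {"variableName": "site", "description": "Sampling site", "unitText": ""},
    {"variableName": "weight_kg", "description": "Waste weight", "unitText": "kg"},
]
doc = to_json_ld("Example dataset", "Illustrative metadata only", attributes)
print(json.dumps(doc, indent=2))
```

Keeping the metadata in one plain structure like this would let us export the same source to several schemas later.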
Another one is the Frictionless Data Table Schema: https://specs.frictionlessdata.io//table-schema/
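For a feel of what a Table Schema descriptor looks like, here is a small sketch that guesses field types from CSV content and emits a `fields` list using the `integer`, `number` and `string` types from the spec. It uses only the Python standard library rather than the Frictionless tooling, the type guessing is deliberately naive, and `infer_table_schema` and the sample data are hypothetical.

```python
import csv
import io
import json


def _all_parse(values, cast) -> bool:
    """True if every value can be converted with the given cast."""
    try:
        for value in values:
            cast(value)
        return True
    except ValueError:
        return False


def infer_type(values) -> str:
    """Guess a Table Schema type for one column, ignoring empty cells."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    if _all_parse(non_empty, int):
        return "integer"
    if _all_parse(non_empty, float):
        return "number"
    return "string"


def infer_table_schema(csv_text: str) -> dict:
    """Return a minimal descriptor with one entry per CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    names = reader.fieldnames or []
    return {
        "fields": [
            {"name": name, "type": infer_type([row[name] or "" for row in rows])}
            for name in names
        ]
    }


# Hypothetical sample data.
sample = "id,site,weight_kg\n1,A,12.5\n2,B,7.0\n"
schema = infer_table_schema(sample)
print(json.dumps(schema, indent=2))
```

In practice we would still want to review the inferred types by hand and add descriptions and constraints from our existing dictionaries.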
Lastly, I think we should consider building a proper data catalogue using a data management system like CKAN: https://ckan.org/
@bonschorno @yashdubey132: we can split these up into different issues, but I would like you two to work on this.