openwashdata / data

The issue tracker on this repository collects ideas for data to be donated, cleaned, and published. Check out the current ideas and add your own.
https://github.com/openwashdata/data/issues

[data] wpdxdata: an R package for the Water Point Data Exchange data #10

Open larnsce opened 1 year ago

larnsce commented 1 year ago

Context

Question

larnsce commented 1 year ago

Email response:

@mianzg: Do you have enough to get going with this package?

I suggest we call it wpdx. This package has great potential to be published on CRAN and a package with that name doesn't currently exist: https://cran.r-project.org/web/packages/available_packages_by_name.html

mianzg commented 1 year ago

@larnsce I checked the resources. The set-up for this dataset would be quite different from our current practice.

The data is pretty tidy, but it is huge (~800,000 rows in the basic version), and it is stored and updated remotely. I also tried the basic functions of frictionlessdata on the enhanced data (half the size of the basic version). Again, loading the full data might be hard if the internet connection or hardware is not good enough.
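One way to keep memory use low with a file of this size is to read it in chunks and keep only the rows of interest. The following is a minimal sketch, assuming the data is available as a CSV export and that the readr package is installed. The helper name `read_wpdx_filtered` and the column name `country_name` are hypothetical and are not taken from the actual WPdx schema.

```r
library(readr)

# Hypothetical helper: stream a CSV in chunks and keep only one country's rows.
# `country_name` is an assumed column name, used here for illustration only.
read_wpdx_filtered <- function(file, country, chunk_size = 10000) {
  keep_country <- function(chunk, pos) {
    # which() drops rows where the country is missing (NA)
    chunk[which(chunk$country_name == country), , drop = FALSE]
  }
  read_csv_chunked(
    file,
    callback = DataFrameCallback$new(keep_country),
    chunk_size = chunk_size,
    # Read every column as text so no type guessing is needed per chunk
    col_types = cols(.default = col_character())
  )
}

# Small stand-in file to try the helper without downloading anything
tmp <- tempfile(fileext = ".csv")
write_csv(
  data.frame(
    row_id = 1:5,
    country_name = c("Uganda", "Kenya", "Uganda", NA, "Ghana")
  ),
  tmp
)

uganda <- read_wpdx_filtered(tmp, "Uganda", chunk_size = 2)
print(uganda)
```

Only one chunk is held in memory at a time, plus the rows that pass the filter, so the chunk size can be tuned to the available hardware.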

Here we need to write more R functions to load and interact with the data. Some initial considerations are:

@mbannert If you have experience with large data like this, let's talk!

Nevertheless, I will go ahead and create a repo to get this started.

larnsce commented 1 year ago

We have a PostgreSQL database that we could add the data to and then call based on a query from the user.
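A rough sketch of what such a query function could look like, using the DBI package. The table name `wpdx_basic`, the column name `country_name`, and both function names are assumptions for illustration; the real schema is not given in this thread. User input is passed through `sqlInterpolate()` so that it is quoted safely and not pasted into the SQL string.

```r
library(DBI)

# Hypothetical table and column names; adjust to the real database schema.
build_wpdx_query <- function(con, country, limit = 1000L) {
  sqlInterpolate(
    con,
    "SELECT * FROM wpdx_basic WHERE country_name = ?country LIMIT ?limit",
    country = country,
    limit = as.integer(limit)
  )
}

# Run the query on an open DBI connection and return a data frame.
fetch_wpdx <- function(con, country, limit = 1000L) {
  dbGetQuery(con, build_wpdx_query(con, country, limit))
}

# Inspect the generated SQL without a live database,
# using DBI's ANSI() dummy connection.
query <- build_wpdx_query(ANSI(), "Uganda", limit = 100)
print(query)

# With a real connection (assumes the RPostgres package and valid credentials):
# con <- dbConnect(RPostgres::Postgres(), dbname = "...", host = "...")
# uganda <- fetch_wpdx(con, "Uganda", limit = 100)
# dbDisconnect(con)
```

With this split, the package user only supplies a country and a row limit, and the full table never has to be downloaded to their machine.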

mianzg commented 5 months ago

Consider working on this for the openwashdata hackathon.