finnp / dat-ckan-datastore


Working with latest iteration of dat? #1

Open Analect opened 9 years ago

Analect commented 9 years ago

@finnp I stumbled across your repo and wanted to ascertain if it was still compatible with the latest dat, which is apparently close to 'beta'.

I was working through this useful tutorial on the dat repo. http://maxogden.github.io/get-dat

Is dat-ckan-datastore intended to enable the CKAN datastore to handle requests for data from dat running elsewhere, as well as potentially import data in dat form, i.e. be swapped in as the backend instead of leveldb?

If so, how is the API key information handled, if at all? Also, is the datastore able to support the blob storage functionality in dat?

Thanks.

finnp commented 9 years ago

Hi @Analect,

This project is only about importing data from a CKAN store into dat. It's quite general, so importing should work with the latest version of dat (you should try it yourself to be sure).
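For context, the import direction finnp describes comes down to paging through CKAN's documented `datastore_search` action and emitting rows in a shape dat's importer can consume (one JSON object per line). This is a minimal sketch of that idea, not dat-ckan-datastore's actual code; the base URL and resource id are placeholders.

```javascript
// Sketch: turn CKAN datastore rows into newline-delimited JSON,
// the shape dat's import historically accepted on stdin.

function buildSearchUrl(base, resourceId, limit, offset) {
  // datastore_search supports limit/offset paging
  return base + '/api/3/action/datastore_search' +
    '?resource_id=' + encodeURIComponent(resourceId) +
    '&limit=' + limit + '&offset=' + offset;
}

function toNdjson(records) {
  // CKAN returns result.records as an array of row objects;
  // emit one JSON object per line for piping into dat
  return records.map(function (r) { return JSON.stringify(r); }).join('\n');
}
```

In use you would fetch `buildSearchUrl(...)` page by page, pass `result.records` through `toNdjson`, and pipe the output into dat.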

There is an experiment for using CKAN as a dat backend: https://github.com/finnp/ckanDOWN. However, I more or less abandoned it: you could use CKAN as a backend, but the stored data was not usable from CKAN itself, which didn't seem useful to me.
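A "*DOWN" backend like ckanDOWN presumably follows the abstract-leveldown pattern: implement `_put`/`_get`/`_del`, and levelup layers the familiar key/value API on top. This is a minimal in-memory sketch of that pattern (not ckanDOWN's actual code); the comments mark where a CKAN-backed store would issue datastore API calls instead.

```javascript
// Minimal sketch of the abstract-leveldown backend pattern.
// The "backend" here is a plain object; a CKAN-backed store would make
// HTTP calls (e.g. datastore_upsert / datastore_search) in the same hooks,
// storing encoded key/value blobs -- which is why the rows end up opaque
// to CKAN's own tooling.

function MemDOWN() { this._store = {}; }

MemDOWN.prototype._put = function (key, value, cb) {
  this._store[key] = value;            // ckanDOWN: datastore_upsert here
  process.nextTick(cb);
};

MemDOWN.prototype._get = function (key, cb) {
  var value = this._store[key];        // ckanDOWN: datastore_search here
  process.nextTick(function () {
    if (value === undefined) cb(new Error('NotFound'));
    else cb(null, value);
  });
};
```

The async callbacks match leveldown's contract even though the sketch's store is synchronous underneath.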

For me it would be interesting to know: What's your specific use case?

best, Finn

Analect commented 9 years ago

@finnp Sorry for the delayed response. I've been following the dat project for a while and was watching Max Ogden's recent presentation at Berkeley here: https://www.youtube.com/watch?v=psmtJUyZHE0

In terms of my use case ... I've been experimenting with using CKAN for macroeconomic data, with an emphasis on leveraging the datastore. I generally interact directly with the datastore API using either Python or, more recently, ckanr (an R interface from rOpenSci).

The appealing part of dat for me, if I understand it correctly, is the ability to encapsulate ETL-like processing of raw data within the dat package, with a form of version control built in. Currently, any manipulation of raw data into a format/shape suitable for uploading to the datastore has to be managed by me externally. Having that encapsulated in a dat package, which could then use CKAN as a back-end both for generating and for consuming dat packages, is attractive.

In terms of your work on ckanDOWN ... you mentioned the data isn't accessible from CKAN itself. Is that because it saves the dat package as a binary object? I'm wondering whether your approach relates to other efforts to leverage the jsonb capabilities introduced in Postgres 9.4, such as this CKAN roadmap issue: https://github.com/ckan/ideas-and-roadmap/issues/143. Could something like that extension (https://github.com/joetsoi/ckanext-jsondatastore) allow you to consume dat-type structures saved within CKAN, or am I missing the point?
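The distinction behind this question can be made concrete: a Postgres jsonb column can only be indexed and filtered (e.g. `value->>'country' = 'IE'`) if the stored value is structured JSON, whereas a base64-encoded blob is just an opaque string. Both row shapes below are hypothetical illustrations, not actual ckanDOWN or dat formats.

```javascript
// Why blob-style storage stays opaque to CKAN while JSON rows would not.
// Hypothetical row shapes for illustration only.

var blobRow = {
  key: 'row-1',
  value: Buffer.from([0x08, 0x96, 0x01]).toString('base64') // opaque bytes
};

var jsonRow = {
  key: 'row-1',
  value: { country: 'IE', gdp: 529 } // structured, jsonb-queryable
};

function queryableAsJsonb(row) {
  // jsonb operators can reach into structured values;
  // an encoded blob offers no fields to query
  return typeof row.value === 'object' && row.value !== null;
}
```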

Thanks, Colum

Analect commented 9 years ago

@finnp ... just seeing if the conversation could continue?!