rufuspollock-okfn / data.okfn.org-new

Simple data package based data portal (and original site for frictionless data effort)
http://data.okfn.org/

Datasets Data URLs and API generally #6

Open rufuspollock opened 8 years ago

rufuspollock commented 8 years ago

From @rgrp on February 24, 2013 18:26

This issue is about the URL / API structure for accessing data (and metadata) from the data packages.

Current Situation

For /data/ and /community/ data packages:

/.../{dataset}/datapackage.json     # the datapackage.json file

## data urls
/.../{dataset}/r/{resource-name-or-index}.{format}

so e.g.

/.../gdp/r/annual.csv   # resource name
/.../gdp/r/0.csv           # resource by index
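
A rough sketch of how a client might build these resource URLs. The helper name `resource_url` is made up for illustration, and the base `http://data.okfn.org/data` is assumed from the example URL later in this thread:

```python
from urllib.parse import quote


def resource_url(base, dataset, resource, fmt):
    """Build {base}/{dataset}/r/{resource}.{fmt}.

    `resource` may be a resource name (str) or a resource index (int).
    This helper is hypothetical and not part of the site's code.
    """
    return f"{base.rstrip('/')}/{quote(dataset)}/r/{quote(str(resource))}.{fmt}"


BASE = "http://data.okfn.org/data"  # assumed base for "core" datasets

print(resource_url(BASE, "gdp", "annual", "csv"))  # resource by name
print(resource_url(BASE, "gdp", 0, "csv"))         # resource by index
```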

Formats that we should support would be:

Longer-term we could support addressing individual elements, e.g. addressing rows or cells in a dataset:

.../gdp/r/annual/5/        # row 5 of this dataset, rendered as HTML by default
.../gdp/r/annual/5.csv  # in CSV format
.../gdp/r/annual/5/year/  # cell in row 5, field year (in HTML form by default)

.../{dataset}/r/{resource-name-or-index}/{row-index-or-primary-key}[.html | .csv | .json]
.../{dataset}/r/{resource-name-or-index}/{row-index}/{field-name-or-index}[.html | .csv | .json]
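
To make the row / cell patterns above concrete, here is a minimal sketch of a parser for the part of the path after the site prefix. The function name `parse_data_path` is hypothetical. It assumes HTML is the default when no extension is given (the proposal only states this for rows and cells) and that resource names, row keys and field names do not themselves end in `.html`, `.csv` or `.json`:

```python
FORMATS = {"html", "csv", "json"}


def parse_data_path(path):
    """Split '{dataset}/r/{resource}[/{row}[/{field}]][.format]' into parts."""
    parts = [p for p in path.split("/") if p]  # tolerate leading/trailing slashes
    if not 3 <= len(parts) <= 5 or parts[1] != "r":
        raise ValueError(f"not a data path: {path!r}")

    fmt = "html"  # assumed default when no extension is present
    stem, dot, ext = parts[-1].rpartition(".")
    if dot and stem and ext in FORMATS:
        parts[-1], fmt = stem, ext  # strip the format extension from the last segment

    dataset, _, resource, *rest = parts
    return {
        "dataset": dataset,
        "resource": resource,                       # name or index, kept as a string
        "row": rest[0] if rest else None,           # row index or primary key
        "field": rest[1] if len(rest) > 1 else None,
        "format": fmt,
    }


print(parse_data_path("gdp/r/annual/5/"))
print(parse_data_path("gdp/r/annual/5.csv"))
print(parse_data_path("gdp/r/annual/5/year/"))
```

Whether a row segment is an index or a primary key is left to the caller here, since the proposal allows both in the same position.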

Questions:

We follow something similar to the other case, but instead of putting the data package name in the URL we move the data package URL to the query string:

/api/datapackage.json?url={datapackage-url}
/api/data/{resource-name-or-index}.{format}?url={datapackage-url}

# e.g. this returns first resource as CSV
/api/data/0.csv?url=https://raw.github.com/datasets/browser-stats/master/datapackage.json
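
A small sketch of how a client could construct such a request, using the `url` query parameter from the example above. The helper `api_data_url` and the API base `http://data.okfn.org/api` are assumptions for illustration. Percent-encoding the data package URL is not required by the proposal, but it avoids ambiguity if that URL contains its own query string:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def api_data_url(datapackage_url, resource=0, fmt="csv",
                 api_base="http://data.okfn.org/api"):  # assumed API base
    """Build /api/data/{resource}.{fmt}?url={datapackage-url} (hypothetical helper)."""
    query = urlencode({"url": datapackage_url})  # percent-encodes ':' and '/'
    return f"{api_base.rstrip('/')}/data/{resource}.{fmt}?{query}"


dp = "https://raw.github.com/datasets/browser-stats/master/datapackage.json"
request_url = api_data_url(dp)
print(request_url)

# The server side can recover the original data package URL from the query string.
recovered = parse_qs(urlsplit(request_url).query)["url"][0]
print(recovered == dp)
```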

Discussion


Appendix

Alternatives

Alternatively could be:

{dataset}/{filename}.csv
{dataset}/{filename}.json (CORS enabled ...)

Or

{dataset}/data.csv

Think the former is better ...

Copied from original issue: frictionlessdata/ideas#19

rufuspollock commented 8 years ago

@mihi-tr wrote in #83:

I do think we'll need to think along the lines of having CORS enabled access for the datasets. Based on the dataset.json format (which allows relative urls) the api should look like

.../dataset/datapackage.json 

and then have

.../dataset/path-to-data/filename

for the data files - this way it doesn't matter which package url I got pointed at

Alternatively: modify datapackage.json - this is very ugly IMO
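
For illustration, the relative-URL idea above can be sketched with standard URL resolution: data file paths are resolved against wherever the datapackage.json was fetched from. The descriptor below is made up, and the `path` key on each resource is an assumption about the descriptor format:

```python
from urllib.parse import urljoin


def resolve_resources(datapackage_url, datapackage):
    """Resolve each resource's (possibly relative) path against the descriptor URL."""
    return [
        urljoin(datapackage_url, resource["path"])
        for resource in datapackage.get("resources", [])
        if "path" in resource
    ]


# Hypothetical descriptor; only the relative path matters for this sketch.
descriptor = {"resources": [{"path": "data/constituents.csv"}]}

for url in resolve_resources(
    "http://data.okfn.org/data/s-and-p-500-companies/datapackage.json", descriptor
):
    print(url)
```

Because resolution is relative to the descriptor's own URL, the same descriptor works unchanged whichever host or mirror it is served from.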

rufuspollock commented 8 years ago

@mihi-tr I don't know if you saw the extensive refactor of this proposal about a month ago. Please look at proposal above. As part of #73 I actually implemented most of the proposal at least for "core" datasets.

Please let me know if it addresses your proxy need, and if there could be an even better API (I guess your biggest concern is datasets which are not in core, but note I propose a way to handle these, though it is not yet implemented).

rufuspollock commented 8 years ago

From @mihi-tr on November 30, 2013 19:22

It would (I think). Would need to test this in a practical environment.

rufuspollock commented 8 years ago

@mihi-tr here's an example of current api style: http://data.okfn.org/data/s-and-p-500-companies/r/constituents.csv

rufuspollock commented 8 years ago

Based on convo with @mihi-tr today downgrading priority to one star:

@mihi-tr it would still be nice to know what exactly should be in that tool ...

rufuspollock commented 8 years ago

Have updated the proposal to flesh out the case for general online data packages, which I now think is the priority (given that we plan not to do much cataloging in this site and in this app).

rufuspollock commented 8 years ago

@mihi-tr could you look at the proposal in the main part of the issue about data packages online and let me know if this solves your requirements?