netsensei closed this issue 8 years ago
Hey @netsensei! Great to hear that you've heard of GraphQL. I looked into it as soon as the spec was published on the web and had the same ideas you so eloquently wrote down. I might return to the spec to see how it has evolved. I find this feature (not necessarily GraphQL itself) a very interesting and logical development for the Datatank.
Why?
The Datatank has always been centered around user needs. At first we built it because CKAN was lacking a data re-use focus (and still sorta is), so we built a DCAT-compliant data publishing tool, which is now the framework/app called tdt/core! Aside from that, we also started developing tdt/input, an ETL package that allows you to load data from data sources (any supported tdt/core data source) into a NoSQL store (MongoDB atm).
Why?
Because once governments started using our platform, they were quite happy with it as it was. Now, however, people and businesses actually want to start using the data. The ability to use datasets that need processing before anything can be done with them (even if it's a small query, sort, count, ...) is the number one request from all of our stakeholders. This leans towards open services (cc @pietercolpaert) and brings some small challenges (rate limiting, for example), but in my view it is a necessary next step.
Will this be in the 6.0 release?
Sorta. The MongoDB datasource has been made, and I'm currently looking into which operations we might support so that a datatank can't be taken down by firing complex queries at it. In 2016, however, I'm really willing to develop this in order to make open data more re-usable in a flexible way. If this can be done using GraphQL or any other popular (preferably open) standard, all the better. :)
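To make the idea of restricting operations concrete, here is a minimal sketch of a whitelist check combined with a cap on result size. Everything in it is an assumption for illustration: the operation names (taken from the query/sort/count examples mentioned in this thread), the `MAX_LIMIT` value and the request shape are hypothetical and not part of tdt/core or tdt/input.

```python
# Hypothetical guard: names and limits are illustrative, not Datatank APIs.
ALLOWED_OPERATIONS = {"filter", "sort", "count"}
MAX_LIMIT = 100  # assumed upper bound on the number of rows returned


def validate_request(request):
    """Reject unsupported operations and clamp the requested result size."""
    operations = list(request.get("operations", []))
    for operation in operations:
        if operation not in ALLOWED_OPERATIONS:
            raise ValueError(f"operation '{operation}' is not supported")
    # Never return more rows than MAX_LIMIT, whatever the client asks for.
    limit = min(int(request.get("limit", MAX_LIMIT)), MAX_LIMIT)
    return {"operations": operations, "limit": limit}


safe = validate_request({"operations": ["sort"], "limit": 5000})
```

Here `safe` keeps the `sort` operation but has its limit clamped to `MAX_LIMIT`, while a request containing something like an arbitrary aggregation would be refused before it reaches the data store.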
Small question: what makes a standard open? Is it once it's being collaborated upon by different stakeholders? Link for reference.
As I've understood from all of this, we're trying to reinvent 2 things:
Indeed, it can be seen as a form of SPECTQL, but more standardized and freer in structure. As I remember it, you can see it as a form of construct, but on raw data. This is actually a "super" nice-to-have, meaning that it exceeds our scope. I think it's useful to have a discussion about making data useful through the datatank, but for now I will close this issue.
Do note that with the inclusion of tdt/input we now also provide a way to build services from your data. The aim is not to push services per se, but to allow larger datasets to be published in a usable way as well.
:+1:
Feature request proposal: The Datatank should support GraphQL.
What is GraphQL?
It's a declarative query language that allows applications to query endpoints using graph-based queries that return JSON-formatted responses. It tries to step over the limitations that come with a REST API (chatty nature, scalability, etc.). It does this by introducing "app schemas" on the server side which describe the data model of the responses, thus limiting the extent of what can be queried. The premise is that you create schemas from a product owner/creator view: a GraphQL-based endpoint is thus geared towards app-specific implementations.
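As a rough illustration of how a server-side schema limits what can be queried, here is a toy sketch. It is not a real GraphQL implementation: the query is written as a nested dictionary instead of being parsed from GraphQL syntax, and the schema, field names and data are all hypothetical.

```python
# Toy illustration only, not a real GraphQL implementation.
# The schema, field names and data below are hypothetical.

# Server-side "app schema": the only fields clients may ask for, per type.
SCHEMA = {
    "dataset": {"title", "license", "publisher"},
    "publisher": {"name", "homepage"},
}

DATA = {
    "title": "Bicycle parking",
    "license": "CC0",
    "internal_notes": "not exposed",
    "publisher": {"name": "Example City", "homepage": "https://example.org"},
}


def select(type_name, record, selection):
    """Return only the requested fields, rejecting anything outside the schema."""
    result = {}
    for field, sub_selection in selection.items():
        if field not in SCHEMA[type_name]:
            raise ValueError(f"field '{field}' is not in the schema for '{type_name}'")
        value = record[field]
        # A nested selection means the field is an object; in this toy the
        # field name doubles as the name of its type in SCHEMA.
        result[field] = select(field, value, sub_selection) if sub_selection else value
    return result


# Roughly the shape of the GraphQL query: { title publisher { name } }
query = {"title": None, "publisher": {"name": None}}
response = select("dataset", DATA, query)
```

The client gets back exactly the fields it asked for, and a request for `internal_notes` fails because that field is not part of the schema.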
More info: https://facebook.github.io/react/blog/2015/05/01/graphql-introduction.html
GraphQL is developed by Facebook within the context of the React.js and Relay frameworks.
How could the datatank benefit?
The datatank allows a data owner to easily set up an API based on whatever data is inputted. Given various sets of associated data (e.g. in CSV format), you can easily set up a REST endpoint which exposes the data through several HTTP calls.
GraphQL could be added so that instead of creating several HTTP endpoints, you only have to create one endpoint, which can respond to a GraphQL query with a response composed of a combined dataset.
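A minimal sketch of the "one endpoint, combined response" idea, assuming two small associated CSV datasets. The datasets, column names and the `handle_query` function are hypothetical; a real setup would parse an actual GraphQL query rather than take a station id directly.

```python
import csv
import io

# Two hypothetical, associated CSV datasets.
STATIONS_CSV = "id,name\n1,North\n2,South\n"
READINGS_CSV = "station_id,value\n1,10\n1,12\n2,7\n"


def load(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def handle_query(station_id):
    """Single entry point: one call returns a station together with its readings."""
    stations = load(STATIONS_CSV)
    readings = load(READINGS_CSV)
    station = next(s for s in stations if s["id"] == station_id)
    return {
        "name": station["name"],
        "readings": [int(r["value"]) for r in readings if r["station_id"] == station_id],
    }


combined = handle_query("1")
```

With separate REST resources, a client would typically fetch the station and its readings in two calls and join them itself; here the join happens behind a single entry point.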
How does this differ from linked data?
The main difference is that GraphQL is app centric while SPARQL is a very generalist approach to querying data. The very nature of LOD is that it is not confined within a single repository. GraphQL is not federated.
So, why would this be a good idea?
The datatank can also be used to set up an endpoint for specific app(s) and use cases, apart from open publishing of datasets on the web for reuse by 3rd parties.
What are the risks?
GraphQL is still a very young technology and it is not an open standard. Given the current popularity of React.js in the JavaScript community and the global trend of moving to decoupled apps (rendering the DOM in the client, thus adding application-like behaviour to websites), GraphQL could conceivably gain traction in the near future. However, it is currently uncertain whether it has a chance of becoming an established technology. It really depends on the rate of adoption.
Drupal 8 is heavily geared towards a REST-based / decoupled approach to publishing data on the web. There are already experimental modules for D8 that bring GraphQL to the framework.