WASHNote / WASHWeb-2019

Exploration of how to link WASH enabling environment data and more. Moved to: https://git.washnote.org/WASHWeb/WASHWeb-2019
https://washweb.org

Data stores, import pipelines and pre-baked front-ends #1

Closed decentral1se closed 5 years ago

decentral1se commented 5 years ago

ACTION: @lwm to investigate data stores, import pipelines and pre-baked front-ends

  • The data store should do the heavy lifting for us (wrt. relationships)
  • To move fast, we just want some existing front-end for exploring data
  • Overall, let's reuse what is there where possible. Simplicity wins
decentral1se commented 5 years ago

Graph Databases

I took a recommendation to watch "Graph Databases Will Change Your Freakin' Life".

The main takeaway point here is:

If you're asking questions about relationships, they are really nice and really powerful and sometimes seem like witchcraft

So, the question is: what relationships, direct or indirect, do we have an intuition for? Or, following the style of the video, what questions can we ask that involve those relationships? If we have some of those, then we can go further. This is what https://github.com/nickdickinson/WASHWeb/issues/2 is about.
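To make "direct or indirect" a bit more concrete, here is a minimal sketch in plain Python (no graph database involved) of the kind of question a graph store answers natively: "how is A connected to B, and through what?". The node names and relationship types below are made up for illustration and are not real WASH data.

```python
from collections import deque

# Hypothetical edges: (source, relationship, target). Illustration only.
EDGES = [
    ("Ministry of Water", "FUNDS", "District Office A"),
    ("District Office A", "OPERATES", "Water Point 1"),
    ("NGO X", "SUPPORTS", "District Office A"),
]


def build_adjacency(edges):
    """Map each source node to its outgoing (relationship, target) pairs."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency


def find_path(edges, start, goal):
    """Breadth-first search along edge direction.

    Returns the list of (source, relationship, target) hops from start
    to goal, or None if goal cannot be reached from start.
    """
    adjacency = build_adjacency(edges)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, target in adjacency.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None


if __name__ == "__main__":
    for hop in find_path(EDGES, "Ministry of Water", "Water Point 1") or []:
        print(hop)
```

A graph database would replace the hand-written search with a single path query, which is the "witchcraft" part. The sketch is only meant to help us phrase candidate questions.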

https://neo4j.com/ still appears to be the most widely used graph database.

Import Pipelines

Neo4j has wide Python support, as I can see from https://neo4j.com/developer/python/. The libraries listed there seem to provide a nice programming interface that lets us avoid writing Cypher queries directly. So, where we need to run import pipelines into the database, we are covered. There also seems to be a direct spreadsheet -> graph database importer, which looks interesting.
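As a rough idea of what a small import pipeline could look like, here is a sketch that turns spreadsheet rows (exported as CSV) into parameterised statements and sends them through the official `neo4j` Python driver. Note that, unlike the object-mapping libraries mentioned above, this version does write a little Cypher by hand. The column names, node labels (`Organisation`, `Country`), the `WORKS_IN` relationship and the sample rows are all assumptions for illustration, not an agreed data model.

```python
import csv
import io

# Hypothetical spreadsheet export; column names are assumptions.
SAMPLE_CSV = """organisation,country
Example Org A,Kenya
Example Org B,Kenya
"""

# MERGE creates a node or relationship only if it does not already
# exist, so re-running the import should not create duplicates.
MERGE_QUERY = (
    "MERGE (o:Organisation {name: $organisation}) "
    "MERGE (c:Country {name: $country}) "
    "MERGE (o)-[:WORKS_IN]->(c)"
)


def rows_to_statements(csv_text):
    """Turn CSV text into a list of (query, parameters) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    statements = []
    for row in reader:
        params = {
            "organisation": row["organisation"].strip(),
            "country": row["country"].strip(),
        }
        statements.append((MERGE_QUERY, params))
    return statements


def load_into_neo4j(statements, uri, user, password):
    """Run each statement against a Neo4j instance (needs a running server)."""
    # Imported lazily so the parsing step can be tried without the driver.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            for query, params in statements:
                session.run(query, params).consume()
    finally:
        driver.close()


if __name__ == "__main__":
    for query, params in rows_to_statements(SAMPLE_CSV):
        print(params)
```

Keeping the row parsing separate from the database call means the mapping from spreadsheet columns to graph nodes can be checked without a running Neo4j instance.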

Front-ends

Neo4j provides a visualiser for queries straight out of the box.

That's probably enough just to see what we can get out of what we put in!

Here's a preview image:

image

nickdickinson commented 5 years ago

Ok, I'm sold on Neo4j as it is so standard. However, I don't think all our data will necessarily be graph data. We'll probably need to be able to support different types of data sets. In particular, I don't think the average person wants to see the graph output directly, but will rather want some easy way of exploring without using a query language.

This is an interesting way of exposing the results of Cypher queries against Neo4j to SQL tools for analytic purposes, and it may support some use cases: https://www.bsimard.com/Connecting-Neo4j-to-any-SQL-tool-with-the-power-of-Postgresql

I also liked the idea of AgensGraph because it supports both SQL and Cypher, so you can mix ways of working. It could also be interesting to look at, and it has some kind of browser too.

Let's keep these in mind as we refine our ideas about the data models, workflows and interface.

decentral1se commented 5 years ago

OK, closing this one out until we gather more steam again!

(cleaning out my issue tracking list ...)