snowch / biginsight-examples

Example projects to help you quickly get started with BigInsights
Apache License 2.0

Create example for IBM Graph service integration #49

Open snowch opened 8 years ago

snowch commented 8 years ago

This is a placeholder task for creating an example integration with CDS Graph. Some approaches were discussed offline at a high level with @ukmadlz.

If you want to contribute to this, please add a comment.

KimStebel commented 8 years ago

I might be interested. What approaches did you talk about?

snowch commented 8 years ago

@KimStebel - as a starting point, create a Java application that can perform one-off exports of some data on the cluster (e.g. in HDFS) into Graph. It may help to use Spark, because it gives uniform access to many different data sources through a Spark DataFrame. From the DataFrame we could do something like collect(), then loop through the data and export it to Graph in a single thread.
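To make the single-threaded idea concrete, here is a minimal sketch of the export loop. It deliberately leaves Spark and the Graph service out: the `rows` list stands in for whatever the DataFrame's collect() (`collectAsList()` in the Java API) would bring back to the driver, and `GraphSink` is a hypothetical interface for whichever client ends up talking to Graph. None of these names come from an existing library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SingleThreadedGraphExport {

    /** Hypothetical stand-in for the client that writes to the graph service. */
    interface GraphSink {
        void addVertex(String label, Map<String, Object> properties);
    }

    /**
     * Walks the collected rows one at a time, in the calling thread,
     * and hands each row to the sink as one vertex.
     * Returns the number of rows handed over.
     */
    static int export(List<Map<String, Object>> rows, String label, GraphSink sink) {
        int exported = 0;
        for (Map<String, Object> row : rows) {
            sink.addVertex(label, row);
            exported++;
        }
        return exported;
    }

    public static void main(String[] args) {
        // Assumption: these rows were already collected to the driver,
        // so the data set has to fit in driver memory.
        List<Map<String, Object>> rows = new ArrayList<>();

        Map<String, Object> alice = new LinkedHashMap<>();
        alice.put("name", "alice");
        alice.put("age", 34);
        rows.add(alice);

        Map<String, Object> bob = new LinkedHashMap<>();
        bob.put("name", "bob");
        bob.put("age", 41);
        rows.add(bob);

        // A sink that only prints; a real one would call the graph service.
        GraphSink printingSink = (label, props) -> System.out.println(label + " " + props);

        int count = export(rows, "person", printingSink);
        System.out.println("exported " + count + " vertices");
    }
}
```

Keeping the sink behind an interface means the loop can be exercised without a cluster or a Graph instance, and the real client can be swapped in later.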

snowch commented 8 years ago

@KimStebel - still interested? :)

KimStebel commented 8 years ago

Importing larger datasets into Graph isn't trivial at the moment, so that would actually be a good example. I'm afraid we wouldn't be able to use the bulk import endpoint and would instead have to use Gremlin queries. I can certainly help with that, but I'm not sure how much time I'll have.
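As a rough illustration of the Gremlin route, the sketch below turns a small batch of rows into one parameterized Gremlin script plus a map of bindings, so that values are passed as parameters rather than concatenated into the query text. It uses the standard TinkerPop `addV`, `property` and `iterate` steps. How the script and bindings are actually submitted to Graph is not covered, since the request format isn't discussed in this thread. The class, method and binding names (`GremlinBatchSketch`, `addVertices`, `vertexLabel`, `k0`, `v0`, ...) are made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GremlinBatchSketch {

    /** A Gremlin script together with the named bindings it refers to. */
    static final class Query {
        final String script;
        final Map<String, Object> bindings;

        Query(String script, Map<String, Object> bindings) {
            this.script = script;
            this.bindings = bindings;
        }
    }

    /**
     * Builds one script that adds a vertex per row. Property keys and values
     * are referenced through bindings (k0/v0, k1/v1, ...) instead of being
     * written into the script text.
     */
    static Query addVertices(String label, List<Map<String, Object>> rows) {
        StringBuilder script = new StringBuilder();
        Map<String, Object> bindings = new LinkedHashMap<>();
        bindings.put("vertexLabel", label);

        int n = 0;
        for (Map<String, Object> row : rows) {
            if (script.length() > 0) {
                script.append('\n');
            }
            script.append("g.addV(vertexLabel)");
            for (Map.Entry<String, Object> property : row.entrySet()) {
                bindings.put("k" + n, property.getKey());
                bindings.put("v" + n, property.getValue());
                script.append(".property(k").append(n).append(", v").append(n).append(")");
                n++;
            }
            // iterate() forces each traversal to run, not only the last one.
            script.append(".iterate();");
        }
        return new Query(script.toString(), bindings);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();

        Map<String, Object> alice = new LinkedHashMap<>();
        alice.put("name", "alice");
        alice.put("age", 34);
        rows.add(alice);

        Map<String, Object> bob = new LinkedHashMap<>();
        bob.put("name", "bob");
        rows.add(bob);

        Query query = addVertices("person", rows);
        System.out.println(query.script);
        System.out.println(query.bindings);
    }
}
```

Sending a few rows per script reduces the number of round trips compared with one query per vertex. The right batch size would have to be found by experiment.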

snowch commented 8 years ago

@KimStebel - we don't have any specific time pressures on this. Any help you can give will be appreciated. Let me know if you need access to a cluster to develop this.

KimStebel commented 8 years ago

Yes, I would indeed need access to a cluster. Also, what about getting data out of Graph? I'm still a bit confused about whether Graph will be positioned more towards the OLAP or the OLTP side of things, but query times limited to 60 seconds suggest OLTP. So getting data out of Graph and into something that runs longer processing jobs (Spark, Hadoop) seems important, too.