IBMStreams / administration

Umbrella project for the IBMStreams organization. This project will be used for the management of the individual projects within the IBMStreams organization.

Proposal: streamsx.visualization project #59

Closed: chanskw closed this 9 years ago

chanskw commented 9 years ago

I am proposing that we create a project for data visualization.

The goal of the project is to help users visualize data from a Streams application more easily.

Initially, I would like to add the mapviewer operator from the samples repository to this project, so that the operator is more reusable across all projects.

I would also like to explore visualizing time series data using various JavaScript charting APIs, such as Chart.js.

If there are other ideas on data visualization, please let me know.

ddebrunner commented 9 years ago

FYI, I've been adding default visualization to the HTTPTupleView operator, including mapping based upon the map viewer sample.

chanskw commented 9 years ago

Do you think that's where all visualization should go, in the inet project? I was thinking that we should have a visualization project, but it could be built on top of the inet project.

ddebrunner commented 9 years ago

My goal is to have the HTTPTupleView operator provide useful out-of-the-box visualization, good enough to get an initial sense of the data.

Currently, for example, it has live tables that present the data in a useful way, e.g. an int64 reportTime attribute is assumed to be a date rather than a number.

This makes it simple: all a developer needs to do is connect a stream to the operator.

So, for example, if the stream has id, latitude and longitude attributes, it can be mapped. Optionally, if it also has marker or note attributes, then the specific marker and popup are used. All of this is derived from that useful sample.
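A minimal sketch of this kind of attribute-based view selection, written in Python for illustration only. It is not the HTTPTupleView implementation: the function names `choose_view` and `format_cell`, the rule that an int64 attribute whose name ends in "Time" holds epoch seconds, and the returned dictionary shape are all assumptions made for this example.

```python
from datetime import datetime, timezone

# Attribute names taken from the discussion above.
MAP_REQUIRED = {"id", "latitude", "longitude"}
MAP_OPTIONAL = ("marker", "note")


def choose_view(schema):
    """Pick a default view from a stream schema.

    schema maps attribute name -> type name (hypothetical representation).
    """
    names = set(schema)
    if MAP_REQUIRED <= names:
        # Optional attributes refine the map markers and popups when present.
        extras = [name for name in MAP_OPTIONAL if name in names]
        return {"view": "map", "extras": extras}
    # Fall back to a plain live table for any other schema.
    return {"view": "table", "extras": []}


def format_cell(name, type_name, value):
    """Render one table cell.

    Assumption: an int64 attribute whose name ends in "Time" is treated as
    seconds since the Unix epoch and shown as a UTC date.
    """
    if type_name == "int64" and name.lower().endswith("time"):
        return datetime.fromtimestamp(value, tz=timezone.utc).isoformat()
    return str(value)
```

With this sketch, a schema containing id, latitude, longitude and note would select the map view with `note` as an extra, while any other schema would fall back to the table view.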

It would be good to have graphs also added to the toolkit.

chanskw commented 9 years ago

Just another thought... The Inet toolkit, to me, is an adapter project. Would it make sense to create a new toolkit containing the HTTPTupleView operator, and then add all the visualization to this new project?

We may also want to move the tuple injection operators to a new toolkit?

Then we would have one toolkit for ingesting and sending data out, and another toolkit for data visualization and tuple injection.

ddebrunner commented 9 years ago

There may be a case for a visualization toolkit, maybe for more specific visualizations, but I think they might end up in the relevant toolkit, e.g. visualization for transportation data would be in streamsx.transportation.

Also, maybe if the visualization used a local copy of a JavaScript library then it would make sense for it to be in the visualization toolkit, rather than force every application using streamsx.inet to also include a number of JavaScript libraries.

So, HTTPTupleView only uses Dojo (provided in the Streams install) locally and other JavaScript libraries are used remotely, which then requires internet connectivity.

Another option for the visualization toolkit is generic composites that process a stream and then visualize it, e.g. calculating the max, min and average of values across multiple time periods and providing graphs of those. I know some users have discussed this type of approach.
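The aggregation half of such a composite could look roughly like the following Python sketch. It is only an illustration of computing min, max and average over several time windows; the class `WindowStats`, the function `summarize`, and the default window lengths are hypothetical, and the sketch assumes readings arrive in non-decreasing timestamp order.

```python
from collections import deque


class WindowStats:
    """Min/max/average over the last `window_seconds` of readings."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._items = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        self._items.append((timestamp, value))
        # Evict readings that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._items and self._items[0][0] <= cutoff:
            self._items.popleft()

    def summary(self):
        values = [value for _, value in self._items]
        if not values:
            return None
        return {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }


def summarize(readings, windows=(10, 60)):
    """Feed (timestamp, value) readings into one tracker per window length."""
    trackers = {w: WindowStats(w) for w in windows}
    for timestamp, value in readings:
        for tracker in trackers.values():
            tracker.add(timestamp, value)
    return {w: tracker.summary() for w, tracker in trackers.items()}
```

Each per-window summary could then be handed to whatever charting layer is chosen; the graphing side is deliberately left out here.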

ddebrunner commented 9 years ago

Tuple injection is an adapter; it's one way to ingest data.

mikespicer commented 9 years ago

I also tend to think of the inet toolkit as adapters/connectors. Having a toolkit dedicated to visualization would make it more obvious for users looking for visualizations. Domain-specific visualizations could also live in their own toolkits, but many are generic. +1 for a visualization toolkit.

leongor commented 9 years ago

+1

siegenth commented 9 years ago

What would a visualization toolkit be composed of? A collection of JavaScript? Some HTML pages?

I like Dan's inet rendering; it's an obvious rough draft. If you want to make it presentable, step up and pull in your favorite JavaScript. You can look at the data (JSON) and figure out how you're going to render it.

To me the word 'visualization' implies significant features; the web has set expectations high for that word.

I'd go for a set of examples showing how to interface the inet, MQTT and WebSocket adapters to HTML.

chanskw commented 9 years ago

No, it's not just a collection of JavaScript and HTML pages.

The initial contribution consists of a map viewer that allows users to view geospatial data on a map. Dan is working on incorporating this work into the Inet toolkit.

Here's a link showing what that visualization looks like: https://developer.ibm.com/streamsdev/2015/03/05/visualizing-location-data-streaming-application/

The goal of the project is to provide a toolkit to easily visualize data from a Streams application. We started with visualizing location data. I also want to explore other JavaScript charting libraries that provide nice charts for our data (e.g. http://www.chartjs.org/).

Examples are good for showing users how to do certain things, but they are hard to integrate into an application, and hard to find. My vision is a set of "sink" operators that people can just send data to and then see the results in a browser.

leongor commented 9 years ago

I once managed to update Google Maps using Node.js and Socket.IO.

Socket.IO enables real-time bidirectional event-based communication. It works on every platform, browser or device, focusing equally on reliability and speed.

It uses WebSockets as its main protocol, but falls back to other real-time protocols if WebSockets are not supported by the browser. Another nice feature: it can work in pub/sub mode, so multiple browsers (subscribers) can be connected to the same stream (publisher). So Streams was pushing data via Node.js to a browser and updating Google Maps, with no polling!
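The push-based pub/sub idea can be sketched in a few lines of Python. This is an in-process illustration of fan-out to multiple subscribers, not Socket.IO's API and not the Node.js setup described above; the class name `Publisher` and its methods are hypothetical.

```python
class Publisher:
    """Pushes each event to every subscriber, so nobody has to poll."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        # callback is any callable that accepts one event.
        self._subscribers.append(callback)

    def publish(self, event):
        # Iterate over a copy so a callback may subscribe others safely.
        for callback in list(self._subscribers):
            callback(event)


# Two stand-in "browsers" receiving the same stream of location updates.
browser_a, browser_b = [], []
stream = Publisher()
stream.subscribe(browser_a.append)
stream.subscribe(browser_b.append)
stream.publish({"id": "bus-1", "latitude": 43.65, "longitude": -79.38})
```

In a real deployment the callbacks would be network connections to browsers rather than local lists, but the flow is the same: the publisher drives delivery as data arrives.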

chanskw commented 9 years ago

That's cool! Will look into this too.

So, here's the proposal on what we will do:

Repository Name: streamsx.visualization
Toolkit Name: com.ibm.streamsx.visualization

Initial contribution includes:

I will leave this open until end of the week. Please let me know if there is any objection to this proposal. Thanks!

ddebrunner commented 9 years ago

-1 on moving HTTPTupleView. It's part of the inet toolkit and exposes streaming data using HTTP and JSON. It can be used for visualization, but it can also be used for non-visualization purposes. It's also closely tied to the other REST operators in the inet toolkit.

chanskw commented 9 years ago

Dan, so are you OK with putting the map viewer code in the visualization project, but leaving the HTTPTupleView operator in the inet toolkit? And the direction is that any additional visualization work we do, we will try to put in the visualization project.

ddebrunner commented 9 years ago

Is there any reason both can't exist? HTTPTupleView does its best to provide some default visualization for any of its connected streams, while the visualization toolkit provides more involved visualization, e.g. a composite that runs multiple aggregations on the streaming data before graphing it (e.g. live plot and trend lines).

I'm just trying to make it easy for users by having a single operator provide common simple views, without having to pre-select which operator to sink the data to.

I think I was thinking of explicit operators for visualization before I realised that HTTPTupleView could provide simple out-of-the-box visualization as well; that's why the metadata was added.

chanskw commented 9 years ago

Here are the reasons why I think we need a visualization toolkit:

1) Having visualization done in Inet is not obvious for users. It is not a place where people will normally look for code to help them with visualization. The Inet toolkit, to me, is an adapter toolkit, not something for visualization. This is also the reason why I proposed to move HTTPTupleView out. Is this operator truly an adapter? An adapter, to me, is an operator that writes to an external system, not one for viewing things.

2) I want to add fancy visualization to the project eventually. I am trying to avoid bloating the Inet toolkit with things that are not adapter-based.

The main thing is... I think it's not obvious for visualization to be in the Inet toolkit, and I want to have a more easily discoverable project for it. You can certainly argue that HTTPTupleView is a sink operator and is also for visualization. So the question is, which of its capabilities do you want to advertise more? Like you said, you wanted this operator to provide the best out-of-the-box visualization experience.

chanskw commented 9 years ago

... and to clarify... It's not that I think both cannot exist. I am thinking that, as a general direction, visualization should be done in the visualization project. And I still think the operator should be moved to the visualization project. :) This project is for general visualization support. Having one place is certainly easier than having things in multiple places. But I do agree that domain-specific visualization should be done in the domain-specific toolkit.

hildrum commented 9 years ago

+1 on the idea of a visualizations repository. IMO, HTTPTupleView and HTTPTupleInject belong together, and I think it makes sense to keep them in the Inet toolkit. One question @chanskw -- are you thinking of this as a repository of examples of visualization, toolkits for visualization, or both? I think including both makes sense, as someone wanting to visualize their data is likely going to want to look at many samples and then modify them for their scenario.
Also, once it's created, it'd be great if we could have more than the usual simple SPLDOC page on github.io. Would it be possible to have some screenshots of the visualizations?

ddebrunner commented 9 years ago

SPLDOC can include images, I think.

ddebrunner commented 9 years ago

+1 on the idea of a visualizations repository, though @siegenth does have a point: we need to avoid overselling it as a visualization solution, given the quality people now expect from visualization.

-1 on moving HTTPTupleView out of inet. It's a popular operator and I don't think we should deprecate it. While it is named HTTPTupleView, it provides HTTP REST access to streaming data, and thus can be used for non-visualization purposes. It's similar in some ways to the websocket operator: access to streaming data through a non-Streams API.

+1 on allowing HTTPTupleView to have simple (not best) out of the box visualization

chanskw commented 9 years ago

+1 on proposal. I think we are in agreement.
Thanks for all your input.

jchailloux commented 9 years ago

+1 for the visualization repository

-1 for moving HTTPTupleView. HTTP is an inet protocol, so the operator must not move out of inet. From my point of view, moving it would break the logic.

jchailloux commented 9 years ago

In the future we can consider (if possible) splitting HTTPTupleView into two parts: one in the inet toolkit, and the other in the visualization toolkit as a sample, to let others use their own web server.

ddebrunner commented 9 years ago

@jchailloux The purpose of HTTPTupleView also providing visualization is to allow customers to quickly visualize their data, even if it's being sent to (or pulled by) another system. Separating out the visualization takes that ability away. It's not trying to provide excellent visualization, just enough to easily see what the data looks like.

chanskw commented 9 years ago

Closing this for now. Do not have resources to move forward with this contribution. Will contribute when I have more time. Thanks!