htm-community / nupic.visualizations

Web application for interactive graphs, anomaly highlighting and online monitoring.
MIT License

Streaming CSV processing/plotting #27

Closed breznak closed 8 years ago

breznak commented 8 years ago

Fixes #26

jefffohl commented 8 years ago

@breznak - I ran a test of this, and it did not seem to work. When I tried to load a file, nothing happened, and a lot of output was logged to the console.

breznak commented 8 years ago

I'm not sure what changed (broken merge, the newer libs) but I'm getting this error too. Will try to investigate.

breznak commented 8 years ago

This needs some more love, but I got it working.

When ready, this could be a good way to address #17.

breznak commented 8 years ago

@jefffohl @rhyolight can you take a look at c2c14a7 please? I want to load a small(er) batch, render it quickly, and repeat, but I don't know exactly how to redraw the graph. Also, does Dygraphs support something like updates, so that I keep what is already rendered (just shrink it) and append a few new data points?

jefffohl commented 8 years ago

@breznak You can update the graph data by calling the updateOptions function. You can see this being called here: https://github.com/nupic-community/nupic.visualizations/blob/master/client/src/app/appCtrl.js#L192

I can start working on this if you would like - but I don't want to work in parallel with you - so let me know.

jefffohl commented 8 years ago

FYI - here is the Dygraphs API reference: http://dygraphs.com/jsdoc/symbols/Dygraph.html

breznak commented 8 years ago

> I can start working on this if you would like - but I don't want to work in parallel with you - so let me know.

That would be great @jefffohl! I'll take a break and sleep now :) Dygraph.updateOptions looks like what we need, called from Papa.step() with the small batch payload. It would be awesome if we could get the "streaming data" done! For now I think we can assume NuPIC runs slower than the data appears, so we don't have to bother with polling and sleeps.
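A minimal sketch of the batching idea discussed above, not the code from this PR. It assumes only that the graph object exposes Dygraphs' `updateOptions` and that passing a new `file` option makes it redraw. The names `createBatchAppender` and `batchSize` are made up for illustration.

```javascript
// Sketch: collect parsed rows and push them to the graph in small batches.
// `createBatchAppender` and `batchSize` are hypothetical names; they are not
// part of nupic.visualizations, Dygraphs, or Papa Parse.
function createBatchAppender(graph, batchSize) {
  const data = [];   // every row handed to the graph so far
  let pending = 0;   // rows added since the last redraw

  function redraw() {
    // Dygraphs re-reads its data when the `file` option is updated.
    graph.updateOptions({ file: data });
    pending = 0;
  }

  return {
    // Call once per parsed row, e.g. from a Papa Parse `step` callback.
    addRow(row) {
      data.push(row);
      pending += 1;
      if (pending >= batchSize) {
        redraw();
      }
    },
    // Call when parsing completes to draw any leftover rows.
    flush() {
      if (pending > 0) {
        redraw();
      }
    },
    // Number of rows collected so far.
    size() {
      return data.length;
    },
  };
}
```

With the real libraries this would presumably be wired up by calling `addRow` from the `step` callback passed to `Papa.parse` and `flush` from its `complete` callback. The exact shape of the `step` payload depends on the Papa Parse version, so check it before relying on this.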

jefffohl commented 8 years ago

@breznak - I started on this today, but didn't get too far yet. I will work on it again tomorrow, unless you are eager to jump back in. If so - let me know.

jefffohl commented 8 years ago

@breznak - due to requests from @rhyolight I am prioritizing #40. I still plan on working on this though. Let me know your thoughts.

jefffohl commented 8 years ago

@breznak PR #54 could possibly replace this. Thoughts?

breznak commented 8 years ago

@jefffohl thanks for #54 (you made a PR to merge into this PR, not master, which confused me... sorry). Anyway, this is your work from #54. I've added a new example file examples/CSV/huge_2M.csv and it uncovered a bug in the current implementation:

jefffohl commented 8 years ago

@breznak - ok, taking a look.

jefffohl commented 8 years ago

@breznak - this is happening only for files without a timestamp. Working on a fix.

One other thing I noticed with very large files is that there is a limit as to what DyGraphs can handle. My system locked up trying to render the 60MB file. For files this large, it seems we will need to figure out a way to enforce windowing or something.

breznak commented 8 years ago

> this is happening only for files without a timestamp. Working on a fix.

Thanks @jefffohl! I've noticed another thing: when the timestamp is not monotonic, the graph is rendered over the existing values. I think that is OK behavior, as we expect time to be monotonically increasing only, but this also happens when you append 2 files. Maybe we could raise a warning, "Your timestamp is not monotonic."?
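One way such a check could look, as a rough sketch rather than anything implemented in this PR. It assumes each row is an array with the timestamp (a Date or a number) in the first column, and it treats equal consecutive timestamps as acceptable. Both function names are hypothetical.

```javascript
// Hypothetical helper: find the first row whose timestamp goes backwards.
// Assumes each row is an array with the timestamp (Date or number) at index 0.
function findNonMonotonicIndex(rows) {
  for (let i = 1; i < rows.length; i += 1) {
    // `<` compares Dates and numbers numerically (Dates via valueOf()).
    if (rows[i][0] < rows[i - 1][0]) {
      return i;
    }
  }
  return -1; // timestamps never decrease
}

// Returns a warning string, or null when the timestamps are in order.
function monotonicWarning(rows) {
  const index = findNonMonotonicIndex(rows);
  return index === -1
    ? null
    : "Your timestamp is not monotonic (first decrease at row " + index + ").";
}
```

Running this over the combined rows after appending a second file would catch the case described above, where the second file starts earlier than the first one ends.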

> My system locked up trying to render the 60MB file. For files this large, it seems we will need to figure out a way to enforce windowing or something.

  • Maybe one approach could be #43.
  • We already have such functionality in appConfig.SLIDING_WINDOW, so we could set a variable like MAX_POINTS and, if the actual number of points exceeds it, change the sliding window and switch from "append" to "slide" mode (and also trigger a warning on this occasion?).
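The "append" versus "slide" switch suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not how appConfig.SLIDING_WINDOW works in the application. `MAX_POINTS` is the variable name proposed in the comment, and `appendOrSlide` is a hypothetical helper.

```javascript
// Sketch only: MAX_POINTS and appendOrSlide are hypothetical names.
const MAX_POINTS = 5; // tiny value for illustration; a real cap would be much larger

// Adds new rows to `data` in place. Returns "append" while the total stays
// within maxPoints, or "slide" once old rows had to be dropped to respect it.
function appendOrSlide(data, newRows, maxPoints) {
  for (const row of newRows) {
    data.push(row);
  }
  const overflow = data.length - maxPoints;
  if (overflow > 0) {
    data.splice(0, overflow); // discard the oldest rows
    return "slide";
  }
  return "append";
}
```

The returned mode gives the caller a natural place to raise the warning mentioned above the first time it changes from "append" to "slide".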

jefffohl commented 8 years ago

I made a new PR to the speedup_csv_parse branch from my version of that branch, which fixes the problem of series without timestamps overwriting the iteration value.

jefffohl commented 8 years ago

@breznak - I am going to merge this into master, if that is OK with you. For your recent comments regarding enforced windowing and monotonic alerts, let's make some new issues for those.

breznak commented 8 years ago

Thank you Jeff! merged the branch and I'll make the issues.