htm-community / nupic.visualizations

Web application for interactive graphs, anomaly highlighting and online monitoring.
MIT License

I want to be able to visualize (streaming) data online, as the (NuPIC) model is running #17

Open breznak opened 8 years ago

breznak commented 8 years ago

We've discussed this in the initial issue, there seem to be 2 approaches:

I find the latter better for 2 reasons: first, it does not tie us to NuPIC only, since it works for any CSV file that is being updated; second, it would not require (complex) changes to the NuPIC ModelRunner framework.

UI changes to enable this could be:

Blocked by: #16, #61

rhyolight commented 8 years ago

:+1: :100:

jefffohl commented 8 years ago

@rhyolight looking forward to getting back to this project soon, as soon as I finish up some other responsibilities.

breznak commented 8 years ago

So, with the 2 standing PRs, the speed bottleneck should be somewhat resolved and internal support for streaming is in place.

What is left is a mechanism to monitor updates to the data file (e.g. periodically check its size) and to update (only) with the newly added chunk of data, ideally triggered on request or on update rather than by polling. I think the OSs do this well. I've checked with upstream and it's a known problem with no ideal solution: https://github.com/mholt/PapaParse/issues/49#issuecomment-163164936
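To make the "update only with the newly added chunk" idea concrete, here is a minimal sketch, not the project's actual code. The names `newChunkRange` and `splitCompleteLines` are hypothetical, and it assumes that the file only grows by appending and that a shrinking size means the file was replaced and should be re-read from the start.

```typescript
// Hypothetical helpers for illustration; not part of nupic.visualizations or PapaParse.

interface ByteRange {
  start: number; // inclusive offset of the first unread byte
  end: number;   // exclusive offset, equal to the current file size
}

// Decide which part of the file to read, given the size seen at the
// previous check and the size seen now.
function newChunkRange(previousSize: number, currentSize: number): ByteRange | null {
  if (currentSize === previousSize) {
    return null; // nothing was appended since the last check
  }
  if (currentSize < previousSize) {
    // Assumption: a smaller file was truncated or replaced, so start over.
    return { start: 0, end: currentSize };
  }
  return { start: previousSize, end: currentSize };
}

// Split newly read text into complete rows, carrying over a trailing
// partial line until the writer has finished it.
function splitCompleteLines(
  carry: string,
  chunk: string
): { lines: string[]; carry: string } {
  const text = carry + chunk;
  const lastNewline = text.lastIndexOf("\n");
  if (lastNewline === -1) {
    return { lines: [], carry: text }; // no complete row yet
  }
  const lines = text
    .slice(0, lastNewline)
    .split("\n")
    .map((line) => line.replace(/\r$/, "")) // tolerate CRLF line endings
    .filter((line) => line.length > 0);
  return { lines, carry: text.slice(lastNewline + 1) };
}
```

In a browser, the returned range could be handed to `Blob.slice(start, end)` on a `File` object, and the complete lines passed on to the CSV parser. Whether the browser lets you see a local file's new size without re-selecting it is exactly the open problem discussed in the linked upstream issue.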

These are the ideas that we have collected:

Any ideas or preferences about these or other options?

brev commented 8 years ago

@breznak @jefffohl @rhyolight This is rad, thanks for the hard work.

jefffohl commented 8 years ago

Thanks @brev !

breznak commented 8 years ago

TY @brev ! Would be nice to get your feedback and possible use-cases, if you like :)

breznak commented 8 years ago

Upcoming fix from Jeff for #56 further ensures the speed is OK.

breznak commented 8 years ago

@jefffohl with fixes in #64 and #66 I'd like to continue working on this functionality.

jefffohl commented 8 years ago

@breznak - I was imagining that we could periodically check the file to see if it has been modified, not actually read the file. If the file has been modified, then read.

Note also that for windowing, there are two things to be aware of:

  1. The file size limit is what is used to determine if windowing will be used or not. Right now, that limit is set to 5MB. We can add a feature that allows this to be manually set.
  2. The number of rows in the window is not related to the file size. Right now, the window buffer size is set to 10,000 rows.

The reason the file size is not explicitly related to the number of rows in the buffer is that we need to decide whether to window or not before we know how many rows there are.
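As an illustration of the two independent settings described above, here is a rough sketch. The names are hypothetical. Reading "5MB" as 5 × 1024 × 1024 bytes, using a strict "greater than" comparison, and keeping the most recent rows in the window are all assumptions, not confirmed details of the app.

```typescript
// Values come from the comment above; the exact byte interpretation of
// "5MB" is an assumption made for this sketch.
const WINDOWING_SIZE_LIMIT_BYTES = 5 * 1024 * 1024;
const WINDOW_BUFFER_ROWS = 10000;

// The decision uses only the file size, because it has to be made
// before any rows have been parsed and counted.
function shouldUseWindowing(
  fileSizeBytes: number,
  limitBytes: number = WINDOWING_SIZE_LIMIT_BYTES
): boolean {
  return fileSizeBytes > limitBytes;
}

// A fixed-capacity row buffer that keeps only the most recent rows.
class RowWindow<Row> {
  private rows: Row[] = [];
  private readonly capacity: number;

  constructor(capacity: number = WINDOW_BUFFER_ROWS) {
    this.capacity = capacity;
  }

  // Append rows and drop the oldest ones beyond the capacity.
  push(newRows: Row[]): void {
    this.rows = this.rows.concat(newRows);
    if (this.rows.length > this.capacity) {
      this.rows = this.rows.slice(this.rows.length - this.capacity);
    }
  }

  // Return a copy so callers cannot mutate the buffer.
  snapshot(): Row[] {
    return this.rows.slice();
  }
}
```

Keeping the size limit and the row capacity as separate parameters mirrors the point made above: one is checked before parsing, the other only matters once rows are flowing in.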

breznak commented 8 years ago

> I was imagining that we could periodically check the file to see if it has been modified, not actually read the file. If the file has been modified, then read.

Yes, I think that's the idea. Will this work for remote files as well? (It doesn't have to be supported from the start, though.) I think I saw some code that gets a file's size; that would be what we want, I guess?

> The number of rows in the window is not related to the file size. Right now, it is set to 10,000 rows.

I know; that's OK, I think.

> The file size limit is what is used to determine if windowing will be used or not. We can add a feature that allows this to be manually set.

I think it can stay that way, just the monitoring will switch to windowing if needed.

jefffohl commented 8 years ago

Most servers should send back a "Last-Modified" header, so we could check that for remote servers. We can also just check the size (which we are already doing), and if that has changed, assume that new data has been added.
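A small sketch of that check, under stated assumptions: `FileSnapshot` and `hasFileChanged` are hypothetical names, and the snapshot values are assumed to come from something like an HTTP HEAD request that exposes the `Last-Modified` and `Content-Length` headers. Either value may be missing, so both are nullable.

```typescript
// Hypothetical shape for what a cheap "check without reading" returns.
interface FileSnapshot {
  size: number | null;         // bytes, e.g. from Content-Length; null if unknown
  lastModified: string | null; // raw Last-Modified header value; null if absent
}

// Report whether the file appears to have changed between two checks.
function hasFileChanged(previous: FileSnapshot, current: FileSnapshot): boolean {
  // Any difference in the Last-Modified value counts as a modification.
  if (
    previous.lastModified !== null &&
    current.lastModified !== null &&
    previous.lastModified !== current.lastModified
  ) {
    return true;
  }
  // Otherwise fall back to the size: a different size is taken to mean
  // that new data has been added.
  if (previous.size !== null && current.size !== null) {
    return previous.size !== current.size;
  }
  // Without comparable information, do not trigger a read.
  return false;
}
```

Only when `hasFileChanged` returns `true` would the app go on to read the file, which keeps the periodic check cheap.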