numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0

Visualize, plot OPF experiment results #2658

Closed breznak closed 8 years ago

breznak commented 8 years ago

I'd like to visualize the results .csv produced by running an OPF experiment. It should be relatively easy, but perhaps some of you have already written a nice utility for that? NAB, @rhyolight ?


WORKING branch: https://github.com/breznak/nupic/tree/plot_results

TODO:


UPDATES:

rhyolight commented 8 years ago

All my tools have been in matplotlib, which makes me want to die. But this could be updated to do even more generic plotting? https://github.com/numenta/nupic/blob/master/examples/opf/clients/hotgym/anomaly/one_gym/nupic_anomaly_output.py

breznak commented 8 years ago

Thanks, I'll check your script soon.

Meanwhile I've tested the in-browser plotting on NAB and it looks awesome :+1: Maybe we could bring it over to nupic and just edit it to parse OPF result files?

NAB also has a plotter using https://plot.ly/, which I haven't tested yet (I need to set up an API key).

@rhyolight who's maintaining the plotting scripts for NAB?

breznak commented 8 years ago

CC @subutai @BoltzmannBrain @dundalek Hey! It's me, Mark :wink: Can you suggest a nice (and simple) JS lib for visualizing CSV graph data?

rhyolight commented 8 years ago

I'm a big fan of http://dygraphs.com/. That's what River View uses.



BoltzmannBrain commented 8 years ago

@breznak plotly has some great functionality for creating visually-appealing graphs, and for free. We use it in NAB here. Here's an example from that script, where the green and red diamonds mark TP and FP detections, respectively, and the red dots are the true anomaly labels. And if you navigate to the "Code" tab, plotly will give you all the code to generate the plot :smile:

breznak commented 8 years ago

@rhyolight @BoltzmannBrain Yes, I've edited the issue with both. DyGraphs is used in one approach. The plot.ly graph looks really pretty, but the fact that it publishes all the data you plot (unless you pay) is a blocker for some of our purposes, i.e. we can't publish the data.

rhyolight commented 8 years ago

So what secret supervillain project are you working on?



breznak commented 8 years ago

So what secret supervillain project are you working on?

Nah, Skynet and stuff. :wink: And some medical records..

breznak commented 8 years ago

I've updated my initial steps in the working branch https://github.com/breznak/nupic/tree/plot_results. I'm looking for a JS/DyGraph wizard who would be so kind as to help me with this issue :sos: as I think it'll be helpful to have for the upcoming HTM Challenge. ( @rhyolight @jefffohl ?)

jefffohl commented 8 years ago

I haven't tinkered with DyGraph before, but it sounds like the kind of thing that is up my alley. I might be able to clear out some time later this week to take a look.

breznak commented 8 years ago

Jeff, that would be awesome! I'll prepare some example data today so it's in a runnable form. Thanks a bunch

jefffohl commented 8 years ago

@breznak - will this be used in your HTM Challenge submission? Or is it simply an enhancement to NuPIC that everyone can use? If the former, I probably shouldn't help out, unfortunately, because I am on the HTM Challenge judging panel, and it would create a conflict of interest.

@rhyolight - thoughts here?

rhyolight commented 8 years ago

@jefffohl If you are doing work that contributes to the NuPIC codebase, that is fine. Just don't help out by contributing to anyone's private project repos.

jefffohl commented 8 years ago

OK - thanks @rhyolight

breznak commented 8 years ago

@breznak - will this be used in your HTM Challenge submission? Or is it simply an enhancement to NuPIC that everyone can use? If the former, I probably shouldn't help out, unfortunately, because I am on the HTM Challenge judging panel, and it would create a conflict of interest.

@jefffohl No worries, I haven't joined a Challenge project yet. This is intended as a general NuPIC enhancement, but I'd love to get it done ASAP, before the challenge, as I think it could be useful to people working on HTM Challenge projects.

jefffohl commented 8 years ago

@breznak - OK, sounds great. Just trying to be careful about my responsibilities.

breznak commented 8 years ago

I've updated the TODO steps and ordered them sequentially

subutai commented 8 years ago

Meanwhile I've tested the in-browser plotting on NAB and it looks awesome Maybe we could bring it over to nupic and just edit it to parse OPF result files?

No one is really maintaining this right now. It does look like it doesn't plot the results files correctly anymore. It would be nice if someone could fix that. The UI is rather cumbersome too. It does work fine for examining the raw data files.

jefffohl commented 8 years ago

Working on this now. First, I am evaluating some different plotting libraries. Let me know if anyone is tied to DyGraphs.

jefffohl commented 8 years ago

@rhyolight - when you were choosing a data visualization tool for River View, did you take a look at D3.js?

I am kind of leaning towards D3.js because it is more of a low-level framework, and therefore is more flexible. It occurs to me that, although we are solving a specific problem here, we are likely to want to continue to build various ways to visualize various things - therefore having a framework (rather than a tool) might be a better choice.

Does anyone have strong opinions here?

breznak commented 8 years ago

@jefffohl no strong opinions, as he who writes the code decides :wink: And both frameworks look really pretty to my amateur eye. But I don't see what else, other than OPF result graphs, we would want to visualize in nupic. Remember there is NuStudio to show you graphical representations of HTM and its parts.

rhyolight commented 8 years ago

I have used D3 directly in the past and it is an excellent library.


breznak commented 8 years ago

@jefffohl I'll probably start looking into this too, can you please share whether you've made any progress? And do you see D3.js as better than DyGraph, and if so, why?

jefffohl commented 8 years ago

@breznak - sorry about my slow progress - I have to squeeze this work in with my day job :).

I was considering D3.js only because it is more of a framework, so that might serve us better in the future, if we intend to do more. But, for now, it seems that DyGraph will probably be the easiest to work with, so that seems the best bet at the moment. If we want to get more fancy later, we can.

I haven't done much except for reviewing charting frameworks, and getting up to speed with NAB.

I was going to spend some time today working on this, but if you want to divide up tasks, we could do that too.

breznak commented 8 years ago

Thanks Jeff, no worries. I just wanted to use your experience to know which framework I should start learning about, so DyGraph it is for now. I'll be learning my way around, and if I start doing some real work, I'll sync with you here so we don't duplicate effort.

PS: can I help with something on NAB? But I think we don't need it here..(?)

jefffohl commented 8 years ago

@breznak - ok. I am going to keep at it, so let me know if you start to work on any of the issues on your checklist.

Regarding NAB - I just wanted to see how NAB fit into OPF - so not intending to use it, just wanted to understand better where you are coming from in terms of what your needs are.

jefffohl commented 8 years ago

@breznak I am making some progress - I might have something to share tomorrow.

jefffohl commented 8 years ago

Hi @breznak. I have put together something. Right now, it is an interface that shows two panes: data and results. For each, there is a select menu that allows the user to select which CSV file they want to plot.

Is this what you are imagining, or are you thinking that the data should be merged into one graph?

I can make a pull request if you want to see the work in progress.

jefffohl commented 8 years ago

Here is a screenshot for reference: [image: screen shot 2015-10-23 at 1 35 23 pm]

breznak commented 8 years ago

Hi @jefffohl , it looks really nice! :+1: :smile: Please make a PR, or tell me your branch, I'd love to test it out.

cogmission commented 8 years ago

Agreed, it DOES look nice!


jefffohl commented 8 years ago

Here you go: https://github.com/breznak/nupic/pull/9

breznak commented 8 years ago

A couple of questions/ideas to discuss:

jefffohl commented 8 years ago
breznak commented 8 years ago

btw @jefffohl I've added some UI questions (designated by "UI") to the issue description, esp. fixing the anomalyScore scale would be nice. Now back to replying to your comment...

breznak commented 8 years ago
  • I can code it up to merge the data and results data sets. The only question I have is, how to make sure we are loading the proper files? Will the paths to the data and the results always have reliable relative paths? Or, should we let the user choose a data file (based on a manifest file as we have now), then a results file, then merge them together after they have been selected?

The file lists are used in NAB which relies on the structure of the data, so I think it's safe to assume the paths will be correct. Actually NAB has an alternate plotting through plot.ly so I would not worry about that functionality too much. And your commit now fixes the issue for NAB where plotting of results didn't work.

...if we focus on NuPIC/OPF, there are no separate data/results file lists. The results are in a single file (and raw data are included under the field actual).

breznak commented 8 years ago
  • the checkboxes are dynamically generated based on the CSV header. so, they will work with any CSV.

:guitar: :+1:
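A minimal sketch of how checkboxes could be derived from a CSV header, for readers following along. It is not the code from the pull request: the function name `seriesFromHeader` and the descriptor shape are hypothetical, and it assumes the first line is the header and the first column is the x-axis.

```javascript
// Hypothetical helper: build one checkbox descriptor per data column.
// Assumes line 1 of the CSV is the header and column 0 is the x-axis.
function seriesFromHeader(csvText) {
  const headerLine = csvText.split(/\r?\n/)[0];
  const names = headerLine.split(",").map((name) => name.trim());
  // Skip column 0 (x-axis); every remaining column becomes a toggleable series.
  return names.slice(1).map((name, index) => ({
    index: index,   // series number, counted without the x-axis column
    label: name,    // text shown next to the checkbox
    checked: true,  // all series visible by default
  }));
}

const demoSeries = seriesFromHeader("timestamp,value,anomalyScore\n1,2.5,0.1\n");
console.log(demoSeries.map((s) => s.label));
```

In a DyGraphs page, each checkbox's change handler could then call `g.setVisibility(index, checked)` on the graph object to show or hide that series.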

breznak commented 8 years ago
  • The DyGraph CSV parser choked on the OPF format. I can write a Javascript parser for that. I am assuming that we want it to strip out all fields that are not a number?

yes, but..

Since the second line indicates the data type, this should be relatively easy. Or do we want to somehow convert some of the string values to numbers in some way? Note that DyGraph can only plot numbers (integers or floats).

Plotting only numbers is OK, but the data-type line (2nd) would not work that easily: in the example OPF file here, anomalyScore has type string. Two options:

  1. this is a NuPIC bug and we should ignore it and require it fixed ( @rhyolight ? I think that would be a correct approach)
  2. hack around it; my parser did well by replacing {.*} or [] with 0.

PS: optionally, in the future we may want to get some of the other string fields, eg to get multiStepBestPredictions.1,multiStepBestPredictions.5 (which are in fact again floats)
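The parsing approach discussed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the parser from the working branch: it assumes the OPF file starts with three header lines (names, types, flags), that cells containing commas are double-quoted, and it applies the "replace `{.*}` or `[]` with 0" workaround before keeping only the columns whose values are all numeric. The names `splitCsvLine` and `parseOpfNumeric` and the sample data are made up for the example.

```javascript
// Split one CSV line, honouring double-quoted cells that may contain commas.
function splitCsvLine(line) {
  const cells = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (ch === '"') {
      if (inQuotes && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted cell
        i++;
      } else {
        inQuotes = !inQuotes;
      }
    } else if (ch === "," && !inQuotes) {
      cells.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  cells.push(current);
  return cells;
}

// Assumption: three header lines (field names, field types, flags).
const OPF_HEADER_LINES = 3;

function parseOpfNumeric(csvText) {
  const lines = csvText.split(/\r?\n/);
  const names = splitCsvLine(lines[0]);
  const rows = lines
    .slice(OPF_HEADER_LINES)
    .filter((line) => line.trim() !== "")
    .map((line) =>
      splitCsvLine(line).map((cell) => {
        const text = cell.trim();
        // Workaround from the thread: dict-like or empty-list cells become 0.
        if (/^\{.*\}$/.test(text) || text === "[]") return 0;
        return text === "" ? NaN : Number(text);
      })
    );
  // Keep a column only if every row holds a finite number,
  // regardless of what the type line claims.
  const keep = names
    .map((_, col) => col)
    .filter((col) => rows.every((row) => Number.isFinite(row[col])));
  return {
    labels: keep.map((col) => names[col]),
    rows: rows.map((row) => keep.map((col) => row[col])),
  };
}

// Invented sample data, for illustration only.
const opfSample = [
  "timestamp,actual,anomalyScore,multiStepPredictions.1",
  "datetime,float,string,string",
  "T,,,",
  '2015-10-09 12:00:00,21.5,0.25,"{21.0: 0.9, 22.0: 0.1}"',
  '2015-10-09 13:00:00,22.0,0.75,"{22.0: 1.0}"',
].join("\n");
console.log(parseOpfNumeric(opfSample).labels);
```

Note that the timestamp column is dropped here because it is not a plain number; a real plotter would either parse it as a date or use the row index as the x-axis.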

breznak commented 8 years ago
  • You can zoom into the graph by using your cursor to select a section of the graph. Though, I just noticed that I introduced a bug here. Normally, double-clicking will return the graph to the zoomed-out state. But I accidentally disabled that with some code that displays the timestamp when you click on the graph. I will fix that.

Cool thanks, I've noticed the zoom-out problem. Btw, do you think it would be possible to do some "smooth zoom-out" (the way zoom-in works)? With a mouse-wheel, a scroll-bar, ...?

breznak commented 8 years ago
  • Unfortunately, I am ignorant about annotated anomaly labels. Can you elaborate on that point?

This exists only in NAB so far *). Annotations is a TXT file with a vector, where 1 means a human-annotated anomaly = nice to plot with a "stem". Maybe another "Choose annotations file:" file input field would work fine here? A nice to have but definitely not needed right now.
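As a rough sketch of the annotation idea, the vector could be read and attached to the plotted rows as an extra column. The exact NAB annotation file layout is not given in this thread, so the code assumes a plain text file of 0/1 values separated by whitespace or commas, one value per data row; both function names are hypothetical.

```javascript
// Assumption: the annotations file holds 0/1 values separated by
// whitespace or commas, one value per data row.
function annotationIndices(text) {
  return text
    .split(/[\s,]+/)
    .filter((token) => token !== "")
    .map((token, index) => (Number(token) === 1 ? index : -1))
    .filter((index) => index >= 0);
}

// Append an annotation column: 1 where a human marked an anomaly,
// null elsewhere so that nothing is drawn for unmarked rows.
function withAnnotationColumn(rows, text) {
  const marked = new Set(annotationIndices(text));
  return rows.map((row, index) => row.concat(marked.has(index) ? 1 : null));
}

console.log(withAnnotationColumn([[0, 21.5], [1, 22.0], [2, 30.1]], "0 0 1"));
```

How the extra column is rendered (for example as a "stem") is left open here; that depends on the plotting library.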

*) @rhyolight would it be any problem/benefit getting an anomalyAnnotation field to OPF? Useful here and then NAB could use OPF files, instead of its custom. I've raised that issue there but got no opinions..

breznak commented 8 years ago

@jefffohl please let me know if there's anything I can help with on the non-JS side.

jefffohl commented 8 years ago
jefffohl commented 8 years ago

@jefffohl please let me know if there's anything I can help with on the non-JS side.

Will do. Thanks!

breznak commented 8 years ago
  • In determining how to let the user choose what OPF file to use, all of those sound like good suggestions. It depends mostly on how users typically use the OPF, and about this I am somewhat ignorant. The most universally easy solution would be to launch a file browser from the web UI. We could also allow users to put in a path manually, which could be a publicly accessible URL. So, maybe there would be a "path" field, where the user types in the path, and a button for launching a file system browser for finding and loading local files.

I think a typical use case is a user who runs a single OPF experiment and wants to see the results. So a file browser launched from the web UI sounds like the way to go to me. Alternatively, later we could add behavior like Gmail attachments: an "Add a file" file chooser and then checkboxes for which files to plot.

breznak commented 8 years ago
  • For parsing OPF files, would it be OK to hard-code into our script exceptions for anomalyScore, multiStepBestPredictions.1, and multiStepBestPredictions.5 and the like? That would make it pretty easy, as long as we can count on those fields always being number-like (we can use type coercion to turn them into numbers). It makes things more tightly coupled, but so far, this would only be for OPF, so that should be OK, right?

Agree, simple & works.

I still like your approach of plotting all numeric fields (as input data will be plotted if it is in a suitable form). Are you planning to combine the two approaches?
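One way the two approaches could be combined is sketched here: a column is plotted if the type line marks it as numeric or if its name is on a hard-coded exception list, and its cells are then coerced with `Number()`. The type names `int` and `float` and both function names are assumptions for the example; the exception list uses the field names mentioned in the quoted comment.

```javascript
// Fields proposed for hard-coding: typed as strings but number-like.
const FORCE_NUMERIC = [
  "anomalyScore",
  "multiStepBestPredictions.1",
  "multiStepBestPredictions.5",
];
// Assumption: numeric columns are marked "int" or "float" in the type line.
const NUMERIC_TYPES = ["int", "float"];

// Return the indices of the columns worth plotting.
function plottableColumns(names, types) {
  return names
    .map((name, col) => ({ name: name, col: col, type: types[col] }))
    .filter((f) => NUMERIC_TYPES.includes(f.type) || FORCE_NUMERIC.includes(f.name))
    .map((f) => f.col);
}

// Coerce the selected cells of one row; unparsable or empty cells become
// null, which leaves a gap rather than plotting a misleading 0.
function coerceRow(cells, columns) {
  return columns.map((col) => {
    const text = String(cells[col]).trim();
    const value = text === "" ? NaN : Number(text);
    return Number.isFinite(value) ? value : null;
  });
}

const cols = plottableColumns(
  ["timestamp", "actual", "anomalyScore", "anomalyLabel"],
  ["datetime", "float", "string", "string"]
);
console.log(cols, coerceRow(["2015-10-09", "21.5", "0.25", "[]"], cols));
```

The trade-off is the one named in the comment: the exception list couples the plotter to OPF field names, while the type-line check keeps it usable for any other numeric column.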

breznak commented 8 years ago
  • For smooth zooming, I don't see an option for that in DyGraphs, but I will look around, and I may be able to add a plugin for that as well.

(btw, just to make myself clear, I think the zoom-in with mouse selection is perfectly viable, just missing a zoom out step).

I just found a mention of https://code.google.com/p/dygraphs/issues/detail?id=366 in Issue 58, so some calls for this may already exist (?)
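A stepwise zoom-out could be approximated by widening the visible x-range around its centre each time the user scrolls or presses a button. The sketch below only computes the new range; `zoomOutWindow` is a hypothetical helper, not a DyGraphs function.

```javascript
// Widen the visible x-range around its centre by `factor`,
// clamped so it never extends past the full data extent.
function zoomOutWindow(current, extent, factor) {
  const lo = current[0];
  const hi = current[1];
  const centre = (lo + hi) / 2;
  const halfWidth = ((hi - lo) / 2) * factor;
  return [
    Math.max(extent[0], centre - halfWidth),
    Math.min(extent[1], centre + halfWidth),
  ];
}

console.log(zoomOutWindow([40, 60], [0, 100], 2));
```

With a DyGraphs graph object `g`, this could plausibly be wired up as `g.updateOptions({ dateWindow: zoomOutWindow(g.xAxisRange(), g.xAxisExtremes(), 2) })`, for instance from a mouse-wheel handler.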

breznak commented 8 years ago
  • Will worry about the anomalyAnnotation later. I think what I will end up having is two different versions of this script - though working in the same way - one for NAB, and one for OPF.

100% agreed. Personally I'd try to push NAB to switch to using OPF...

BoltzmannBrain commented 8 years ago

@breznak could you please elaborate on what you mean by pushing NAB to use OPF?

breznak commented 8 years ago

@breznak could you please elaborate on what you mean by pushing NAB to use OPF?

@BoltzmannBrain bad wording, sorry. I had raised an issue in NAB asking whether it could use OPF for its format (extended with an annotations column), so that this code could be shared (e.g. the results plotting is fixed here).

rhyolight commented 8 years ago

Guys, this issue is just for plotting experiment results after they've been run, right? There is no initiative to visualize live predictions / anomalies coming out of NuPIC, is there?