marl / jams

A JSON Annotated Music Specification for Reproducible MIR Research
ISC License

Pretty printing: _repr_html_ and _repr_svg_ #93

Closed: bmcfee closed this issue 7 years ago

bmcfee commented 8 years ago

It would be nice if we could have better pretty-printing of jams objects (and annotations, specifically) for notebook environments, ie, by implementing a _repr_html_() method.
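
A minimal sketch of the hook being discussed: Jupyter renders any object whose class defines `_repr_html_` and returns an HTML string. The class and function names below are illustrative stand-ins, not the jams API; observations are assumed to be (time, duration, value, confidence) tuples.

```python
def observations_to_html(observations):
    """Render (time, duration, value, confidence) tuples as an HTML table."""
    header = ("<tr><th>time</th><th>duration</th>"
              "<th>value</th><th>confidence</th></tr>")
    rows = "".join(
        "<tr><td>{:.3f}</td><td>{:.3f}</td><td>{}</td><td>{}</td></tr>".format(
            t, d, v, c)
        for t, d, v, c in observations)
    return "<table>{}{}</table>".format(header, rows)


class PrettyAnnotation(object):
    """Toy stand-in for a JAMS annotation, just to show where the hook lives."""

    def __init__(self, observations):
        self.observations = observations

    def _repr_html_(self):
        # The notebook calls this automatically when the object is displayed.
        return observations_to_html(self.observations)
```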

justinsalamon commented 8 years ago

is there no existing code for pretty printing json files in general that we can leverage?

bmcfee commented 8 years ago

Sure, but I'm thinking something more akin to msaf's plots that render out a table for the annotation objects. Doing display(jam.annotations[i].data) is nice and all to render the data frame object, but the information could be encoded much more concisely by exploiting the regular structure of jamsframes.

A really cool version of this would combine all annotations into a single table-like widget, where each row corresponds to an annotation. This way, you could easily get a visual digest of all annotations in a compact space.

One can also imagine having different rendering helpers for different namespace types, so that events draw like events and regions draw like regions. (Even fancier, pitches and chords draw like piano roll, etc.)

justinsalamon commented 8 years ago

One can also imagine having different rendering helpers for different namespace types, so that events draw like events and regions draw like regions. (Even fancier, pitches and chords draw like piano roll, etc.)

So basically a jupyter version of sonic visualiser :P Yes, that would be cool.

bmcfee commented 8 years ago

So basically a jupyter version of sonic visualiser :P

Yup! But it could also be used outside of notebook environments, eg, for a web-based visualizer.

justinsalamon commented 8 years ago

eg, for a web-based visualizer.

I think I can see where you're going with this... and I like it.

bmcfee commented 8 years ago

Thinking a bit more about this, it might be tricky to properly handle overlapping intervals, eg:

| time | duration | value  | confidence |
|------|----------|--------|------------|
| 0    | 5        | guitar | 1          |
| 2.5  | 2.5      | vocals | 1          |

should render as two rows that overlap by half horizontally, but

| time | duration | value  | confidence |
|------|----------|--------|------------|
| 0    | 2.5      | guitar | 1          |
| 2.5  | 5        | vocals | 1          |

should render as one row.

Ideally, a css layout engine should do this for us, but I'm not convinced that it will work.

justinsalamon commented 8 years ago

Why should the second example render as one row? One could argue that each instrument should have a separate track (row) throughout the visualization. That's how sequencers do it at least.

bmcfee commented 8 years ago

Why should the second example render as one row?

  1. Because it's more space-efficient
  2. Because they don't overlap in time. If you like, replace the instrument labels with chord annotations, or lyrics. (Overlapping lyric timing might correspond to different vocalists.)
  3. We shouldn't need separate formatters for disjoint or overlapping interval annotations; one formatter should be smart enough to work it out.

One could argue that each instrument should have a separate track (row) throughout the visualization.

That's not something you can easily deduce from the tabular annotations, since there are no constraints about mutual exclusivity between observations.

justinsalamon commented 8 years ago

I see your point. I can think of three potential solutions (which could all be offered, in principle):

  1. Display everything in one row, even if that forces things to overlap
  2. Scan the data first to generate a list of unique values, and give each value (e.g. chord / instrument name) its own row
  3. Try to do something clever: look at the time/duration values to detect overlaps and spawn only as many rows as necessary to display everything without overlaps within a row.

That said, I think it's at least worth considering leveraging some of the established visualization paradigms. In the case of instrument labels, having a row per instrument would be very intuitive for most users.

bmcfee commented 8 years ago

I would really prefer to avoid the following here:

The former for obvious reasons. The latter because it can complicate sharing of notebooks if you have external or cross-site loading of dependencies.

The instrument-per-row idea is fine too, but I see that as an embellishment that should come much later, once we have the basic formatting implemented.

bmcfee commented 8 years ago

I did some snooping around here, and it seems like there isn't a great way to encode this directly in html.

So we have two options:

  1. use python to render out a static document from the annotation
  2. include a fragment of javascript that can dynamically render the json representation

Neither of these sound particularly good to me, but i'm leaning toward 2 because it keeps the python code simple.

Algorithmically, I have a method in mind to detect overlapping intervals and pack them into as few parallel timelines as possible. Once that step is done, this example illustrates the kind of layout I have in mind.
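
A sketch of the packing step described above (the function name and data layout are illustrative, not jams API): sort by start time and greedily assign each interval to the first timeline whose previous interval has ended, opening a new timeline only when none is free.

```python
def pack_intervals(intervals):
    """intervals: iterable of (time, duration, value) triples.

    Returns a list of rows, each a list of mutually non-overlapping intervals.
    """
    rows, row_ends = [], []
    for time, duration, value in sorted(intervals, key=lambda iv: (iv[0], iv[1])):
        for i, end in enumerate(row_ends):
            if time >= end:                       # this timeline is free again
                rows[i].append((time, duration, value))
                row_ends[i] = time + duration
                break
        else:                                     # every timeline still busy
            rows.append([(time, duration, value)])
            row_ends.append(time + duration)
    return rows


# The two examples from earlier in the thread:
print(len(pack_intervals([(0, 5, 'guitar'), (2.5, 2.5, 'vocals')])))  # 2 rows
print(len(pack_intervals([(0, 2.5, 'guitar'), (2.5, 5, 'vocals')])))  # 1 row
```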

bmcfee commented 8 years ago

Alternatively, we can implement _repr_html_ on the JObject subclasses that does some sensible recursive layout of the contents, and give JamsFrame a _repr_svg_ method.

This would still have the downside of forcing the layout logic into python, but using SVG instead of CSS for the positioning would be much better. If we attach the proper attributes to the svg elements, it would still be relatively easy to hook in via javascript to have interactions in #19.

hughrawlinson commented 8 years ago

This could get really really heavy on huge jams files. I'm currently using JAMS for storing really high resolution time series spectral features. It'd be difficult to display them properly, and it'd be computationally intensive to reduce them on the client side. Maybe it'd be a good idea to restrict the kinds of annotations that can be displayed in this view?

bmcfee commented 8 years ago

I'm currently using JAMS for storing really high resolution time series spectral features. It'd be difficult to display them properly, and it'd be computationally intensive to reduce them on the client side. Maybe it'd be a good idea to restrict the kinds of annotations that can be displayed in this view?

How are you storing these features? In blobs/vectors, or one of the annotation namespaces?

In general, I think we're planning to support namespace-dependent rendering, so a "blob" annotation might simply show up as an interval with <binary data> as its label, rather than the actual contents of the blob. I think this should resolve most of the complexity issues that you're pointing to, but maybe I'm missing something? (This might fall over if you have an annotation with a zillion time-value pairs to encode temporal features, but I think we can hack around that as well.)

Otherwise, the common dense data types shouldn't pose much of a problem. E.g., rendering a pitch annotation as an svg curve might end up being large, but not insane.

@hughrawlinson I'm wondering if you have any thoughts about canvas vs svg for this stuff? I'm leaning more toward svg because it would be dom-accessible and potentially interactive down-stream. (It seems to work well for d3, anyway!)

hughrawlinson commented 8 years ago

I agree with SVG, it means you can take it out of the web context and render it in LaTeX or somewhere else that may be useful. It's also a nice format because it's easy to remove pieces in a text editor rather than having to fire up an image editor as you would have to with a rasterised image saved from a canvas.

I'm storing my data in an annotation, which may be wrong. It'd be great to discuss that at some point if you wouldn't mind, but as for generally dealing with this kind of data, just putting <binary data> would probably suffice. As for curves, for some reason I was thinking about embedding javascript in an SVG (which is possible, though probably not a good idea due to compatibility issues) and rendering it from the data, rather than just rendering a downsampled signal directly into an SVG, which would be preferable. It'd be easy enough to get a suitably high-resolution vector while removing >>90% of the detail from a high-resolution feature time series.

bmcfee commented 8 years ago

I agree with SVG, it means you can take it out of the web context and render it in LaTeX or somewhere else that may be useful. It's also a nice format because it's easy to remove pieces in a text editor rather than having to fire up an image editor as you would have to with a rasterised image saved from a canvas.

Great points; I hadn't thought of exporting to a figure, but it would be pretty trivial this way!

As for curves, for some reason I was thinking about embedding javascript in an SVG (which is possible, though probably not a good idea due to compatibility issues) and rendering it from the data, rather than just rendering a downsampled signal directly into an SVG, which would be preferable. It'd be easy enough to get a suitably high-resolution vector while removing >>90% of the detail from a high-resolution feature time series.

Right. We may have relatively high sampling rates for curve annotations, but my understanding is that they're primarily there to capture abrupt changes, and the curves themselves tend to be relatively smooth otherwise. It wouldn't be hard to do some kind of compression (eg run-length encoding) to retain the visual characteristics of the annotation without losing too much.
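
A sketch of that run-length idea (names illustrative, not jams API): merge consecutive observations that share a value into one longer interval, so a densely sampled but slowly changing annotation collapses into a handful of drawable segments.

```python
def run_length_merge(observations, tol=1e-6):
    """observations: list of (time, duration, value), sorted by time."""
    merged = []
    for time, duration, value in observations:
        if merged:
            prev_t, prev_d, prev_v = merged[-1]
            # Same value and contiguous in time: extend the previous run.
            if prev_v == value and abs(prev_t + prev_d - time) < tol:
                merged[-1] = (prev_t, prev_d + duration, prev_v)
                continue
        merged.append((time, duration, value))
    return merged


# 1000 frames at a 10 ms hop, all carrying the same label, collapse to one interval:
dense = [(i * 0.01, 0.01, 'A') for i in range(1000)]
print(run_length_merge(dense))  # a single interval covering 0 to ~10 s
```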

I'm storing my data in an annotation, which may be wrong. It'd be great to discuss that at some point if you wouldn't mind, but as for generally dealing with this kind of data, just putting <binary data> would probably suffice.

I think this part is getting off-topic, and would be better handled in a separate thread.

hughrawlinson commented 8 years ago

That last bit was definitely off topic, sorry!

justinsalamon commented 8 years ago

Coming back to this, it would be awesome to have piano-roll style viz for note-like annotations. A possible complication is that relevant namespaces (e.g. pitch_midi) can be used to store dense annotations (e.g. continuous f0 values) or sparse ones (e.g. notes with start/duration), and it would only make sense to use this viz for the latter. I'm sure @bmcfee has an idea about how to handle this.

Another thing that would be awesome is the ability to overlay annotations for the purpose of visual comparison (e.g. reference notes versus automatically transcribed notes).

bmcfee commented 8 years ago

I'm sure @bmcfee has an idea about how to handle this.

If we were doing this via d3, I'd say that it's all just a matter of how things are interpolated on display (quantization for notes vs cubic interp for f0).

Another thing that would be awesome is the ability to overlay annotations for the purpose of visual comparison

Agree -- that would be awesome!

justinsalamon commented 8 years ago

If we were doing this via d3, I'd say that it's all just a matter of how things are interpolated on display (quantization for notes vs cubic interp for f0).

But right now pitch_midi can be used for either (notes and f0) - so how would you know which one to use? I guess it could be a user-specified option to the viz module?

bmcfee commented 8 years ago

But right now pitch_midi can be used for either (notes and f0) - so how would you know which one to use?

It can be, but generally you wouldn't do that; f0 would live in pitch_hz. If you want to switch between the two, auto-convert before viz.

justinsalamon commented 8 years ago

Fair enough

bmcfee commented 8 years ago

I did a bunch of thinking about this, and I have some slightly less scrambled ideas for how it might work.

All horizontal element scaling should be relative to the total track duration, so that the SVG can stretch dynamically.

I think for most namespaces, it makes sense for each observation to be an svg group, which itself contains one or more elements, along with some encoding of its raw properties (t,d,v,c). The time and duration properties of the observation can be turned into group transformations (translate and scale, respectively). This should take care of all interval-based annotations, more or less.
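
A sketch of that group-per-observation layout (dimensions and attribute names are illustrative, and only a single timeline is drawn; the packing step sketched earlier would supply a per-row vertical offset): time becomes a translate, duration becomes a horizontal scale on a unit-width rectangle, and the raw (t, d, v, c) values ride along as data- attributes so downstream javascript can find them.

```python
def observation_group(time, duration, value, confidence,
                      track_duration, width=800, row_height=20):
    # Horizontal scaling is relative to the total track duration.
    x_scale = width / float(track_duration)
    return ('<g transform="translate({x:.2f},0) scale({w:.2f},1)" '
            'data-time="{t}" data-duration="{d}" '
            'data-value="{v}" data-confidence="{c}">'
            '<rect x="0" y="0" width="1" height="{h}" fill="steelblue"/>'
            '<title>{v}</title></g>').format(
                x=time * x_scale, w=duration * x_scale,
                t=time, d=duration, v=value, c=confidence, h=row_height)


def annotation_svg(observations, track_duration, width=800, row_height=20):
    """observations: iterable of (time, duration, value, confidence) tuples."""
    body = "".join(observation_group(t, d, v, c, track_duration, width, row_height)
                   for t, d, v, c in observations)
    return ('<svg xmlns="http://www.w3.org/2000/svg" '
            'viewBox="0 0 {} {}">{}</svg>').format(width, row_height, body)
```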

For namespaces where intervals may overlap (eg, tags), we'll need some way of grouping the observations into non-overlapping subsets with the same value. I think we're just going to have to suck it up and do this logic manually, but it shouldn't be so bad.

For chords, do we want to use chord labels, or convert into a chromagram with bass and root emphasized (say, by alpha and texture)? I would still color-code each chord by name in a consistent way. The grouping idea above works just as well here.

Piano roll ought to be trivial.

Continuous f0 is a sticky one -- i think it would be best encoded as a collection of polyline/paths, where each collection is a contiguous sequence of observations with no breaks in voicing. This breaks the grouping logic I mentioned above, but I'm not sure how else to do it.
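
A sketch of the polyline idea (the unvoiced convention here is an assumption: a non-positive frequency marks an unvoiced frame): break the contour wherever voicing drops out and emit one <polyline> per contiguous voiced run.

```python
import math


def f0_polylines(observations, track_duration, width=800, height=100,
                 fmin=50.0, fmax=2000.0):
    """observations: list of (time, frequency) pairs sorted by time."""
    segments, current = [], []
    for time, freq in observations:
        if freq > 0:                              # voiced frame
            x = width * time / track_duration
            # Log-frequency vertical axis: fmin at the bottom, fmax at the top.
            y = height * (1 - math.log2(freq / fmin) / math.log2(fmax / fmin))
            current.append("{:.1f},{:.1f}".format(x, y))
        elif current:                             # voicing break: close the run
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return "".join('<polyline fill="none" stroke="black" points="{}"/>'.format(
        " ".join(seg)) for seg in segments)
```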

Some questions:

  1. Do we care about rendering axis labels with this stuff? I'd rather not reinvent all of matplotlib (or use it at all).

justinsalamon commented 8 years ago

Do we care about rendering axis labels with this stuff?

I think we do - without a time axis the usefulness of the visualization will be quite limited. Similarly, for (e.g.) piano-roll style viz it would be very useful to know the actual pitch of the notes, which requires ytick labels.

justinsalamon commented 8 years ago

I'd rather not reinvent all of matplotlib (or use it at all).

Actually, why not use matplotlib?

bmcfee commented 8 years ago

Actually, why not use matplotlib?

We could, but it's a pretty big dependency. It would also limit the amount of control we have on the svg coming out, since it adds a rendering abstraction layer in the middle. This would make it harder, I think, to generate svg that could be manipulated in-browser and backed out to jams again.

justinsalamon commented 8 years ago

We could, but it's a pretty big dependency.

I think it's relatively safe to assume anyone using python + jams will be using matplotlib too anyway.

This would make it harder, I think, to generate svg that could be manipulated in-browser and backed out to jams again.

Ah, I wasn't aware in-browser manipulation was on the table. So this would serve not just for visualization, but would in essence be a web-based interface for correcting annotations? Because then you also want to support sonification of the annotation, concurrent sonification of the annotation and an audio file, and probably a bunch of other stuff I haven't thought about.

bmcfee commented 8 years ago

think it's relatively safe to assume anyone using python + jams will be using matplotlib too anyway.

True, but it is bloated and complicated relative to what we need, i think.

So this would serve not just for visualization, but would in essence be a web-based interface for correcting annotations?

Visualizing, first. The idea being that with a little bit of javascript, it would be easy to, say, make observations clickable and automatically seek to the corresponding point in the audio playback widget. Seymour does this, and it's super useful.

Editing is a much more complicated beast, but it will have to happen eventually. I'm not sure that svg is the right tool for the job here, but I suspect you would want something that hooks into the DOM, is queryable, and flexible enough to contain everything we need to make a jams annotation. Canvas is definitely not the answer.

Because then you also want to support sonification of the annotation

We have that already. In a browser context, it would have to be done via remote query, but that seems totally doable.

The alternative would be rewriting sonify in javascript, and that makes me seasick.

concurrent sonification of the annotation and an audio file,

Not sure what you mean by this; mixing with the existing audio track? I have a notebook demo that does this.

urinieto commented 8 years ago

Just my $.02:

ejhumphrey commented 8 years ago

+1

@bmcfee, my memory is junk wrt seymour ... do you have some kind of demo to refresh on? I found the repo but got all googly-eyed

bmcfee commented 8 years ago

@bmcfee, my memory is junk wrt to seymour ... do you have some kind demo (?) to refresh on? I found the repo but got all googley-eyed

Not really -- the server is password protected, and often down for reasons beyond mine (or anyone's) control. If you remind me in person, I can show it to you though.

justinsalamon commented 8 years ago

So, this is a little off-target, but since I needed it for a project, I've hacked up a very ad-hoc pitch_midi annotation viz: https://gist.github.com/justinsalamon/2fa2b23f6ca79e85ef8bca4f98a353d2

Gives you this: [note_viz screenshot]

Not editable, depends on matplotlib, and a whole lot of other evils. But super quick and does the trick.

justinsalamon commented 8 years ago

@bmcfee I guess my question is: is it not worth developing a simple matplotlib-based viz (for every annotation namespace) while grander schemes are being hatched? The more I work with JAMS files the more I need annotation visualization. Pragmatically, viz-only is a PR we can probably manage short-term, whilst an editable interface is probably more involved.

On a related note, would the plotly API (or the javascript library directly) be an interesting option?

bmcfee commented 8 years ago

is it not worth developing a simple matplotlib-based viz (for every annotation namespace) while grander schemes are being hatched?

I'd prefer to minimize redundant work, but we also shouldn't let perfect be the enemy of good... so if you have an idea for how to make this work, go for it.

I haven't looked much at plotly -- I don't want to depend on their API, but plotly.js is MIT-licensed, so that could work. If we do go that route, it might be a good idea to just have everything live as an independent project (jams-viz), since it would invoke a pretty heavy set of dependencies.

justinsalamon commented 8 years ago

but we also shouldn't let perfect be the enemy of good... so if you have an idea for how to make this work, go for it.

Not sure it's a good idea, but it's an idea... basically we'd have a jams.viz module (or some other name), and in its simplest form it could have a single function that takes an annotation and plots it. Since every annotation has a namespace, we could have a separate visualization function that's called under the hood based on the annotation's namespace.

That's basically it. Probably instead of taking a single annotation I'd support taking a list of annotations so that annotations can be plotted on top of each other. I'd probably only support overlapping annotations if they're of the same namespace to keep things simple, so perhaps first thing would be to sort the annotations by namespace, and then all annotations that have the same namespace get plotted together (or separately, could be an optional argument).
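
A sketch of that dispatch-by-namespace idea (the module shape, function names, and the dict-style annotation here are all hypothetical, not the eventual jams API): look the renderer up by namespace and fall back to a generic interval plot.

```python
import matplotlib.pyplot as plt


def plot_intervals(annotation, ax):
    # Generic fallback: one horizontal bar per observation.
    for i, (time, duration, value) in enumerate(annotation["data"]):
        ax.barh(i, duration, left=time)
        ax.text(time, i, str(value), va="center")


def plot_pitch_midi(annotation, ax):
    # Piano-roll style: one horizontal line per note at its MIDI pitch.
    for time, duration, value in annotation["data"]:
        ax.hlines(value, time, time + duration)


_PLOTTERS = {"pitch_midi": plot_pitch_midi}


def display(annotation, ax=None):
    """Pick a renderer based on the annotation's namespace."""
    if ax is None:
        _, ax = plt.subplots()
    plotter = _PLOTTERS.get(annotation["namespace"], plot_intervals)
    plotter(annotation, ax)
    ax.set_xlabel("time (s)")
    return ax
```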

It's another dependency, but it would also be nice to include mpld3, which would make all the plots zoomable/pannable in jupyter.

EDIT: I guess the user can just import mpld3 locally, no need to make it a dependency per-se.

bmcfee commented 8 years ago

Not sure it's a good idea, but it's an idea... basically we'd have a jams.viz module (or some other name), and in its simplest form it could have a single function that takes an annotation and plots it. Since every annotation has a namespace, we could have a separate visualization function that's called under the hood based on the annotation's namespace.

I'd like to avoid adding a heavy dependency (eg matplotlib) if we can avoid it. I'd also like to avoid piecemeal importing of submodules.

Otherwise, yeah, that's exactly how I'd conceived of the viz module working -- pretty much exactly like sonify, or eval for that matter.

It's another dependency, but it would also be nice to include mpld3, which would make all the plots zoomable/pannable in jupyter.

EDIT: I guess the user can just import mpld3 locally, no need to make it a dependency per-se.

You don't even need that if all you want is pan/zoom: just use %matplotlib notebook or %matplotlib nbagg.

The more you know!

justinsalamon commented 8 years ago

I'd like to avoid adding a heavy dependency (eg matplotlib) if we can avoid it. I'd also like to avoid piecemeal importing of submodules.

So what do you suggest? matplotlib seems like the most straightforward option (assuming we want good, not perfect)?

justinsalamon commented 8 years ago

Unless anyone has a better idea for the time being (?), I might take a stab at implementing a very simple matplotlib-based vis module.

bmcfee commented 8 years ago

Just dumping this here for future reference: vega seems promising as a high-level language for making this kind of diagram, and vincent is a python layer on top of that. It doesn't add any dependencies we don't already have (ie, just pandas), so it might be a good option if/when it's stable.

justinsalamon commented 8 years ago

Vincent seems dead?

2015-08-12 Update

Vincent is essentially frozen for development right now, and has been for quite a while. The features for the currently targeted version of Vega (1.4) work fine, but it will not work with Vega 2.x releases. Regarding a rewrite, I'm honestly not sure if it's worth the time and effort at this point.

There is a new project seeking to integrate Vega-related tools with the IPython Notebook: https://github.com/uwdata/ipython-vega-lite

Work is ongoing, and will probably supersede the need for a new version of Vincent.

bmcfee commented 8 years ago

Punting this to 0.3 since display covers most of the preliminary use-cases.

bmcfee commented 7 years ago

#149 implements a custom html renderer for annotation data to recover the previous behavior of JamsFrame.

I think it would be pretty simple to have a generic JObject renderer that uses collapsible list groups to give a compact view of the jams dom, but without any fancy visualization. Maybe I'll take a crack at this in the near future.

bmcfee commented 7 years ago

Looks like the way to go about nested printing is via something like:

<details>
    <summary> top-level summary </summary>

    detailed data dump
</details>
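
A sketch of a recursive renderer built on that pattern (names illustrative, not jams API): nest one collapsible block per field, so the whole jams dom folds down to one summary line per object.

```python
from html import escape


def details_html(obj, summary="object"):
    """Render nested dicts/lists as collapsible <details> blocks."""
    if isinstance(obj, dict):
        body = "".join(details_html(v, summary=str(k)) for k, v in obj.items())
    elif isinstance(obj, list):
        body = "".join(details_html(v, summary="[{}]".format(i))
                       for i, v in enumerate(obj))
    else:
        # Leaf values get dumped verbatim under the summary line.
        body = "<pre>{}</pre>".format(escape(repr(obj)))
    return "<details><summary>{}</summary>{}</details>".format(
        escape(str(summary)), body)
```
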
bmcfee commented 7 years ago

I'd like to throw on a reimplementation of JObject.__repr__ to this issue as well. As mentioned elsewhere, it currently provides no information about the object beyond its type.
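
A sketch of what a more informative __repr__ could look like (illustrative only, not the final jams implementation): keep the type, and add a short summary of the object's top-level fields.

```python
class JObjectSketch(object):
    """Toy stand-in for JObject, just to show the repr shape."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        # List the field names so the repr says more than just the type.
        fields = ", ".join(sorted(self.__dict__))
        return "<{}: {}>".format(type(self).__name__, fields)


print(JObjectSketch(namespace="pitch_midi", data=[], sandbox={}))
# <JObjectSketch: data, namespace, sandbox>
```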

bmcfee commented 7 years ago

Fixed by merging #158