SimonDanisch opened this issue 8 years ago
Thanks for this @SimonDanisch. This is heavily related to what I'm trying to do in Plots, which is creating compressed, equivalent definitions of generic visualization parameters. As an example, consider the following plot: (note: I had to make a couple of fixes to get this to work... fixes on dev branch... always good to find bugs :)
```julia
using Plots; pyplot(size=(500,200))
scatter(1:10, [0], m=([5,20], [:green,:red,RGBA(0,1,1,0.3)]))
```
The gist is that each element is stored in a compressed, generalized representation of the final visualization attributes, and only expanded if/when necessary for a specific backend (some backends can handle inputs in compressed form as well). The `y` value is a 1-element vector, even though it represents a 10-element vector with all values the same. The `m` arg gets expanded so that the first vector is mapped to the `markersize` arg and the second vector is mapped to a `Plots.ColorVector` applied to the `markercolor` arg. Both represent 10-element vectors, but can be input/stored in compressed form.
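A minimal sketch of that expansion step (`cycle_expand` is not a Plots function, just an illustration of the idea): compressed attribute vectors get cycled out to the full series length only when a backend needs them.

```julia
# Illustrative only: expand a compressed attribute vector to n elements by
# cycling. Plots' real expansion logic is more involved than this.
cycle_expand(compressed::AbstractVector, n::Integer) =
    [compressed[mod1(i, length(compressed))] for i in 1:n]

cycle_expand([5, 20], 10)  # -> [5,20,5,20,...], like the markersize arg above
cycle_expand([0], 10)      # -> ten zeros, like the compressed y above
```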
@SimonDanisch I hope that as you develop these concepts you can keep me in the loop as to your intentions, so we can have a solution that allows interop between whatever you need and the Plots representations. (Also selfishly because I really want to be able to generate GLVisualize stuff using Plots!)
@SimonDanisch This sounds great! And I'll be happy to help out regarding implementing WebGL support to GLVisualize. Do let me know what you need by raising issues. :smile:
Great, thanks a lot @rohitvarkey.
Here are some more thoughts:
The old struggle over how to lay out your vertex array will become a lot simpler: we can just use the format the user supplies, and if we know that some layout is faster with some backend, we can transform the data and/or document this for the user.
E.g. there is no clear standard on how to represent meshes:

```julia
immutable Vertex
    position::Vec3f0
    normal::Vec3f0
    color::RGBA{Float32}
end
vertices = Array{Vertex}  # array-of-structs layout
```
is often seen, but this layout is also common:

```julia
immutable Vertex
    position::Vector{Vec3f0}
    normal::Vector{Vec3f0}
    color::Vector{RGBA{Float32}}
end
```
We can combine this nicely with our `decompose` API, which helps you get the types and memory layout you need for your backend. (`decompose` should probably get renamed to `collect`.)
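For context, a rough usage sketch of the existing `decompose` API from GeometryTypes (exact type parameters vary between versions): you ask for the element type your backend needs and the data gets converted for you.

```julia
using GeometryTypes

cube = HyperRectangle(Vec3f0(0), Vec3f0(1))
positions = decompose(Point3f0, cube)  # corner positions as Point3f0
# similarly: decompose(Face{3, Cuint, -1}, mesh) for 0-indexed triangle faces
```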
Also, we could make views and index lists a standard tool in the geometry representation. OpenGL and especially Vulkan have very good support for views and index lists, so it'd be nice to cleanly represent them in the geometry layer already :)
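A minimal sketch (not an existing API) of what making index lists explicit in the geometry layer could look like: vertices are stored once and primitives only reference them by index.

```julia
using GeometryTypes  # for Vec3f0

immutable IndexedGeometry{V, I}
    vertices::Vector{V}  # shared vertex data
    indices::Vector{I}   # primitives as index tuples into `vertices`
end

# A quad as two triangles sharing four vertices:
verts = [Vec3f0(0,0,0), Vec3f0(1,0,0), Vec3f0(1,1,0), Vec3f0(0,1,0)]
quad  = IndexedGeometry(verts, [(1,2,3), (1,3,4)])
```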
cc: @yeesian
Personally I like the structure of arrays style more since it seems to often perform better.
That's exactly the idea here... The mesh examples are a bit confusing, since they don't actually use StructsOfArrays. They should actually look like this:

```julia
SOA(Vertex, rand(Vec3f0, 10), rand(Vec3f0, 10), rand(RGBA{Float32}, 10))
```

vs.

```julia
SOA(Vertex, [Vertex(...) for i=1:10])
```

Because of the SOA wrapper they look the same to the algorithms, but users or backends can settle on what they prefer in terms of speed or usability. We can also offer documentation for each backend and algorithm, laying out what is expected to be faster... From my experience, there is not always a clear winner in terms of performance ;)
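A hand-written sketch of why both layouts can look identical to algorithms (the real StructsOfArrays package generates this via metaprogramming; the element type is the `Vertex` from above):

```julia
immutable VertexSOA
    position::Vector{Vec3f0}
    normal::Vector{Vec3f0}
    color::Vector{RGBA{Float32}}
end

# Indexing materializes a Vertex, so code written against vertices[i]::Vertex
# runs unchanged on Vector{Vertex} and VertexSOA alike.
Base.getindex(s::VertexSOA, i::Integer) =
    Vertex(s.position[i], s.normal[i], s.color[i])
Base.length(s::VertexSOA) = length(s.position)
```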
> Personally I like the structure of arrays style more since it seems to often perform better.
StructOfArrays could be great for performance! The animation problem troubles me too, but what you describe is already a problem, right? This should be a strict improvement. We still have to address the problem of whether e.g. `Circle` should be defined in a central package, and how much parameterization such a thing should have - this was the thing we discussed but failed to come to common ground on at JuliaCon, but it might be time to revisit this.
Ah yeah, that's what I wanted to address as well.
The example for why it would be convenient to have different units per vector was the one of plotting data with space vs. time. I wasn't really sure at the time how to think about it...
Nowadays, I don't think that it is a viable use case. `Point{SpaceX, TimeY}` would be part of two different spaces. Also, the coordinates usually come from different columns, so it's not like we already have `Point{SpaceX, TimeY}`.
What makes more sense to me is to define a conversion from time to space, and then say that the time axis maps to space like this: `time2space(t) = mm((t*10)+2)`. Then we get vectors of space only ;)
You wouldn't really like to have `Point{Liter, Natrium/gram}` just because your data plots the amount of natrium per liter of water, would you?
For drawing, you really just care about how this maps to mm on your canvas in the end. You could also just reinterpret your data to `mm` and then define a transformation matrix doing the scaling/offsetting for you, without copying the data at all. You could easily change the mapping interactively in that case, even for big data sets :)
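A sketch of both approaches, assuming a `mm` unit constructor and a raw `times` column (neither is defined here):

```julia
# (1) Convert the time column to space once:
time2space(t) = mm(t * 10 + 2)
xs = map(time2space, times)

# (2) Or leave the data untouched and hand the backend an affine transform
# (scale x by 10, offset by 2); remapping then never copies the data:
transform = [10.0 0 0 2;
              0.0 1 0 0;
              0.0 0 1 0;
              0.0 0 0 1]
```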
For real time data, we should use signals, I dare say :)
> The animation problem troubles me too, but what you describe is already a problem, right?
Yes, but now it's harder for me to cheat around this... which is sort of a good thing, because it wasn't really a clean solution before... But since I don't have a new clean solution, this might as well be the main blocking issue...
@SimonDanisch
> You wouldn't really like to have Point{Liter, Natrium/gram} just because your data plots the amount of natrium per liter of water, would you?
No, I would not like to be required to do this. Yes, I would very much like to be able to do this in some situations: `Point{ TownshipGeocode, Income/household }`, just because my data plots the township-relative wealth per household and comes from a relational database holding disaggregated data.
@SimonDanisch For your consideration, an additional capability that would be helpful to design into low-level animation support: when animating data to visualize how things change with/through time, the ability to run the animation using differently 'warped' time progressions can be informative (the time axis is differentially dilated, given through e.g. a splined curve or nonuniform frame timestamps sampled from it). It is akin to zooming in on an area of interest in an image.
> using differently 'warped' time progressions can be informative
That should map down to the ability to animate the parameters of time, which should be easily feasible.
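For instance, with Reactive this could be a plain remapping of the time signal before it reaches the scene (`time_s` and the easing curve are assumptions, not an existing API):

```julia
using Reactive

ease(t) = t^2 / (t^2 + (1 - t)^2)  # slow-fast-slow warp for t in [0, 1]
warped_time = map(ease, time_s)    # drive the animation with this signal instead
```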
> Yes, I would very much like to be able to do this in some situations: Point{ TownshipGeocode, Income/household },
My biggest concern with this is that it breaks a lot of assumptions (e.g. what `eltype` and `promote` mean), while there are totally legitimate workarounds. You already say they might come from different stores, so there is no immediate need to combine them into a point.
Also, it's not the case that any generic library can do clever things with random combinations of point element types. And if there is a generic library that explicitly utilizes the type information from `Point{ TownshipGeocode, Income/household }`, you can still do this:
```julia
immutable TIPoint{T} <: FixedVector{2, T}
    data::Tuple{TownshipGeocode{T}, Income{T}}
end
```
Or another approach could be this:
```julia
convert(::Type{cm}, x::TownshipGeocode) = cm(x * 2.1)
convert(::Type{cm}, x::Income) = cm(x * 3.712)
```
With this, plotting libraries should be able to automatically convert `Tuple{TownshipGeocode{T}, Income{T}}` to `Point{2, cm}`...
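Hypothetical usage of that convert-based approach (none of these constructors exist; this just shows the intended flow from data tuple to canvas units):

```julia
raw = (TownshipGeocode(4021), Income(52000.0))
pt  = Point{2, cm}(convert(cm, raw[1]), convert(cm, raw[2]))  # pure canvas units
```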
I think this will keep things sane. The type diversity is already immense as it is!
I've been thinking about some kind of low-level intermediate representation for geometry lately, to have a good way to switch out frontends and backends without much hassle. This is pretty much what I've come up with after working with Compose, GLVisualize and FireRender. They all differ hugely, but it seemed to me that they could all support the same intermediate representation.
This representation needs a great deal of flexibility, since we don't want to force any frontend/backend to miss out on compression/batching or force them to use one specific memory layout. To not go crazy over the resulting diversity, I'd like to propose the following:
I want to use something like an extended version of @simonster's StructsOfArrays library (with added support for iterators, scalars and different memory layouts) to allow for the kind of compression and memory-layout independence that GLVisualize and Compose already offer in some restricted form. A few examples:
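A sketch of the kind of compression this could mean, with a hypothetical extended `SOA` constructor that accepts scalars alongside vectors:

```julia
circles = SOA(Circle,
    rand(Point2f0, 1000),        # per-element: one center per circle
    5f0,                         # scalar: every circle has radius 5
    RGBA{Float32}(1, 0, 0, 1))   # scalar: every circle is red

# Backends that understand the compressed form consume it directly; the
# others expand scalars to full vectors on demand.
```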
With clever conversion and defaults for unsupported attributes, this can represent everything we might want to visualize. There will be annoying corner cases like pointed lines, where it's not really clear if the points are a geometry attribute or part of the shader. But I'm pretty sure that we'll be able to solve this in a coherent way.
Benefits
All the algorithms can work on this representation, making it way easier to reuse them in all the different packages. E.g. the implementation of a bounding-box algorithm could look as simple as this:
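A hedged sketch, assuming the representation exposes a `position` field and the point type supports elementwise `min`/`max`:

```julia
using GeometryTypes

function boundingbox(geometry)
    positions = geometry.position        # same access for AOS and SOA layouts
    mini = reduce(min, positions)
    maxi = reduce(max, positions)
    HyperRectangle(mini, maxi - mini)    # origin and widths
end
```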
In theory, this should also work on the GPU. So whenever we get better GPU support e.g. being able to compile Julia to SPIR-V, the data structures and algorithms should map nicely to the GPU.
It will be easier to display graphics created with e.g. `Gadfly`, `Compose`, or `GLVisualize` in any backend like `OpenGL`, `Cairo`, `Skia`, `SVG`, `FireRender`, `WebGL`, or `PDF`, as long as they support this intermediate representation. I've already implemented quite a bit for `OpenGL` and `FireRender`, and it shouldn't take too much time to add `WebGL` via `Threejs.jl` (if the package is in a good state).

Challenges
I'm not sure how to incorporate animations. With Reactive there are two ways to do so, both of which I find unsatisfactory.
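Roughly, the two options could look like this (hypothetical `SOA` constructor, Reactive signals, placeholder data):

```julia
using Reactive, GeometryTypes

positions = rand(Point2f0, 1000)
radii     = rand(Float32, 1000)
colors    = rand(RGBA{Float32}, 1000)

# Option 1: one Signal carrying the whole compressed geometry. Algorithms
# lift over it with map, but every update replaces the full value.
scene1 = Signal(SOA(Circle, positions, radii, colors))
bb1    = map(boundingbox, scene1)

# Option 2: a SOA whose fields are Signals. Partial updates are natural,
# but plain algorithms can't run on the whole thing without unwrapping.
scene2 = SOA(Circle, Signal(positions), Signal(radii), Signal(colors))
```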
Option 1 is way cleaner, as you can easily apply algorithms to it via `map`. Option 2 doesn't really allow operating on the resulting `SOA` without some hackery. But I'm not really sure how to make option 1 fast on the backend side in all cases. If you let the signal get bigger and bigger, it's not entirely clear how to detect what has changed anymore, which means it's more difficult to update only part of the scene (an important optimization). @shashi gets around this by diffing, but for big data sets this still introduces a non-trivial performance penalty. Since this is supposed to be a low-level representation, it would be nice to not have such penalties early on.

Also, this must be implemented ;) I have working prototypes for most of this, so I'd be willing to start this off! Maybe by adding WebGL support to GLVisualize.
Please criticize freely :)
Best, Simon
CC: @sjkelly, @KristofferC, @rohitvarkey, @tbreloff, @viralbshah, @timholy, @dcjones, @vchuravy