JuliaGeometry / GeometryTypes.jl

Geometry types for Julia

Geometry representation #46

Open SimonDanisch opened 8 years ago

SimonDanisch commented 8 years ago

I've been thinking about some kind of low level/ intermediate representation for geometry lately to have a good way to switch out frontends and backends without much hassle. This is pretty much what I've come up with after working with Compose, GLVisualize and FireRender. They all differ hugely, but it seemed to me that they could all support the same intermediate representation.

This representation needs a great deal of flexibility, since we don't want to force any frontend/backend to miss out on compression/batching or force them to use one specific memory layout. To not go crazy over the resulting diversity, I'd like to propose the following:

I want to use something like an extended version of @simonster's StructsOfArrays library (with added support for iterators, scalars and different memory layouts) to allow for the kind of compression and memory layout independence that GLVisualize and Compose already offer in some restricted form.

A few examples:

SOA = StructsOfArrays
# A thousand circles spread evenly along the x axis with minimal memory usage:
xposition = 1:1000
yposition = 0
radius = 1
# this might remind you of Compose's circle([0.25, 0.5, 0.75], [0.25, 0.5, 0.75], [0.1])
circles = SOA(Circle, xposition, yposition, radius)
circles[100] == Circle(Point2f0(100, 0), 1)

# Now a different memory layout (Maybe we got it from some sensor and we don't
# want to copy millions of circles just because of different memory layout, do we?)
xypositions = rand(Point2f0, 10^6)
circles = SOA(Circle, xypositions, radius)

# this can be extended to all kinds of objects
x, y = linspace(0, 10, 100), rand(100)
lines = SOA(Line{Point}, x, y)

# or a simple representation of particles
immutable Instance{P}
    primitive::P
    translation::Vec3f0
    scale::Vec3f0
    rotation::Quaternion
end

cube = Cube(Vec3f0(0),1f0) #unit cube
# Cubes on random positions
SOA(Instance, cube, rand(Vec3f0, 1000), Vec3f0(0.01), Quaternion(1,0,0,0))
x,y,z = rand(Float32, 10), rand(Float32, 10), rand(Float32, 10)
# the same, but different memory layout
SOA(Instance, cube, x,y,z, Vec3f0(1), Quaternion(1,0,0,0))
grid = Grid(linspace(0,1,100), linspace(0,1,100))
#100*100 particles with varying z scale on a grid
SOA(Instance, grid, 0,0, rand(Float32, 100, 100), Quaternion(1,0,0,0))

#GLVisualize's particle system actually partly supports this kind of batching and memory compression

#The same can be applied to materials:
immutable Material{S, C, Refl, Refr}
    shader::S
    color::C
    reflectance::Refl
    refraction::Refr
end
SOA(Material, Phong, rand(RGBA, 10), RGB(1,1,1), 1.5)

# and finally something like this:
immutable Drawable{G, M, T}
    geometry::G
    material::M
    translation::T
end
SOA(Drawable, ....)
# With this, we could also better separate mesh geometry from the material properties.

# note that we could represent even odd geometries like volumes, and units like mm:
geometry = Cube(Vec{3, mm}(0), Vec{3, mm}(1.75, 1.2, 2.7) )
material = VolumeMaterial(image3D, absorption)
Drawable(geometry, material)

With clever conversion and defaults for unsupported attributes, this can represent everything we might want to visualize. There will be annoying corner cases like pointed lines, where it's not really clear if the points are a geometry attribute or part of the shader. But I'm pretty sure that we'll be able to solve this in a coherent way.

Benefits

All the algorithms can work on this representation, making it way easier to reuse them across the different packages. E.g. the implementation of a bounding box algorithm could be as simple as this:

mapreduce(BoundingBox, union, some_soa) # obviously, BoundingBox has to be implemented for the primitive
# this could work with any of the geometry created in the examples above
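To make the reuse argument concrete, here is a minimal self-contained sketch of how that mapreduce pattern computes a bounding box over the first circle example. The `Circ`/`BBox` types and the `unite` method are hypothetical stand-ins, written in the same era of Julia syntax as the rest of this thread:

```julia
# Hypothetical sketch: axis-aligned bounding box via mapreduce.
immutable Circ
    x::Float64; y::Float64; r::Float64
end
immutable BBox
    xmin::Float64; ymin::Float64; xmax::Float64; ymax::Float64
end
# bounding box of a single circle
BBox(c::Circ) = BBox(c.x - c.r, c.y - c.r, c.x + c.r, c.y + c.r)
# union of two boxes
unite(a::BBox, b::BBox) =
    BBox(min(a.xmin, b.xmin), min(a.ymin, b.ymin),
         max(a.xmax, b.xmax), max(a.ymax, b.ymax))

# a thousand unit circles spread along the x axis, as in the first example
circles = [Circ(x, 0.0, 1.0) for x in 1:1000]
bb = mapreduce(BBox, unite, circles)
# bb == BBox(0.0, -1.0, 1001.0, 1.0)
```

The same two methods work unchanged whether `circles` is a plain `Vector` or an SOA view, which is exactly the point of a shared representation.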

In theory, this should also work on the GPU. So whenever we get better GPU support e.g. being able to compile Julia to SPIR-V, the data structures and algorithms should map nicely to the GPU.

It will be easier to display graphics created with e.g. Gadfly, Compose, GLVisualize in any backend like OpenGL, Cairo, Skia, SVG, FireRender, WebGL, PDF, as long as they support this intermediate representation. I've already implemented quite a bit for OpenGL and FireRender and it shouldn't take too much time to add WebGL via Threejs.jl (if the package is in a good state).

Challenges

I'm not sure how to incorporate animations. With Reactive there are two ways to do so, both of which I find unsatisfactory.

scale_signal = Signal(rand(Vec3f0, 10^6))
width_signal = Signal(rand(Vec3f0, 10^6))

option1 = map(SOA, Cube, scale_signal, width_signal)
-> Signal{SOA{Cube, Vec3f0, Vec3f0}}

option2 = SOA(Cube, scale_signal, width_signal)
-> SOA{Cube, Signal{Vec3f0}, Signal{Vec3f0}}

Option1 is way cleaner, as you can easily apply algorithms to it via map. Option2 doesn't really allow operating on the resulting SOA without some hackery. But I'm not really sure how to make option1 fast on the backend side in all cases. If the signal gets bigger and bigger, it's not entirely clear how to detect what has changed, which makes it more difficult to update only part of the scene (an important optimization). @shashi gets around this by diffing, but for big data sets this still introduces a non-trivial performance penalty. Since this is supposed to be a low-level representation, it would be nice to avoid such penalties early on.

Also, this must be implemented ;) I have working prototypes for most of this, so I'd be willing to start this off! Maybe by adding WebGL support to GLVisualize.

Please criticize freely :)

Best, Simon

CC: @sjkelly, @KristofferC, @rohitvarkey, @tbreloff, @viralbshah, @timholy, @dcjones, @vchuravy

tbreloff commented 8 years ago

Thanks for this @SimonDanisch. This is heavily related to what I try to do in Plots: creating compressed, equivalent definitions of generic visualization parameters. As an example, consider the following plot: (note: I had to make a couple fixes to get this to work... fixes on dev branch... always good to find bugs :)

using Plots; pyplot(size=(500,200))
scatter(1:10, [0], m=([5,20], [:green,:red,RGBA(0,1,1,0.3)]))

[screenshot of the resulting scatter plot]

The gist is that each element is stored in a compressed, generalized representation of the final visualization attributes, and only expanded if/when necessary for a specific backend (some backends can handle inputs in compressed form as well). The y value is a 1-element vector, even though it represents a 10-element vector with all values the same. The m arg gets expanded so that the first vector maps to the markersize arg and the second vector becomes a Plots.ColorVector applied to the markercolor arg. Both represent 10-element vectors, but can be input/stored in compressed form.
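The expansion described above can be sketched as a simple cycling rule. This is a hypothetical helper for illustration, not the actual Plots internals:

```julia
# Hypothetical sketch: a length-m attribute vector stands in for a length-n
# one by repeating cyclically; scalars broadcast to all n elements.
expand(attr::AbstractVector, n::Integer) = [attr[mod1(i, length(attr))] for i in 1:n]
expand(attr, n::Integer) = fill(attr, n)

expand([5, 20], 10)  # markersize: [5, 20, 5, 20, 5, 20, 5, 20, 5, 20]
expand([0], 10)      # the 1-element y vector stands for ten zeros
```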

@SimonDanisch I hope that as you are developing these concepts, that you can keep me in the loop as to your intentions, and we can have a solution that allows interop between whatever you need and the Plots representations. (also selfishly because I really want to be able to generate GLVisualize stuff using Plots!)

rohitvarkey commented 8 years ago

@SimonDanisch This sounds great! And I'll be happy to help out regarding implementing WebGL support to GLVisualize. Do let me know what you need by raising issues. :smile:

SimonDanisch commented 8 years ago

Great, thanks a lot @rohitvarkey.

Here are some more thoughts:

The old struggle of how to lay out your vertex array will become a lot simpler. We can just use the format the user supplies, and if we know that some layout is faster with some backend, we can transform the data and/or document this for the user.

E.g. there is no clear standard on how to represent meshes:

immutable Vertex
    position::Vec3f0
    normal::Vec3f0
    color::RGBA{Float32}
end
vertices = Vector{Vertex}

is often seen, but so is this layout:

immutable Vertex
    position::Vector{Vec3f0}
    normal::Vector{Vec3f0}
    color::Vector{RGBA{Float32}}
end

We can combine this nicely with our decompose API, which helps you get the types and memory layout you need for your backend (decompose should probably get renamed to collect).
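A hedged sketch of what such a decompose/collect step could look like, with one method per storage layout. All names are hypothetical, and `NTuple{3,Float32}` stands in for `Vec3f0` to keep the sketch self-contained:

```julia
typealias V3 NTuple{3,Float32}  # stand-in for Vec3f0

# array-of-structs layout
immutable Vertex
    position::V3
    normal::V3
end
# struct-of-arrays layout
immutable VertexVectors
    position::Vector{V3}
    normal::Vector{V3}
end

# Same question ("give me the positions"), answered per layout: the backend
# asks for what it needs without caring how the mesh happens to be stored.
positions(vs::Vector{Vertex}) = [v.position for v in vs]  # gathers (copies)
positions(vs::VertexVectors)  = vs.position               # already contiguous (no copy)
```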

Also, we could make views and index lists a standard tool in the geometry representation. OpenGL and especially Vulkan have very good support for views and index lists, so it'd be nice to cleanly represent them in the geometry layer already :)

cc: @yeesian

KristofferC commented 8 years ago

Personally I like the structure of arrays style more since it seems to often perform better.

SimonDanisch commented 8 years ago

That's exactly the idea here... The mesh examples are a bit confusing, since they don't actually use StructsOfArrays... They should actually look like this:

SOA(Vertex, rand(Vec3f0, 10), rand(Vec3f0, 10), rand(RGBA{Float32}, 10))

Vs

SOA(Vertex, [Vertex(...) for i=1:10])

Because of the SOA they look the same to the algorithms, but users or backends can settle on what they prefer in terms of speed or usability. We can also offer documentation for each backend and algorithm, laying out what is expected to be faster... From my experience, there is not always a clear winner in terms of performance ;)
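A stripped-down illustration of why the two spellings look identical to algorithms: a wrapper whose getindex materializes the struct on the fly. This is the idea behind StructsOfArrays, not its actual implementation:

```julia
# Hypothetical toy SOA: stores one vector per field, builds T on indexing.
immutable ToySOA{T, C}
    columns::C  # tuple of per-field vectors
end
ToySOA{T}(::Type{T}, cols...) = ToySOA{T, typeof(cols)}(cols)
Base.getindex{T}(s::ToySOA{T}, i) = T(map(c -> c[i], s.columns)...)
Base.length(s::ToySOA) = length(s.columns[1])

immutable P; x::Float64; y::Float64; end
soa = ToySOA(P, [1.0, 2.0], [3.0, 4.0])
soa[2]  # == P(2.0, 4.0), same result as indexing a Vector{P}
```

Algorithms that only ever do `a[i]` and `length(a)` cannot tell the two layouts apart, which is what lets users pick whichever is faster for their backend.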


shashi commented 8 years ago

StructsOfArrays could be great for performance! The animation problem troubles me too, but what you describe is already a problem, right? This should be a strict improvement. We still have to address the problem of whether e.g. Circle should be defined in a central package, and how much parameterization such a thing should have - this was the thing we discussed but failed to come to common ground on at JuliaCon; it might be time to revisit it.

SimonDanisch commented 8 years ago

Ah yeah, that's what I wanted to address as well. The example for why it would be convenient to have different units per vector was the one of plotting data with space vs time. I wasn't really sure at the time how to think about it... Nowadays, I don't think it is a viable use case. Point{SpaceX, TimeY} would be part of two different spaces. Also, the values usually come from different columns, so it's not like we already have a Point{SpaceX, TimeY} lying around. What makes more sense to me is to define a conversion from time to space, and then say the time axis maps to space like this: time2space(t) = mm((t*10)+2). Then we get vectors of space only ;) You wouldn't really like to have Point{Liter, Natrium/gram} just because your data plots the amount of natrium per liter of water, would you? For drawing, you really just care about how this maps to mm on your canvas in the end.

You could also just reinterpret your data to mm and then define a transformation matrix doing the scaling/offsetting for you, without copying the data at all. That way you could easily change the mapping interactively, even for big data sets :) For real-time data, we should use signals, I dare say :)
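A small sketch of that conversion idea, with plain Float64 millimetres standing in for a real unit type (the `time2space` mapping is the one from the paragraph above):

```julia
# Hypothetical sketch: map the time axis into canvas space once, then plot
# ordinary spatial vectors. mm is modeled as a plain Float64 here.
time2space(t) = t * 10 + 2      # mm of canvas per unit of time, plus an offset
ts = linspace(0, 10, 100)       # time samples
xs = map(time2space, ts)        # now a vector of mm; no mixed-unit Point needed
```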

SimonDanisch commented 8 years ago

The animation problem troubles me too, but what you describe is already a problem, right?

Yes, but now it's harder for me to cheat around this... Which is sort of a good thing, because it wasn't really a clean solution before... But since I don't have a new clean solution, this might as well be the main blocking issue...

Jeffrey-Sarnoff commented 8 years ago

@SimonDanisch

You wouldn't really like to have Point{Liter, Natrium/gram} just because your data plots the amount of natrium per liter of water, would you?

No, I would not like to be required to do this. Yes, I would very much like to be able to do it in some situations: Point{ TownshipGeocode, Income/household }, just because my data plots township-relative wealth per household and comes from a relational database holding disaggregated data.

Jeffrey-Sarnoff commented 8 years ago

@SimonDanisch For your consideration, an additional capability that would be helpful to design into low-level animation support: when animating data to visualize how things change with/through time, the ability to run the animation using differently 'warped' time progressions can be informative (the time axis is differentially dilated, as given by e.g. a splined curve or nonuniform frame timestamps sampled from it). It is akin to zooming in on an area of interest in an image.
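One way to model this: a monotone warp of the animation clock. The cubic below is just an illustrative curve, not a proposal for a specific API:

```julia
# Hypothetical sketch: a monotone time warp that dwells near t = 0.5, so more
# frames are spent on that region of interest (akin to zooming in).
warp(t) = 0.5 + 4 * (t - 0.5)^3                     # maps [0, 1] onto [0, 1]
timestamps = [warp(t) for t in linspace(0, 1, 60)]  # nonuniform frame timestamps
```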

SimonDanisch commented 8 years ago

using differently 'warped' time progressions can be informative

That should map down to the ability to animate the parameters of time, which should be easily feasible.

Yes, I would very much like to be able to do this in some situations: Point{ TownshipGeocode, Income/household },

My biggest concern with this is that it breaks a lot of assumptions (e.g. what is eltype, how to promote), while there are perfectly legitimate workarounds. You already say they might come from different stores, so there is no immediate need to combine them into a point. Also, it's not the case that any generic library can do clever things with random combinations of point element types. And if there is a generic library that explicitly utilizes the type information from Point{ TownshipGeocode, Income/household }, you can still do this:

immutable TIPoint{N, T} <: FixedVector{N,T}
    data::Tuple{TownshipGeocode{T}, Income{T}}
end

Or another approach could be this:

convert(::Type{cm}, x::TownshipGeocode) = cm(x*2.1)
convert(::Type{cm}, x::Income) = cm(x*3.712)

With this, plotting libraries should be able to automatically convert Tuple{TownshipGeocode{T}, Income{T}} to Point{2, cm}....
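A self-contained sketch of that automatic conversion. All types here are hypothetical and modeled as plain Float64 wrappers so the example stands alone:

```julia
# Hypothetical sketch: once each quantity knows its conversion to a canvas
# unit, mixed tuples collapse to uniform canvas coordinates with one map call.
immutable cm;              val::Float64; end
immutable TownshipGeocode; val::Float64; end
immutable Income;          val::Float64; end
Base.convert(::Type{cm}, x::TownshipGeocode) = cm(x.val * 2.1)
Base.convert(::Type{cm}, x::Income)          = cm(x.val * 3.712)

to_canvas(p) = map(x -> convert(cm, x), p)
to_canvas((TownshipGeocode(1.0), Income(2.0)))  # (cm(2.1), cm(7.424))
```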

I think this will keep things sane. The type diversity is already immense as it is!