PistonDevelopers / graphics

A library for 2D graphics, written in Rust, that works with multiple back-ends
MIT License

Switch to user space coordinates by default [BREAKING CHANGE] #1108

Closed bvssvni closed 5 years ago

bvssvni commented 5 years ago

Currently, the graphics backends use frame buffer coordinates by default, which confuses users since graphics do not scale properly on HiDPI screen resolutions.

See https://github.com/PistonDevelopers/glutin_window/issues/145

By switching to user space coordinates by default, this confusion might be avoided.

0e4ef622 commented 5 years ago

Should Window::size() be made to return a floating point value instead of an integer, since one point might not equal a whole number of physical pixels, as is the case with my 144 DPI display?

bvssvni commented 5 years ago

One point should be the ratio, so as long as the window size divided by the frame buffer size is a tuple of rational numbers, there is no need for floating point.

So, if the window size is a tuple of whole numbers, then it's by definition in a rational ratio with the frame buffer size. In principle, this means that a floating point window size is not needed.

0e4ef622 commented 5 years ago

I'm not sure I fully understand your reasoning. If there are 3 pixels for every 2 points, wouldn't a window size of 11x11 pixels require floating point to accurately represent the size of the window in points? Glutin reports window sizes in floating point, and getting glutin to report a non-integer window size is trivial when resizing.
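
A small sketch of that situation (illustrative function, not piston's or glutin's actual API): with 3 physical pixels for every 2 points (a hidpi factor of 1.5), an 11x11-pixel window is 7.333... points wide, so an integer size can't represent it exactly.

```rust
// Convert a physical pixel size to logical points given a hidpi factor.
fn physical_to_logical(size_px: (u32, u32), hidpi_factor: f64) -> (f64, f64) {
    (size_px.0 as f64 / hidpi_factor, size_px.1 as f64 / hidpi_factor)
}

fn main() {
    let (w, h) = physical_to_logical((11, 11), 1.5);
    println!("{} x {} points", w, h); // 7.333333333333333 x 7.333333333333333
}
```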

bvssvni commented 5 years ago

Ah, yes. I'm wrong.

bvssvni commented 5 years ago

Hmm... so 11*3/2 is the framebuffer size? Or do you mean the visible area of the framebuffer?

I think you meant that if e.g. the framebuffer is 8x8, then the window size is 8*2/3, right? It means the window manager has to set the framebuffer to 9x9 to avoid artifacts. It can't use partial pixels.

However, figuring out the right ratio from the window and buffer sizes might be tricky.

0e4ef622 commented 5 years ago

My understanding is the window manager only works in physical pixels, and the framebuffer is also in physical pixels, so there shouldn't be any issue there. The issue is converting from physical pixels to points when piston reports the window size.

bvssvni commented 5 years ago

Ah. OK. I've opened https://github.com/PistonDevelopers/piston/issues/1254

dobkeratops commented 5 years ago

I don't have any involvement with or use of this project, but I'll just mention as a passer-by...

If you need 64 bits for coordinates in a 2D UI.. you're probably doing it wrong.

It should be possible, by virtue of the way you traverse nested layout structures and organize when and where rescalings are done, to reason about screen coordinates with 32 bits.

People used to make space games where you could fly between planets using just 16-bit arithmetic, through smart management of nested coordinate systems.. it should be possible to construct a UI framework where coordinates are relative to parent frames - you start with 32-bit screen pixels, then zoom into elements where fewer bits are needed.

Of course this would take more thought/code to organise; your argument might be that f64 allows a simpler system.

I guess it's down to you what challenge you are pursuing. But computers don't just get more powerful.. they also get smaller.. (reducing energy use is the biggest issue in our age). Part of Rust's potential is for IoT, which means running on small embedded devices. Progress in AI (very different to UI frameworks, I will admit :) ) has driven aggressive bit-reduction (great benefit from figuring out where you can drop from 32 bits to 16 bits or even 8).

I guess it just raises an alarm bell when I see this being talked about... ultimately you know what scenarios you are targeting better than me.

bvssvni commented 5 years ago

@dobkeratops f64 has better whole-number precision than f32; it supports the whole range of u32, that's why. Whole-number precision is important in this use case, since the old API uses u32.
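
A minimal illustration of that whole-number difference (the specific value is just an example):

```rust
// f64 has a 53-bit mantissa, so every u32 round-trips exactly; f32 has only a
// 24-bit mantissa, so whole numbers above 2^24 start losing precision.
fn main() {
    let x: u32 = 16_777_217; // 2^24 + 1, the first integer f32 cannot represent
    assert_eq!(x as f64 as u32, x); // exact round-trip through f64
    assert_ne!(x as f32 as u32, x); // rounds to 16_777_216 through f32
}
```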

In the general case, piston-graphics uses f32 under the hood, but only after the coordinate transformation. Before the coordinate transformation, f64 is used infrequently in memory, because it's at the abstract level. CPUs are optimized for f64, so using f32 there might reduce performance.

I think piston-graphics makes the optimal trade-off here. Embedded programming is different, but I don't think piston-graphics should focus on that. If CPUs got support for projective reals, then you could drop the size of the data type somewhat, maybe even to 16 bits, while keeping reasonable accuracy, but this won't happen anytime soon. Maybe embedded hardware should try to exploit the fact that projective reals have higher accuracy at lower bit widths?

When talking about what happens in the GPU - that's a different story.

bvssvni commented 5 years ago

@dobkeratops I meant projective reals with interval arithmetic. There has been some interesting research on this the past few years.

dobkeratops commented 5 years ago

"CPUs are optimized for f64, so using f32 might reduce the performance." heh. a cpu in a graphics oriented machine shouldn't be..

You're certainly right that converting u32 -> f32 is lossy (perhaps what's needed is a u24 type to account for the integer part), but I've never worried about that; if the lion's share of the arithmetic is going to be in f32, then you work back from that re: usable ranges.

"CPUs are optimized for f64, so using f32 might reduce the performance. When talking about what happens in the GPU - that's a different story."

(i) This is the thing.. to me GPUs have made a good choice going for f32 as the default - a reasonable balance - and when you need more precision/range, good use of relative centering works; the CPU is almost there just to arrange data for the GPU, so I would think primarily in GPU datatypes.

(ii) And I disagree ('might reduce..'); taking up less space in caches will always be the dominant factor. CPUs have good double-precision support, but that doesn't mean they neglect 32-bit - they've always had vectorised 32-bit since SSE, and smaller types mean more done in parallel (and Intel know Nvidia are encroaching on their market share in datacentres, so they care about making CPUs more parallel.. they've set off on the trend with 'vgather' instructions to make autovectorization easier.. a smaller type means more loop iterations packed into the same vector register).

Ultimately I don't want to preach too much here - I'm not trying to use this library, so this doesn't cause me any problems; I just feel compelled to drop an opinion... it really jumps out at me when I see 64 bits used for something like this.

I am actually doing a bit of 2D/windowing stuff at the moment for a 3D editor - I had wondered if I could use Conrod. Basically, for my needs all the layouts start with the full window and then divide it up, and there's transformation of 3D stuff to a 2D screen; and if 32-bit is fine for flying a camera around in 3D as per the GPU, it's certainly fine for the 2D..

bvssvni commented 5 years ago

@dobkeratops You also have to test it, or else you end up with circular reasoning.

dobkeratops commented 5 years ago

the evidence and salient factors in my head -

So.. for anything I've ever thought about, 32-bit is an obvious choice. Parameterizing is nice - I have written my own maths libraries with Vec3 etc - but I've tended to explicitly use f32 in 'user code'; I could change that to a typedef ('UICoord', 'CoordValue' etc), although those are usually embedded in things like "ScreenPos"/"ScreenRect" types etc.
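
A minimal sketch of that kind of typedef (names like UICoord and ScreenPos are just the illustrative ones above, not an existing API):

```rust
// A coordinate scalar typedef that user code refers to instead of a hard-coded
// f32, embedded in small screen-space types.
pub type UICoord = f32;

#[derive(Copy, Clone, Debug)]
pub struct ScreenPos {
    pub x: UICoord,
    pub y: UICoord,
}

#[derive(Copy, Clone, Debug)]
pub struct ScreenRect {
    pub min: ScreenPos,
    pub max: ScreenPos,
}
```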

There is a parallel issue with the way cameras are dealt with; I always prefer to have an explicit 'camera object' (position, axes) rather than a matrix/inverse matrix, to ensure I can do the 'inverse multiply' by first offsetting by the camera centre before doing the axis multiplies.
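
Roughly, that camera-object idea could look like this (a hedged sketch, not any particular library's API):

```rust
// Store position + axes explicitly; transform into camera space by first
// subtracting the camera centre, then applying the (unit-less) axes. Doing the
// subtraction first keeps large world coordinates from eating the precision
// of the rotation part.
struct Camera {
    pos: [f32; 3],       // camera centre in world space
    axes: [[f32; 3]; 3], // right / up / forward, orthonormal
}

impl Camera {
    fn world_to_camera(&self, p: [f32; 3]) -> [f32; 3] {
        // Offset first ...
        let d = [p[0] - self.pos[0], p[1] - self.pos[1], p[2] - self.pos[2]];
        // ... then the axis multiplies (dot with each axis).
        [
            d[0] * self.axes[0][0] + d[1] * self.axes[0][1] + d[2] * self.axes[0][2],
            d[0] * self.axes[1][0] + d[1] * self.axes[1][1] + d[2] * self.axes[1][2],
            d[0] * self.axes[2][0] + d[1] * self.axes[2][1] + d[2] * self.axes[2][2],
        ]
    }
}
```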

Very early on I was exposed to mixed precision, i.e. 32-bit centre/16-bit axes (PlayStation), and whilst GPUs have predominantly been '32-bit everywhere', AI is driving a resurgence in 16-bit support - I want to be ready to reduce bits wherever I can.

(In my matrix types I do keep the option for the last column to be a different type, and even for the vectors to be a different type per element, e.g. if I did go back to different precision for axes & centre, and something needed to be transposed, it would be able to represent that.. whether it's axes in 16 & position in 32, or axes in 32 & position in 64.. I've actually got Vec4<X=f32,Y=X,Z=Y,W=Z> ready for this. One more tweak I have in mind is to allow representing a 3D 'point' or 'axis' via a type which is a constant One or Zero..)
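
That per-element parameterisation is expressible in today's Rust with chained default type parameters; a sketch (illustrative, not the actual library types):

```rust
// Per-element type parameters with chained defaults: `Vec4` alone means
// Vec4<f32, f32, f32, f32>, while mixed precisions stay representable.
pub struct Vec4<X = f32, Y = X, Z = Y, W = Z> {
    pub x: X,
    pub y: Y,
    pub z: Z,
    pub w: W,
}

// e.g. 32-bit axis components with a 64-bit translation component:
pub type MixedRow = Vec4<f32, f32, f32, f64>;
```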

bvssvni commented 5 years ago

@dobkeratops You forgot to test it. Got you. Haha. :P

dobkeratops commented 5 years ago

Sure, you must test. But at some point you make a best guess, because time for testing is finite and the number of permutations explodes.

So you have to prioritise what to test.

I'd be willing to skip testing a 64-bit coord version "in case it's faster than 32-bit" because it's very unlikely. I bet even if it was, it certainly wouldn't be on all machines, because not all have good 64-bit float support.

Besides - for the factors I mention above (2D/3D crossover), I wouldn't want to depend on 'marginally faster 2D'; the priority would be best use of GPUs and flexibility with 2D/3D crossover.

Finally, given the trend with AI, which is pushing 16-bit support, I find it hard to believe they would produce execution units where 64-bit is faster than 32-bit. The machines would be designed to produce 16-bit internal products and then combine them optionally to 32 or 64 bits, surely... or focus on 32-bit, with doubling up for 64-bit and halving to 16 bits.

dobkeratops commented 5 years ago

Anyway, that's my advice/comment.. the factors and potential directions I can see.. take it or leave it. I hope it's useful for anyone else reading here. 64-bit coords aren't fatal, sure.

bvssvni commented 5 years ago

@dobkeratops Forgot to test! He forgot to test! :P

dobkeratops commented 5 years ago

He didn't forget to test. He knows that in a program 1,000, 10,000, 100,000 lines long.. you can't empirically test every choice, so you must prioritise and make a best guess for each issue. I'd be willing to bet other factors would have more impact, so should be tested first.

I mean, you could test for array traversals being reversed, because some processors have better 'decrement and branch' instructions (test vs zero).. or every choice between an enum and a vtable (a manual if-ladder or other alternative), and so on and so forth; the implications of every possible inversion of control.. These are things I wouldn't bother testing, yet I bet I could find many things more likely to improve performance than a 64-bit coord option..

bvssvni commented 5 years ago

@dobkeratops Sorry, I could not resist. Some day, I will forget to test, and you will be there taking your revenge.

I think f32 vs f64 is probably the most famous example of people forgetting to test their assumptions in programming. It's not as obvious as people think it is.

However, testing does not go away as the core of engineering. I think it's important to test things. If you manage to speed up piston-graphics significantly by swapping f64 with f32, then please tell me.

dobkeratops commented 5 years ago

You are right that testing is important. I'm just raising these issues re: architectural direction. For me the choice is clear: graphics will most likely be headed to the GPU, and I know 32-bit is both sufficient and much more likely to be faster. There's also the issue that the world evolves.

16-bit support appeared, but was initially crippled by Nvidia to force people to buy their higher-end cards (ironic, heh - you pay more for less precision - but the result was much higher machine-learning performance). With other ML chips appearing, they could only get away with that for so long, and now the RTX 20x0 series delivers fast 16-bit in consumer cards. The correct 'engineering theory' crystallised into hardware eventually.

There might be sub-optimal choices in Intel CPUs given their history, but Intel is up against ARM in consumer devices, and I am sure that 32 bits will be faster there. The main demand for 64-bit is scientific calculation (in HPC.. engineering sim/science) - but even there, there are scenarios where 'grid resolution' (number of sample points) outweighs 'element precision'.

Anyway, let's keep this discussion friendly. I'm just giving opinions and information. For my own requirements I can roll something simple that does what I need (ultimately just some tiled viewports and the ability to overlay some text and selection boxes). The behaviour of this specific library is not critical to me, I just take an interest.

A broader point: it would be nice if the language got module-level type-params to make a toggle/user choice easier... it might be worth highlighting this kind of use case for language feature requests. I know that there is a 'code maintenance cost' to the kind of extreme parameterisation I've done (i.e. I have left the option open to swap in 64-bit sometimes even where I know 32-bit is 90% the right choice). I would personally love module-level type-params.
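
Lacking module-level type-params, the closest approximation today is probably a cargo feature plus a scalar type alias; a hedged sketch (the feature name and alias are hypothetical):

```rust
// Toggle the coordinate scalar at build time, e.g. `cargo build --features coords64`.
// Downstream code only ever names `Scalar`.
#[cfg(feature = "coords64")]
pub type Scalar = f64;
#[cfg(not(feature = "coords64"))]
pub type Scalar = f32;

pub fn midpoint(a: [Scalar; 2], b: [Scalar; 2]) -> [Scalar; 2] {
    [(a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0]
}
```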

dobkeratops commented 5 years ago

(And types like a '24-bit int' with spare bits might be interesting for enum padding? Although there are issues with borrowing for that, so maybe not.)

dobkeratops commented 5 years ago

"the f32 vs f64 is probably the most famous example of people forgetting to test their assumptions in programming."

Long-term trend: mark my words, with Moore's law slowing down, bit reduction will return in full force everywhere eventually. It's started in AI and will filter down through graphics etc. https://www.hpcwire.com/2015/10/22/the-case-for-mixed-precision-arithmetic/ https://software.intel.com/en-us/articles/performance-benefits-of-half-precision-floats https://www.comp.nus.edu.sg/~wongwf/papers/HPEC2017.pdf

(And I reiterate that this should really be an issue for language design: instead of debating 32 vs 64-bit here, the ability to switch via module type-params or whatever should be so trivial that no one wastes time arguing about a default. We should both be pestering the rest of the Rust community to get module type-param support :) )

dobkeratops commented 5 years ago

One tangent on my extreme vector parameterisation (unrelated to this debate, but related to the mental overhead of generics): I note that a 4x3 matrix used to place something actually has different units per element?

e.g. the 'rotation' part is unit-less (scaling factors applied to positions, which have length units), whilst the 'translation' should have units of 'length'? Let me check if this kind of detail is handled more elegantly with homogeneous coordinates (does this go away if you're thinking about an implicit 'w=1'?).
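
A quick sketch of how the units separate under an implicit w (a generic 2x3 affine form, shown in 2D for brevity; the 4x3 case is the same idea, and this is not a specific library's layout): a point uses w = 1 and picks up the translation (length units), an axis uses w = 0 and only sees the unit-less 2x2 part.

```rust
// 2D affine transform as a 2x3 matrix: the left 2x2 block is the unit-less
// rotation/scale part, the right column is the translation (length units).
type Mat2x3 = [[f64; 3]; 2];

// Point: implicit homogeneous w = 1, so the translation column is added.
fn transform_pos(m: Mat2x3, p: [f64; 2]) -> [f64; 2] {
    [
        m[0][0] * p[0] + m[0][1] * p[1] + m[0][2],
        m[1][0] * p[0] + m[1][1] * p[1] + m[1][2],
    ]
}

// Axis/direction: implicit w = 0, so the translation column drops out.
fn transform_axis(m: Mat2x3, v: [f64; 2]) -> [f64; 2] {
    [
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    ]
}
```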

I know it does sound excessive to have different types per x/y/z.. I would kind of like to simplify it to 'one type for the axes and one for the translation', or 'one type for x/y/z and another for w, which is usually zero or one'..

That does remind me how type-safety in coordinates can help (are these pixel coords, window coords, element-relative local coords, or what..), so generics or at least newtypes might be considered.
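
For the type-safety point, a small sketch of coordinate-space newtypes (illustrative names, not piston's API):

```rust
// Distinct newtypes so pixel-space and window-space coordinates can't be
// mixed up silently; the conversion takes the hidpi factor explicitly.
#[derive(Copy, Clone, Debug)]
pub struct PixelPos { pub x: f64, pub y: f64 }

#[derive(Copy, Clone, Debug)]
pub struct WindowPos { pub x: f64, pub y: f64 }

impl PixelPos {
    pub fn to_window(self, hidpi_factor: f64) -> WindowPos {
        WindowPos { x: self.x / hidpi_factor, y: self.y / hidpi_factor }
    }
}
```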

dobkeratops commented 5 years ago

"You also have to test it, or else you end up with circular reasoning."

actually this does raise a serious question

Test on what?

What is your target platform?

dobkeratops commented 5 years ago

@bvssvni OK, now I remember the best argument here.. (I had an instinctive, strong negative reaction, coming from many angles..)

If you use f64, you're going to make choices re: how the interfaces, traversals etc all work that rely on that precision (i.e. instead of re-ordering the calculations and traversals to get the same result with less), which may constrain the extent to which you could later move to f32 for the GPU, and constrain your future direction.

The same thing would apply to using 64-bit positions in a 3D application.

It's an architectural choice which, even if it doesn't slow things down now (I bet in most tests it won't show up in a profiler either way), shapes the future path, ruling out potential (e.g. if you then re-implement large parts on the GPU, you should be able to do orders of magnitude more in terms of smooth animated transitions, interaction with 3D, etc.).

Take my current scenario - for a 3D editor I still want 2D 'vertex tags' and perhaps the ability to render text over elements (& connections..) to identify them... it would be crazy to use f64 for all that - the load will quickly explode with perspective views showing large numbers of elements.

Quite often code is written in prototype form on the CPU (where you have all the convenience of re-arranging data, using the whole context, and the whole type system), but it's really going to end up on the GPU. With the versatility of compute shaders, the GPU is applicable across many areas of general-purpose code (and even on the CPU, you've got GPU-style SIMD instructions).. but if you use f64, you'll be stuck with it on the CPU.

You could still say "it makes it easier to program", but to start saying "CPUs are optimized for 64-bit..." sounds like missing information (given you can read f32s 2x as fast from memory.. why would they balance it for anything other than f64 being half the performance?.. and it could be worse in terms of latency).

bvssvni commented 5 years ago

Hmm... affine transforms are just inserting 0s and 1s in the equations. Not sure what you mean.

bvssvni commented 5 years ago

To quote Mike Acton: "You have to know the cost of the problem to reason about it."

There is always a trade-off, but I think the extra precision is worth it. However, the problem is not that people are unaware of the benefits of using f32 in some cases. It's that they don't even test whether f64 works. This is the kind of thinking that makes people just assume something and then do it, when sometimes it turns out that f64 is faster.

I've been testing piston-graphics a lot, but guess what? No person who had strong opinions even came close to predicting correctly what would happen. I'm not speaking of the average programmer, but of people who are at the top of their game. I think the reason is that, intuitively, these ideas we have about how computers work seem so real and convincing, but reality is more complex than that. Sure, if some smart and honest guy says something, it's probably correct, but testing does not go away. It's when you leave out testing that you get led down the wrong tracks.

I always lose those discussions, because for some reason, just testing stuff seems so stupid to people. Like, "aren't you a computer engineer?" should mean you're that smart and clever guy coming up with clever-sounding ideas. However, when you get really good at programming, you start feeling comfortable with not knowing things. It's the silent person in the room that's the one you should watch out for.

I have a similar experience when talking to people in other domains too, e.g. language design. I want to use a stopwatch to measure how long it takes people to do some task, and people laugh at the idea. However, if you're making animation software and every click has 1000x blow-up consequences, it's an obvious thing to do. Programmers are typing these things over and over, thousands of times, so why should they not be treated with the same respect as animators? Why shouldn't the amount of typing be a measure of how ergonomic something is in a language? It's so obvious: if typing a letter takes 0.2 seconds, then typing a thousand of them takes at least 200 seconds. All you need to infer that is multiplication.

That's why engineers keep their testing as "the secret" that nobody hears about - it makes engineering look simple. You don't need a long education to actually try things out, and this makes people think that education means you don't need to try things out anymore. I'm just explaining how all these "arrows" of wrong ideas get into people's heads.

However, testing goes deeper than that. The ideas themselves can be tested in many ways, e.g. using mathematics, probability theory etc. For some reason, I see the same resistance there too. Apparently, nobody had thought of calculating the benefit of self-merging vs code review. So I did it. It was not hard, and it turns out that self-merging is a huge time-saver. Did it catch on? Of course not. It's the culture of "correct" ideas that gets in the way of productivity. I think we should treat ideas as maybe-wrong or likely-uncertain.

Here is a list of things I've recognized:

The list goes on and on about this same mindset. I can show you many other examples.

However, I have experienced doing the same mistakes myself, and how much time I could be spending elsewhere, just because I was in the wrong mindset.

I consider myself a logical assumptionist, which claims that every proof that follows from the assumptions must be correct, or else the assumptions are wrong, or the space of permitted proofs must be restricted. This is because reality is too complex for one set of assumptions to fit everything. I just love this idea, and have wondered whether I should publish a "The Logical Assumptionist Manifesto". However, I think most people won't even recognize the need for such a thing.

bvssvni commented 5 years ago

Closing because Piston already does this.