Setup command stream for PP uniform

yuq / mesa-lima

Deprecated, new place: https://gitlab.freedesktop.org/lima

https://github.com/yuq/mesa-lima/wiki


Setup command stream for PP uniform #11

Closed: anarsoul closed this 6 years ago

anarsoul commented 6 years ago

This PR adds command stream setup for PP uniform and also implements codegen for scalar multiplication.

anarsoul commented 6 years ago

See https://community.arm.com/graphics/f/discussions/2340/random-number-with-mali-400-mp:

Something to take into account regarding the Mali-400 GPU is that we implement floating point in the fragment processor as FP16, which is conformant with the Khronos specification

So it looks like the fragment processor uses FP16.

yuq commented 6 years ago

But a later post on the page you linked says: One clarification. We do support highp (fp32) and mediump (fp16). For most fragment operations we recommend using mediump; fp16 is "precise enough" for most color-related operations, and uses less memory bandwidth. If you need fp32 for some operations that is still possible of course.

BTW, the constant used in PP instructions is fp16 too (not sure if it can be fp32).
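To make the precision point concrete, here is a minimal sketch of what storing a constant as fp16 does to its value. It assumes the PP constant follows IEEE 754 binary16, which this thread does not confirm; the helper names are made up for illustration and are not part of the driver.

```python
import struct


def to_fp16_bits(value: float) -> int:
    """Round a float to IEEE 754 binary16 and return its 16-bit pattern.

    Assumption: the PP constant encoding matches IEEE binary16.
    """
    return struct.unpack("<H", struct.pack("<e", value))[0]


def from_fp16_bits(bits: int) -> float:
    """Decode a 16-bit binary16 pattern back into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def roundtrip_fp16(value: float) -> float:
    """Return the value as it would read back after being stored as fp16."""
    return from_fp16_bits(to_fp16_bits(value))


if __name__ == "__main__":
    for v in (1.0, 0.5, 0.1, 2049.0):
        print(f"{v!r:>8} -> 0x{to_fp16_bits(v):04x} -> {roundtrip_fp16(v)!r}")
```

Values such as 1.0 and 0.5 survive the round trip exactly, while 0.1 and integers above 2048 do not, which is the kind of loss to expect if a shader constant can only be encoded as fp16.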

yuq commented 6 years ago

Seems the Mali guys also haven't implemented fp32 in their official driver, but the HW really supports it. That makes this case harder, since we may not be able to obtain the info by dumping the official driver's memory. We can leave this for future investigation, as there are a lot more important things to be done for this driver.

anarsoul commented 6 years ago

You're right, so it needs to be investigated.

cwabbott0 commented 6 years ago

I'm pretty sure the PP only supports FP16, while the GP only supports FP32. Probably because FP32 usually isn't needed in fragment shaders, and it would either blow up the register file size or make the register pressure even worse.
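A rough back-of-the-envelope sketch of the register file argument above. The register count and the vec4 layout are placeholder assumptions picked for illustration, not figures confirmed in this thread, and the function names are hypothetical.

```python
def register_file_bits(num_regs: int, components: int, bits_per_component: int) -> int:
    """Total storage needed for a register file of the given shape."""
    return num_regs * components * bits_per_component


def regs_in_budget(budget_bits: int, components: int, bits_per_component: int) -> int:
    """How many whole registers fit into a fixed storage budget."""
    return budget_bits // (components * bits_per_component)


if __name__ == "__main__":
    NUM_REGS = 8      # hypothetical register count, for illustration only
    COMPONENTS = 4    # assuming vec4 registers

    fp16_bits = register_file_bits(NUM_REGS, COMPONENTS, 16)
    fp32_bits = register_file_bits(NUM_REGS, COMPONENTS, 32)
    print("fp16 register file:", fp16_bits, "bits")
    print("fp32 register file:", fp32_bits, "bits")

    # Keeping the fp16-sized storage but widening components to fp32
    # leaves half as many registers, i.e. higher register pressure.
    print("fp32 registers in the fp16 budget:",
          regs_in_budget(fp16_bits, COMPONENTS, 32))
```

Whatever the real numbers are, the trade-off has the same shape: going to fp32 either doubles the storage or halves the number of registers available to a shader.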

yuq commented 6 years ago

@cwabbott0 is your conclusion from reverse-engineering the official Mali driver, or from some HW info? From the ARM post above, it seems the Mali guys are telling us the HW supports fp32 but the SW doesn't implement it. If that's true, enabling it by pure guess-and-try would be hard reverse-engineering work.

cwabbott0 commented 6 years ago

@yuq I don't think there's anyone in that post saying that fp32 is supported in hardware. Someone does say that T6xx supports fp32 and fp16, which is what you quoted, but that's an entirely different architecture.