Dawoodoz / DFPSR

Fast realtime software rendering library for C++14 using SSE/AVX/NEON. 2D, 3D and isometric rendering with minimal system dependencies.
https://dawoodoz.com/dfpsr.html
VLA fallback solution #19

Open Dawoodoz opened 4 years ago

Dawoodoz commented 4 years ago

Triangle rasterization uses small but dynamic arrays for storing pixel intervals for each row without having to fetch memory far away on the heap.

If the VLA extension (variable-length arrays, borrowed from C) can no longer be used at some point in the distant future, for example because of a new CPU architecture with a conflicting feature, it would be good to have a fallback implementation that simulates or replaces VLA when it is not available, just like the SIMD abstraction falls back with zero overhead when the SIMD extensions are missing.

A global stack on the heap would not work, because calls from multiple threads would break the last-in, first-out call order.

Carrying thread contexts would be a horribly entangled spaghetti design.
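One option not raised in this thread is to give each thread its own stack through `thread_local`, so that nothing has to be passed around by hand. The following is only a minimal sketch of that idea under stated assumptions: `ScratchStack`, `threadScratch`, `ScratchScope` and the 64 KiB capacity are all hypothetical names and values, not part of DFPSR, and the memory is raw, so it only suits trivially constructible element types.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper, not DFPSR code: a fixed-size bump allocator.
class ScratchStack {
public:
    explicit ScratchStack(std::size_t bytes)
    : memory(new unsigned char[bytes]), size(bytes), top(0) {}
    // Remember the current top so that it can be restored later.
    std::size_t mark() const { return top; }
    // Free everything allocated since the given marker.
    void release(std::size_t marker) { top = marker; }
    // Returns aligned raw memory, or nullptr when the stack is full.
    void* allocate(std::size_t bytes) {
        const std::size_t alignment = alignof(std::max_align_t);
        const std::size_t start = (top + alignment - 1) / alignment * alignment;
        if (start > size || bytes > size - start) return nullptr;
        top = start + bytes;
        return memory.get() + start;
    }
private:
    std::unique_ptr<unsigned char[]> memory;
    std::size_t size;
    std::size_t top;
};

// One stack per thread, created on first use (capacity is an assumption).
inline ScratchStack& threadScratch() {
    thread_local ScratchStack stack(64 * 1024);
    return stack;
}

// RAII scope: allocations made through it are released in LIFO order
// when the scope ends, which mimics the lifetime of a VLA.
class ScratchScope {
public:
    ScratchScope() : stack(threadScratch()), marker(stack.mark()) {}
    ~ScratchScope() { stack.release(marker); }
    ScratchScope(const ScratchScope&) = delete;
    ScratchScope& operator=(const ScratchScope&) = delete;
    // Uninitialized storage for count elements of a trivial type T.
    template <typename T>
    T* array(std::size_t count) {
        return static_cast<T*>(stack.allocate(count * sizeof(T)));
    }
private:
    ScratchStack& stack;
    std::size_t marker;
};
```

Because each thread only ever touches its own stack, the call order within one stack stays last-in, first-out. The cost is a fixed memory reservation per thread and a thread-local lookup per scope.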

Allocating on the heap per triangle would be compact, but also horribly slow if it ends up with cache misses from another thread stealing the address space. Pre-allocating one entry per row of the target's section, with even padding, would leave enough room for the worst-case triangle height and have no allocation overhead per triangle, but this would not be easily reusable for other problems needing VLA.
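The pre-allocation idea could look roughly like the sketch below. It assumes that a pixel interval is a left/right pair per row; `RowInterval` and `RowIntervalScratch` are hypothetical names for illustration and do not reflect DFPSR's actual types.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical interval type: row covers pixels [left, right).
struct RowInterval {
    int32_t left;
    int32_t right;
};

// Scratch buffer allocated once per target section and reused for every
// triangle, so there is no heap allocation per triangle.
class RowIntervalScratch {
public:
    // sectionHeight is the worst-case number of rows a triangle can span.
    explicit RowIntervalScratch(int32_t sectionHeight)
    : rows(static_cast<std::size_t>(sectionHeight > 0 ? sectionHeight : 0)) {}

    // Returns storage for rowCount rows, or nullptr if it does not fit.
    RowInterval* acquire(int32_t rowCount) {
        if (rowCount < 0 || static_cast<std::size_t>(rowCount) > rows.size()) {
            return nullptr;
        }
        return rows.data();
    }

    int32_t capacity() const { return static_cast<int32_t>(rows.size()); }
private:
    std::vector<RowInterval> rows;
};
```

Each worker thread would own one such buffer for the section it renders. As noted above, the buffer is tied to the triangle case and does not generalize to other code that needs VLA.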

Dawoodoz commented 4 years ago

Using alloca instead of VLA would make the code more portable (alloca is not part of the C++ standard either, but it is more widely supported than VLA), while still getting the speed of stack memory.
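A minimal sketch of the alloca approach, assuming GCC, Clang or MSVC. The `STACK_ALLOC` macro, the `maxStackRows` cap and the toy triangle are hypothetical and only there to make the example self-contained; they are not taken from DFPSR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// alloca is a compiler feature, so the spelling differs per compiler.
#if defined(_MSC_VER)
    #include <malloc.h>
    #define STACK_ALLOC(bytes) _alloca(bytes)
#elif defined(__GNUC__) || defined(__clang__)
    #define STACK_ALLOC(bytes) __builtin_alloca(bytes)
#else
    #error "No known stack allocation intrinsic for this compiler"
#endif

// Hypothetical interval type: row covers pixels [left, right).
struct RowInterval {
    int32_t left;
    int32_t right;
};

// Assumed cap to keep the stack usage bounded (8 bytes per row).
static const int32_t maxStackRows = 4096;

// Toy example: a right-angled triangle where row y covers [0, y + 1),
// clipped to the given width. Returns the number of covered pixels.
int64_t countCoveredPixels(int32_t rowCount, int32_t width) {
    if (rowCount <= 0 || rowCount > maxStackRows) return 0;
    // The memory lives in this function's stack frame and is released
    // automatically on return, so it must never be returned or stored.
    RowInterval* rows = static_cast<RowInterval*>(
        STACK_ALLOC(sizeof(RowInterval) * static_cast<std::size_t>(rowCount)));
    for (int32_t y = 0; y < rowCount; y++) {
        rows[y].left = 0;
        rows[y].right = std::min(y + 1, width);
    }
    int64_t total = 0;
    for (int32_t y = 0; y < rowCount; y++) {
        if (rows[y].right > rows[y].left) {
            total += rows[y].right - rows[y].left;
        }
    }
    return total;
}
```

The allocation has to stay a macro expanded inside the function that uses the memory, since wrapping it in a helper function would free the memory when the helper returns. It should also stay out of loops, because each call grows the stack frame until the function returns.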

Dawoodoz commented 1 year ago

Maybe just restructure rasterization so that the pixel intervals are sent directly to the pixel shader, using a function pointer for filling two rows of pixels. Then there is no need for VLA when rendering triangles.
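A rough sketch of that restructuring, under the assumption that the rasterizer can compute the intervals for two rows, hand them to a callback, and then forget them. The `RowPairFiller` signature, the `context` pointer and the toy triangle are hypothetical and not DFPSR's actual interface.

```cpp
#include <cstdint>

// Hypothetical interval type: row covers pixels [left, right).
struct RowInterval {
    int32_t left;
    int32_t right;
};

// Hypothetical callback: fills rows y and y + 1 from their intervals.
// An empty interval (left == right) means that the row has no pixels.
using RowPairFiller = void (*)(void* context, int32_t y,
                               RowInterval upper, RowInterval lower);

// Toy rasterizer for a right-angled triangle where row y covers
// [0, y + 1). Only two intervals exist at any time, in fixed-size
// locals, so no dynamically sized array is needed.
void rasterizeRowPairs(int32_t height, RowPairFiller fill, void* context) {
    for (int32_t y = 0; y < height; y += 2) {
        RowInterval upper = {0, y + 1};
        RowInterval lower = (y + 1 < height) ? RowInterval{0, y + 2}
                                             : RowInterval{0, 0};
        fill(context, y, upper, lower);
    }
}

// Stand-in for a pixel shader: counts the pixels it is asked to fill.
void countPixels(void* context, int32_t, RowInterval upper, RowInterval lower) {
    int64_t* total = static_cast<int64_t*>(context);
    *total += (upper.right - upper.left) + (lower.right - lower.left);
}
```

A call such as `rasterizeRowPairs(4, countPixels, &total)` invokes the filler twice, once per pair of rows. The trade-off is one indirect call per row pair instead of one per triangle.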