ValveSoftware / source-sdk-2013

The 2013 edition of the Source SDK
https://developer.valvesoftware.com/wiki/SDK2013_GettingStarted

No clobbered registers in FastVertex #247

Open Triang3l opened 10 years ago

Triang3l commented 10 years ago

In the CVertexBuilder functions FastVertex and FastVertexSSE (each has two overloads, for DX7 and DX8 meshes) in public/materialsystem/imesh.h, the clobbered MMX and SSE registers are not specified in the inline assembly clobber list.

Since CStudioRender::R_StudioDrawDynamicMesh doesn't seem to perform any other x87, MMX or SSE operations, the missing clobbers don't cause any problems right now. Still, it would be better to list them properly to avoid possible issues in the future.
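For illustration, here is a minimal sketch of what listing clobbers looks like in GCC-style extended inline assembly. It is not the code from imesh.h: the helper name Copy16Aligned is hypothetical, it copies a single 16-byte block rather than a whole vertex, and it assumes both pointers are 16-byte aligned. On compilers or targets without GCC-style asm and SSE it falls back to memcpy.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of the SDK: copies one 16-byte block
// from src to dst. Both pointers are assumed to be 16-byte aligned.
inline void Copy16Aligned(void *dst, const void *src)
{
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(src) % 16 == 0);
#if defined(__GNUC__) && defined(__SSE__)
    __asm__ __volatile__(
        "movaps (%1), %%xmm0\n\t"   // load 16 bytes from src into xmm0
        "movntps %%xmm0, (%0)\n\t"  // non-temporal store to dst
        "sfence\n\t"                // order the streaming store
        :                           // no output operands
        : "r"(dst), "r"(src)        // inputs: %0 = dst, %1 = src
        : "xmm0", "memory");        // clobbers: xmm0 is overwritten, memory is written
#else
    // Fallback for targets without GCC-style asm or SSE.
    std::memcpy(dst, src, 16);
#endif
}
```

The third colon-separated section is the clobber list. Naming "xmm0" there tells the compiler not to keep a live value in that register across the asm statement, and "memory" tells it that the statement writes memory it cannot see through the operands.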

Triang3l commented 10 years ago

By the way, maybe implement Fast4VerticesSSE on Linux?

void *pCurrPos = m_pCurrPosition;
__m128 m1, m2, m3;
m1 = _mm_load_ps((float *)vtx_a);
m2 = _mm_load_ps((float *)vtx_a + 4);
m3 = _mm_load_ps((float *)vtx_a + 8);
_mm_stream_ps((float *)pCurrPos, m1);
_mm_stream_ps((float *)pCurrPos + 4, m2);
_mm_stream_ps((float *)pCurrPos + 8, m3);
m1 = _mm_load_ps((float *)vtx_b);
m2 = _mm_load_ps((float *)vtx_b + 4);
m3 = _mm_load_ps((float *)vtx_b + 8);
_mm_stream_ps((float *)pCurrPos + 12, m1);
_mm_stream_ps((float *)pCurrPos + 16, m2);
_mm_stream_ps((float *)pCurrPos + 20, m3);
m1 = _mm_load_ps((float *)vtx_c);
m2 = _mm_load_ps((float *)vtx_c + 4);
m3 = _mm_load_ps((float *)vtx_c + 8);
_mm_stream_ps((float *)pCurrPos + 24, m1);
_mm_stream_ps((float *)pCurrPos + 28, m2);
_mm_stream_ps((float *)pCurrPos + 32, m3);
m1 = _mm_load_ps((float *)vtx_d);
m2 = _mm_load_ps((float *)vtx_d + 4);
m3 = _mm_load_ps((float *)vtx_d + 8);
_mm_stream_ps((float *)pCurrPos + 36, m1);
_mm_stream_ps((float *)pCurrPos + 40, m2);
_mm_stream_ps((float *)pCurrPos + 44, m3);
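The same copy can be sketched as a self-contained function, which may make the idea easier to try outside the SDK. Everything named here is hypothetical: SketchVertex stands in for a 48-byte (12-float) vertex, Copy4VerticesStream is not an SDK function, and returning the advanced write position is an assumption about how the caller would update m_pCurrPosition. All pointers are assumed to be 16-byte aligned. Because this uses intrinsics rather than inline assembly, the compiler tracks register usage itself and no clobber list is involved.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical 48-byte vertex (12 floats), 16-byte aligned.
struct alignas(16) SketchVertex { float data[12]; };

// Hypothetical helper: copies four vertices to dst and returns the
// write position just past the last copied vertex.
inline void *Copy4VerticesStream(void *dst,
                                 const SketchVertex *a, const SketchVertex *b,
                                 const SketchVertex *c, const SketchVertex *d)
{
    const SketchVertex *src[4] = { a, b, c, d };
    float *out = static_cast<float *>(dst);
    assert(reinterpret_cast<std::uintptr_t>(out) % 16 == 0);

    for (int v = 0; v < 4; ++v)
    {
#if SKETCH_HAS_SSE
        const float *in = src[v]->data;
        // Three aligned 16-byte loads, three non-temporal stores per vertex.
        _mm_stream_ps(out + 0, _mm_load_ps(in + 0));
        _mm_stream_ps(out + 4, _mm_load_ps(in + 4));
        _mm_stream_ps(out + 8, _mm_load_ps(in + 8));
#else
        // Fallback for targets without SSE.
        std::memcpy(out, src[v]->data, sizeof(SketchVertex));
#endif
        out += 12; // advance by one vertex (12 floats)
    }
#if SKETCH_HAS_SSE
    _mm_sfence(); // order the streaming stores before later stores
#endif
    return out;
}
```

A caller would pass the current write position and four source vertices, then store the returned pointer as the new write position.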

Triang3l commented 10 years ago

Oh, well, I guess the clobbers are not required because of emms. Still, this needs more research.