No clobbered registers in FastVertex

Triang3l commented 10 years ago

In the CVertexBuilder functions FastVertex and FastVertexSSE (each has two overloads, for DX7 and DX8 meshes) in public/materialsystem/imesh.h, the MMX and SSE registers clobbered by the inline assembly are not specified in the clobber list.

Since CStudioRender::R_StudioDrawDynamicMesh doesn't seem to do any other x87, MMX or SSE operations, the missing clobbers don't cause any problems right now. Still, it would be better to list them properly so the compiler can't keep live values in those registers across the asm block if the surrounding code changes in the future.
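To illustrate what listing the clobbers means, here is a minimal sketch of a GCC-style extended asm block that names the SSE register it overwrites. Copy16Bytes is a hypothetical helper, not code from imesh.h, and the sketch assumes a GCC or Clang x86 target (it falls back to memcpy elsewhere).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not from imesh.h): copies 16 bytes through xmm0.
// The third colon-separated section is the clobber list. It tells the
// compiler that xmm0 and memory are modified by the asm block.
inline void Copy16Bytes(void *dst, const void *src)
{
    assert(dst != nullptr && src != nullptr);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__(
        "movups (%1), %%xmm0\n\t"   // unaligned load from src into xmm0
        "movups %%xmm0, (%0)\n\t"   // unaligned store from xmm0 to dst
        :                           // no outputs
        : "r"(dst), "r"(src)        // inputs
        : "xmm0", "memory");        // clobbers
#else
    std::memcpy(dst, src, 16);      // portable fallback for other targets
#endif
}
```

Without "xmm0" in the clobber list, the compiler is free to assume the register still holds whatever it put there before the asm block.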

Triang3l commented 10 years ago

By the way, how about implementing Fast4VerticesSSE on Linux with intrinsics? Something like this:

void *pCurrPos = m_pCurrPosition;
__m128 m1, m2, m3;
m1 = _mm_load_ps((float *)vtx_a);
m2 = _mm_load_ps((float *)vtx_a + 4);
m3 = _mm_load_ps((float *)vtx_a + 8);
_mm_stream_ps((float *)pCurrPos, m1);
_mm_stream_ps((float *)pCurrPos + 4, m2);
_mm_stream_ps((float *)pCurrPos + 8, m3);
m1 = _mm_load_ps((float *)vtx_b);
m2 = _mm_load_ps((float *)vtx_b + 4);
m3 = _mm_load_ps((float *)vtx_b + 8);
_mm_stream_ps((float *)pCurrPos + 12, m1);
_mm_stream_ps((float *)pCurrPos + 16, m2);
_mm_stream_ps((float *)pCurrPos + 20, m3);
m1 = _mm_load_ps((float *)vtx_c);
m2 = _mm_load_ps((float *)vtx_c + 4);
m3 = _mm_load_ps((float *)vtx_c + 8);
_mm_stream_ps((float *)pCurrPos + 24, m1);
_mm_stream_ps((float *)pCurrPos + 28, m2);
_mm_stream_ps((float *)pCurrPos + 32, m3);
m1 = _mm_load_ps((float *)vtx_d);
m2 = _mm_load_ps((float *)vtx_d + 4);
m3 = _mm_load_ps((float *)vtx_d + 8);
_mm_stream_ps((float *)pCurrPos + 36, m1);
_mm_stream_ps((float *)pCurrPos + 40, m2);
_mm_stream_ps((float *)pCurrPos + 44, m3);
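For anyone who wants to try the idea outside the SDK, the same copy can be sketched as a self-contained function. Everything named here is an assumption for illustration: Vertex48 is a hypothetical stand-in for a 48-byte vertex, Copy4Vertices is not an SDK function, and both source and destination are assumed to be 16-byte aligned, which _mm_load_ps and _mm_stream_ps require. The closing _mm_sfence is my addition to order the non-temporal stores; the snippet above does not have it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#endif

// Hypothetical stand-in for a 48-byte (12-float) vertex, 16-byte aligned.
struct alignas(16) Vertex48 { float data[12]; };

// Copies one vertex (three 16-byte chunks) with non-temporal stores.
inline void CopyVertexStream(float *dst, const Vertex48 *src)
{
#ifdef SKETCH_HAS_SSE
    for (int i = 0; i < 12; i += 4)
        _mm_stream_ps(dst + i, _mm_load_ps(src->data + i));
#else
    std::memcpy(dst, src->data, sizeof(src->data)); // non-SSE fallback
#endif
}

// Copies four vertices back to back into dst (48 floats in total).
inline void Copy4Vertices(float *dst, const Vertex48 *a, const Vertex48 *b,
                          const Vertex48 *c, const Vertex48 *d)
{
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    CopyVertexStream(dst, a);
    CopyVertexStream(dst + 12, b);
    CopyVertexStream(dst + 24, c);
    CopyVertexStream(dst + 36, d);
#ifdef SKETCH_HAS_SSE
    _mm_sfence(); // make the streaming stores visible before returning
#endif
}
```

Since intrinsics leave register allocation to the compiler, no clobber list is needed in this form.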

Triang3l commented 10 years ago

Oh, well, I guess the clobbers may not be required because the asm block ends with emms. Still, this needs more research.

ValveSoftware / source-sdk-2013, issue #247