Open DiabeticCrab opened 10 months ago
This is likely fallout from the HVV promotion, but due to poor D3D12 API design we have zero knowledge up front of what the intended usage of the UPLOAD heap is, so there is nothing we can do here without app-opting it. It's likely a case where hammering a ton of data over ReBAR memory is slower than using the decompression shaders to pull bulk data over the wire.
Software information and Problem
BulkLoadDemo, informally known as the "Avocado Benchmark" (downloaded from Google Drive, 65 MiB), achieves around 10 GiB/s of throughput when launched as is with:
wine Release/Output/BulkLoadDemo/BulkLoadDemo.exe
When preventing the use of ReBAR with:
env VKD3D_CONFIG=no_upload_hvv wine Release/Output/BulkLoadDemo/BulkLoadDemo.exe
I achieve more than double the bandwidth (21+ GiB/s)! Additionally exporting
RADV_PERFTEST=nosam
does not alter the behavior or the measurement results.
This may suggest that VKD3D-Proton requires specific settings presets for games utilizing DirectStorage, or that something is going on with ReBAR/host-visible VRAM. Is there an explanation for why NOT using HVV helps performance so much? Shouldn't the opposite be the case? It doesn't make any sense to me.
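For anyone wanting to reproduce the comparison, the runs above could be scripted roughly as follows. This is only a sketch: the `run_case` function and the `DRY_RUN` switch are hypothetical helpers, not part of vkd3d-proton or the demo. Combining `RADV_PERFTEST=nosam` with `no_upload_hvv` in the third case is an assumption about how the two were used together. With `DRY_RUN=1` (the default here) the script only prints the command lines; set `DRY_RUN=0` to actually launch the demo.

```shell
#!/bin/sh
# Sketch: run BulkLoadDemo under the configurations described in the report.
# The path and environment variables come from the report; DRY_RUN and
# run_case are hypothetical helpers added for illustration.
DEMO="${DEMO:-Release/Output/BulkLoadDemo/BulkLoadDemo.exe}"
DRY_RUN="${DRY_RUN:-1}"

run_case() {
    # $1 = label, remaining arguments = VAR=value assignments passed to env
    label="$1"; shift
    echo "== $label =="
    if [ "$DRY_RUN" = "1" ]; then
        # Only print the command that would be executed.
        echo env "$@" wine "$DEMO"
    else
        env "$@" wine "$DEMO"
    fi
}

run_case "default (UPLOAD heap promoted to HVV)"
run_case "no_upload_hvv" VKD3D_CONFIG=no_upload_hvv
# Assumption: nosam was tested on top of no_upload_hvv.
run_case "no_upload_hvv + nosam" VKD3D_CONFIG=no_upload_hvv RADV_PERFTEST=nosam
```

The throughput figure still has to be read from the demo itself for each run; the script does not parse it.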
System information
I used a Samsung 990 Pro 1TB on a PCIe 4.0 x4 bus, as well as tmpfs, in order to determine whether my drive had any measurable impact on bandwidth. It didn't.
Logs
Console output