Closed austinbhale closed 1 year ago
I'll look into this in detail in the morning, thanks a bunch for writing this up! The ARM vs ARM64 thing is ...unexpected to say the least, the big difference being that ARM64 uses .NET Native, and ARM generally does not.
I'll also note that you can specify more than one TexType, it's a bit flag. So TexType.Dynamic | TexType.ImageNomips will get you both at the same time.
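For anyone unfamiliar with bit flags in C#, here is a minimal sketch of how two options combine into one value. DemoTexType and FlagDemo are hypothetical stand-ins written for this example. The member names echo the thread, but the numeric values are assumptions and are not taken from StereoKit's TexType.

```csharp
using System;

// Stand-in for illustration only: the member names echo the thread, but the
// numeric values are assumptions, not StereoKit's real TexType values.
[Flags]
public enum DemoTexType
{
    None        = 0,
    ImageNomips = 1 << 0,
    Dynamic     = 1 << 1,
}

public static class FlagDemo
{
    // Combine both options into a single value with bitwise OR.
    public static DemoTexType Combined() =>
        DemoTexType.Dynamic | DemoTexType.ImageNomips;

    // True when every bit in 'wanted' is also set in 'value'.
    public static bool Has(DemoTexType value, DemoTexType wanted) =>
        (value & wanted) == wanted;
}
```

Because each member occupies its own bit, the OR-ed value carries both options at once, and either one can be checked independently with a bitwise AND.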
Edit: Also Single vs. Atlased shouldn't make a difference here, you should see logs in the console noting that Atlased isn't implemented, and it falls back to Single anyhow :)
Thanks so much for the speedy response!
I did a big facepalm after seeing the docs about Atlased not being implemented after making the table 😅 That's awesome that TexType is a bit flag!
So, I just tested the ARM application again with the TexType set to TexType.Dynamic | TexType.ImageNomips and am still receiving the following outliers:
min: 3ms, max: 46ms, median: 5ms
printing top 5 longest times:
#95: 7ms
#96: 7ms
#97: 8ms
#98: 10ms
#99: 19ms
#100: 46ms
These outliers are my main reason for creating the issue, but the ARM vs ARM64 comparison piques my curiosity too!
I also noticed that calling Mesh.SetVerts without calculating the bounds looks like it should simply be copying data in.
However, it could be that the Map operation is taking a varying amount of time: https://stackoverflow.com/questions/40808759/id3d11devicecontextmap-slow-performance
Perhaps an option to specify the D3D11_MAP_FLAG_DO_NOT_WAIT flag could boost performance?
I tested out a very similar design to this sample program for the ARM architecture on HL2 and received outliers that are a factor of ~3-10 of the median time.
MeshRenderer: 163ms
MeshRenderer: 30ms
MeshRenderer: 31ms
MeshRenderer: 27ms
MeshRenderer: 106ms
MeshRenderer: 27ms
MeshRenderer: 27ms
MeshRenderer: 28ms
MeshRenderer: 41ms
MeshRenderer: 20ms
Yeah, I do suspect that a lot of the issue here is related to interaction with the GPU, which can be kinda complicated. Basically, if the GPU is busy doing something like drawing the previous frame, then it has to wait until the GPU unlocks before a Map can occur.
D3D11_MAP_FLAG_DO_NOT_WAIT is perhaps part of the solution, but that flag IIRC should cause the Map call to error out instead of waiting! You can then attempt to Map later on when the device isn't busy, but rescheduling that task is not exactly simple in this case. There needs to be some amount of interaction with SK's async asset pipeline here to let these slide until later in the frame, or to the next frame.
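To make the "try now, retry later" idea concrete, here is a rough sketch of a retry queue. It is not StereoKit's asset pipeline: DeferredUploads and its members are hypothetical names, and each Func<bool> stands in for a non-blocking Map attempt. In native D3D11 such an attempt reports DXGI_ERROR_WAS_STILL_DRAWING when the GPU is still using the resource, which this sketch models as returning false.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of StereoKit or D3D11.
public sealed class DeferredUploads
{
    private readonly Queue<Func<bool>> _pending = new Queue<Func<bool>>();

    // Queue an attempt. It should return true once the upload succeeded and
    // false if the resource was still busy (the "would have waited" case).
    public void Enqueue(Func<bool> tryUpload) => _pending.Enqueue(tryUpload);

    public int PendingCount => _pending.Count;

    // Try each queued upload exactly once. Anything still busy is re-queued
    // for a later call. Returns the number of uploads that completed.
    public int Pump()
    {
        int done = 0;
        int attempts = _pending.Count;
        for (int i = 0; i < attempts; i++)
        {
            Func<bool> attempt = _pending.Dequeue();
            if (attempt()) done++;
            else _pending.Enqueue(attempt);
        }
        return done;
    }
}
```

Calling Pump once later in the frame, or once per frame, lets busy uploads slide without blocking the main thread. The cost is that the texture data may land a frame late, and a real version would also need to decide what to do when a newer frame of data arrives while an older one is still queued.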
Description
When testing my HL2 application, I started to notice my color image sprites would occasionally have "performance hiccups" in their IStepper's Step function. If this function call takes too long on the main thread, it blocks all the other ISteppers from rendering. I then noticed inconsistent performance in their texture's SetColors method. The main variables to take into consideration for quicker updates of a texture's color data seemed to be:
Platform / Environment
Test Program
A simple UWP C# single-file (Program.cs) application was used for testing:
Logs or exception details
All timing measurements were made over 100 render frames. For each variable, I performed the measurement 3 times and kept the most average-looking results.
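The original test program is not reproduced here, so the following is only a sketch of how per-frame timings like these could be collected and summarized into min, max, median, and the longest few samples. FrameTimer is a hypothetical helper, and the call being timed is whatever delegate you pass to Measure.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper for collecting and summarizing timing samples.
public sealed class FrameTimer
{
    private readonly List<double> _ms = new List<double>();

    // Time one call of 'work' and record the elapsed milliseconds.
    public void Measure(Action work)
    {
        var sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        _ms.Add(sw.Elapsed.TotalMilliseconds);
    }

    // Record a sample that was measured elsewhere.
    public void Add(double ms) => _ms.Add(ms);

    public int Count => _ms.Count;
    public double Min => _ms.Min();
    public double Max => _ms.Max();

    // Median of the samples (mean of the two middle values for even counts).
    public double Median()
    {
        if (_ms.Count == 0) throw new InvalidOperationException("No samples recorded.");
        List<double> sorted = _ms.OrderBy(x => x).ToList();
        int mid = sorted.Count / 2;
        return sorted.Count % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // The n longest samples, returned in ascending order.
    public IReadOnlyList<double> Longest(int n) =>
        _ms.OrderByDescending(x => x).Take(n).OrderBy(x => x).ToList();
}
```

In a setup like the one described, Measure would wrap the SetColors call once per render frame, and the summary would be printed after 100 samples.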
Legend
[Legend entries not recoverable; the surviving fragments refer to counts of n frames and n/2 frames.]
Diagnostic times
Discussion
Dynamic and ImageNomips from the start? Could this be contributing to the two outliers as seen in ARM -> Dynamic -> Atlased/Single?