kvark closed this 3 years ago
I ran the example on my computer, in case the extra data points help.
Results for 1K by 1K:

| API | OS | Hardware | image->image | buffer->image |
|---|---|---|---|---|
| DX12 | Win10 | Gtx1080 | 573 | 832 |
| DX12 | Win10 | Gtx1080 | 472 | 832 |
| Vulkan | Win10 | Gtx1080 | 479 | 22 |
| Vulkan | Win10 | Gtx1080 | 480 | 22 |

Results for 512 by 512:

| API | OS | Hardware | image->image | buffer->image |
|---|---|---|---|---|
| DX12 | Win10 | Gtx1080 | 136 | 211 |
| DX12 | Win10 | Gtx1080 | 119 | 207 |
| Vulkan | Win10 | Gtx1080 | 142 | 6 |
| Vulkan | Win10 | Gtx1080 | 133 | 5 |

bors r=tangm,kvark
Build failed:
bors r=tangm,kvark
Introduces a new example for benchmarking small transfers! Fixes a small case where a linearly tiled image is created on the known memory. This is still a hack, but more useful than the old code.
Interestingly, the AMD machine on Windows totally craps out when the total number of copy regions exceeds 4M per submission, so I had to keep the size down.
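For anyone who hits the same limit with a larger workload, one alternative to shrinking the size is to spread the copy regions over several submissions. The snippet below is only a rough sketch of that bookkeeping, not what this PR does: `split_into_submissions` and `MAX_REGIONS_PER_SUBMISSION` are hypothetical names, and reading "4M" as 4 * 1024 * 1024 is an assumption.

```python
def split_into_submissions(total_regions, max_per_submission):
    """Return how many copy regions to record in each submission.

    Every entry is at most `max_per_submission`, and the entries sum to
    `total_regions`.
    """
    if total_regions < 0:
        raise ValueError("total_regions must be non-negative")
    if max_per_submission <= 0:
        raise ValueError("max_per_submission must be positive")
    full, rest = divmod(total_regions, max_per_submission)
    sizes = [max_per_submission] * full
    if rest:
        sizes.append(rest)
    return sizes


# Assumption: "4M" is taken to mean 4 * 1024 * 1024 regions. The failure was
# reported when this count is exceeded, so the cap itself stays within it.
MAX_REGIONS_PER_SUBMISSION = 4 * 1024 * 1024

# Hypothetical workload of 10M regions, for illustration only.
sizes = split_into_submissions(10 * 1024 * 1024, MAX_REGIONS_PER_SUBMISSION)
print(sizes)
```

Each entry in `sizes` would correspond to one submission; whether splitting actually avoids the driver problem on that machine is untested here.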
On Metal and DX11, timestamps are not implemented yet.
PR checklist:
- `make` succeeds (on *nix)
- `make reftests` succeeds