Traverse-Research / gpu-allocator

🦀 GPU memory allocator for Vulkan, DirectX 12 and Metal. Written in pure Rust
https://traverse.nl/
Apache License 2.0
374 stars 49 forks source link

Size of fully vendored `gpu-allocator` with `"d3d12"` is quite large #181

Open jimblandy opened 11 months ago

jimblandy commented 11 months ago

Firefox would very much like to use gpu-allocator in its Direct3D12 backend for WebGPU on Windows, but we cannot:

As a result, Firefox's WebGPU implementation uses wgpu's Direct3D12 backend without enabling the windows-rs feature, meaning that we use an older, slower path that does not suballocate buffers. We'd like to put this behind us and begin using gpu-allocator to do things right.

Right now, running cargo vendor in a crate that simply depends on gpu-allocator with the "d3d12" feature enabled produces a 238MiB vendor subdirectory.

Is there any way that gpu-allocator could reduce the size of its vendored footprint?

Those are just two strategies that occur to me, but any tactic to avoid bringing in windows-rs will serve.

Mozilla is willing to do the work. We're just looking for a plan that would be acceptable for inclusion in this code base; forking would be a waste of time and attention in all the generally-understood ways.

Thanks very much for considering our situation!

Background and prior discussion

107 goes into the reasons gpu-allocator migrated to windows-rs - a move that makes perfect sense.

gfx-rs/wgpu#3207 is the wgpu issue covering this question.

cc: @erichdongubler

Elabajaba commented 11 months ago

Afaik bindgen isn't really usable for public apis because you end up with unique types that aren't compatible with the ones from eg. windows-rs or winapi.

edit: The only solution for this I've been able to come up with is a gfx-windows crate (or just reworking the d3d12 crate) of windows-bindgen generated bindings to the needed dx12 subset of windows-rs. It's a crappy solution since it wouldn't interoperate with people using windows-rs, and requires either forking or getting upstream crates to maintain a gfx-windows path (and releasing when wgpu needs a release so it can bump the bindings version).

MarijnS95 commented 11 months ago

We previously had a winapi interface in the public API (maybe we still have it) and could do the same for windows by generalizing the default interfaces to use raw Rust types (e.g. void pointers) and rely on the COM bindings via windows-bindgen internally. Then, an optional windows-rs feature could turn on the convenient/type-correct interfaces, even if this requires casting from that type to the internally generated (but identical) type on every API entry point.

MarijnS95 commented 11 months ago

We did try another approach on the microsoft-directx branch where a more minimal d3d12 crate generated from the Agility SDK was used, but that effort was ultimately abandoned since the Agility SDK bindings are now immediately consumed by win32metadata and windows-rs.

Jasper-Bekkers commented 11 months ago

Thanks for filing the issue; I understand the concern and we're also constantly fighting windows-rs because of the compile times and it's size. For us, however, winapi is a dead-end street. I'm not willing to invest in it anymore and I'd rather work with Microsoft to resolve the issues around windows-rs then to keep supporting winapi. (We used to do this, but we had to keep painfully add new Dx12 apis to winapi every time we wanted to use them so we abandoned it).

Therefor, I'm very open to the suggestion of replacing windows-rs with windows-bindgen, I think it's the path forward for most of these kinds of "low level" crates. I think before we go down that path, I'd like to highlight that because Traverse Research is doing mostly cutting-edge research into state of the art rendering we'd like to always keep the option to use whatever new Dx12 api's become available either through version updates or agility sdk updates. Having said that, I'm ok with having to "opt in" to those features for example (through build-time features even), but we should at least have an easy way to update the bindings if we do go down this path.

@jimblandy and @Elabajaba If we go down the path of having Firefox ship with gpu-allocator (something that sounds appealing to me), I think it would be good to set up some more process on our end as well and potentially make one of you two maintainers on this project. We're currently only using gpu-allocator with our own renderer (so not with wgpu) for example and we could do with a bit more testing as part of our release process. I'm not entirely sure what the expectations are on Firefox' side but I suspect they to be a bit more stringent then just slinging a new release every once in a while.

jimblandy commented 11 months ago

For us, however, winapi is a dead-end street. I'm not willing to invest in it anymore

Understood. This is not unexpected.

I'm very open to the suggestion of replacing windows-rs with windows-bindgen, I think it's the path forward for most of these kinds of "low level" crates.

Okay. Then this sounds like our best path.

Yes, Firefox would need gpu-allocator to work on plain Windows installations, without extensions like Direct3D's Agility SDK (link for people like me who had to look that up). Given that Cargo features are supposed to be additive, having the shiny new things be "opt in" would probably be necessary. Maybe a single "agility" feature, which you could extend however you like, would be all that's necessary?

I don't think Firefox would need too much process from Traverse; most of the burden will fall on us. We already depend on lots of third-party crates much less organized than this one. We have an internal vetting process we must follow, driven by cargo vet, which requires a full audit for the first import and then incremental audits as we update. Releases would be helpful, but don't need to have everything on crates.io; we're already bringing in specific git commits of wgpu anyway.

At the moment Mozilla's attention is on just getting our implementation stood up the way it needs to be (good CTS results in CI, etc). I think it's going to be a few months before we'll be seeing issues like efficient buffer allocation showing up at the top of our lists. At that point, I'd probably ask my co-worker @erichdongubler to take point on this work.

@Elabajaba is an independent wgpu contributor, not associated with Mozilla, so if they want to work on this, that would be lovely - but that's up to them.

MarijnS95 commented 11 months ago

Let's be more clear here. The Agility SDK (with shipping out-of-Windows DLLs and exporting magic symbols from an exe) isn't something that has to be used. Rather, the canonical latest version of the Direct3D12 API is provided by Microsoft out-of-tree at https://github.com/microsoft/DirectX-Headers. It contains types and COM objects for newer interfaces on objects, but those are optional and cannot even be used if QueryInterface() doesn't give the handle containing that interface to you in the first place. You're in the clear here for plain Windows installations as long as you manually audit and limit what interface versions are used (and hence what functionality is available to you).

A great example of this is #168 where we use enums to accept a newer D3D12Device object to use the new (from enhanced barriers) BARRIER_LAYOUT type insted of the older RESOURCE_STATES, but it's optional.

Hence I don't think there's a need for an agility feature unless we really use cutting-edge features inside gpu-allocator (enhanced barriers should be well stabilized, but support is optional at runtime anyway).


Rather, I think there could/should be a feature that enables some convenience APIs that allow users to pass windows types into this crate, rather than untyped void pointers, or reexporting our locally-generated D3D12* types. Their layout, behaviour and implementation is the same (identical) but Rust will consider them different because they're defined in different places.


Back on topic: that microsoft-directx crate/branch serves as an older example. A more recent example from us employing windows-bindgen is https://github.com/Traverse-Research/amd-ext-d3d-rs/commit/fa55fdb98772c07f4ba0825a4a5f770131a09c05 but it generates from custom metadata that utilizes windows-rs types rather than generating a minimal windows-rs crate with just D3D12 code. It should only be an argument or two in bindings.txt to get there though. This crate will then have to depend on windows-core for the most basic of Winodws COM functionality - is Mozilla okay with that? It's a requirement for windows-bindgen but I've only seen an ack for windows-sys thus far.