Closed kvark closed 6 years ago
what do we do about generic support for backends (current Backend trait)? Ash doesn't have generic-based API
You probably could implement the DeviceV1_X traits for hal. But I haven't designed it to be a generic backend, but I am open to changes. I guess a more static interface would make more sense here.
I haven't really used hal that much but I like what I saw. It feels like vulkan but with bit more "rusty" api than ash.
How would hal differ from the portability lib? How would vulkan extensions work in hal?
I am also willing to move ash into the gfx-rs organization.
I tend to agree that our layer become very thin mostly because it's pretty clear now that we can implement ~90% of Vulkan without major obstacles and can stick to the API pretty closely. Ash is pretty close to Vulkan. On the other hand I'm not sure if we just expose the raw C API in the backends:
At the end we can still keep the current gfx-hal API with the generic backend interface (using macros). We basically would invert the current hierarchy: gfx-hal implemented on top of portability.
I see lots of great input, thanks for prompt responses!
@MaikKlein
You probably could implement the DeviceV1_X traits for hal
That's a cool idea, we should try it out.
How would hal differ from the portability lib?
The only difference, I suppose, would be that gfx-rs is still a Rust API.
How would vulkan extensions work in hal?
The same way they work in Ash: exposed to the user and then requested explicitly. I suppose the "requested" part is actually at the type level in Ash, right? In this case, the portability layer would just have a bunch of Option<SomeExtension>
populated when it's requested by the user.
I am also willing to move ash into the gfx-rs organization.
Awesome! That would be highly appropriate if we go this (wedding) path.
@msiglreith
On the other hand I'm not sure if we just expose the raw C API in the backends
That's another crazy idea, right here :) Wouldn't it imply that all the Rust apps then have to cross the C boundary twice when basing on gfx-rs? First, between Ash and the backend. Second, between the backend and the native C API. This would be a regression from what we have now. Another negative side is that each backend would have to deal with unsafe inputs all the time, which means more boilerplate/redundancy and less safety (obviously). As for the listed benefits, the former is largely negated having by gfx-hal on top of the portability (roughly the same code as gfx-hal below the portability, it seems), if we keep it. And the last two benefits seem to be equally true for the current portability library, don't they?
We basically would invert the current hierarchy: gfx-hal implemented on top of portability.
What I was trying to say in the issue is that I don't see much value in whatever HAL provides on top of raw backends at the moment. It's another "gfx-render" story, just lower level. Better shoot straight for the usable higher level APIs from here.
The big architectural concern I missed originally is that Ash types are copyable, and HAL takes everything by reference. This allows us to have bigger structs with custom data used by the backends, but it also gives us a bit of a headache (e.g. we still require CommandBuffer: Clone
, which is ugly and annoying). If we are to make those types all be basically usize
(or *mut _
), the consequences are going to be:
Send
/Sync
? (I guess this applies to Ash wedding in general)Handle
/Box
like @msiglreith mentioned)Handle
/Box
wrapping internally... which is a clear regression from the current use of gfx-hal (from Rust).The last point makes me especially worried... One of those clients is WebRender, and we simply can't afford to drop the performance here without a strong reason to do so.
I like this idea, in general, HAL does seem to me to be a bit confused at the moment. As far as I see it, there are two ways of doing it, at least that I can think of:
HAL is the Ash traits (which interestingly could probably be generated), a very thin portability layer which essentially just converts the C ABI to the Rust ABI, and the backends implement the Ash API (the Vulkan backend would hopefully get optimized away).
HAL essentially being most of Ash with the backends implementing the Vulkan C API. So things wanting to use VkPortability can just use the backends without HAL at all. This may be harder to manage as their can't be any guiding traits.
The first way seems to be the more sensible option as it uses more of Rust's strengths and involves fewer ABI changes. However, the second method is less reliant on the needed optimisations being performed as it involves fewer steps in the VkPortability case.
I'm confused as to why handles and boxing is going to go away. They are part of the Vulkan spec and are going to have to exist in some form. I'm thinking this change is going to move more control into the backends which I would assume can only decrease overhead.
I like this idea, in general, HAL does seem to me to be a bit confused at the moment.
yes due to its history, originating from creating a middleground between those APIs similar to the Forge project. The VkPortability stuff wasn't on the table back then.
I'm confused as to why handles and boxing is going to go away.
Handle stuff is in the portability branch only, HAL basically returns the pure objects.
Wouldn't it imply that all the Rust apps then have to cross the C boundary twice when basing on gfx-rs?
App -> Ash -> HAL, no C-boundaries involved considering changing the trait interfaces/implementations of Ash a bit.
Another negative side is that each backend would have to deal with unsafe inputs all the time, which means more boilerplate/redundancy and less safety (obviously).
Yes, kind of. We would need to due to ptr->slice conversions in the backends. Based on my experience with rostkatze I would say it's marginal. My C++ coding style became quite rusty and I didn't feel like a major concern to me. Another regression in general will be lack of iterator based interfaces but I don't think we can do much about it.
As for the listed benefits, the former is largely negated having by gfx-hal on top of the portability (roughly the same code as gfx-hal below the portability, it seems), if we keep it.
I tend to agree that keeping the current HAL implementation would be a bit more work. But compared to the current situation the bottom layer API would be fixed, requiring only version updates.
And the last two benefits seem to be equally true for the current portability library, don't they
mmh yes, but having a wrapper of a wrapper is a bit more scary :D
What I was trying to say in the issue is that I don't see much value in whatever HAL provides on top of raw backends at the moment.
I fully agree on that point. IMO the maintenance costs grew over the time: Changing bits of the current API is quite costly in terms of updating depending crates additionally to trying coming up with some sort of rusty API. My hypothesis is that Ash over raw C doesn't provide enough benefit:
the (non-Vulkan) backends would then have to do the Handle/Box wrapping internally... which is a clear regression from the current use of gfx-hal
Via the host allocator interface it should be possible to mostly negate the costs.
@msiglreith thank you for clarification! I think I understand now the idea - you are basically saying that each backed will have a different implementation for gfxXXX
functions (we currently have in portability) and the basic types, correct?
Via the host allocator interface it should be possible to mostly negate the costs.
This is a particularly interesting (and promising) point!
As @AIOOB correctly identified, we have multiple options now:
Trying to bring more structure to the arguments voiced in the discussion, I came up with this table:
HAL | Ash | Portability | |
---|---|---|---|
User base | us | us + Ash | us + Ash + Vulkano |
Rustyness | good to have iterators and borrowed objects | slices are good, but objects have to be passed by raw pointers | zero |
Safety | pretends to provide some with type sugar, but still horribly unsafe | not safe | totally not safe |
Overhead for Portability | all types are boxed | zero | N/A |
Overhead for Vulkan | need to collect iterators into temporary vectors | N/A | N/A |
Overhead for others | minimal | all types are boxed | all types are boxed |
Maintain HAL and Warden | yes | possible | no |
First of all, the current HAL approach is not very sound. It has all the bits and nits of Vulkan in it, with the associated unsafety, which is not consistent with the type semantics/constraints we are trying to preserve... For example, the clear color iterator doesn't have to be of the same length as the number of color attachments, yet some of the elements are ignored. Or the fact our CommandBuffer: Clone
while other resources are not, and there should no be reason for this one to be cloned.
Secondly, with Ash having the traits already (and some user base), it's tempting to try implementing those. Dealing with slices are a pleasure, and Ash has nice object-oriented API that is nice to work with. However, it does require us to box all the resources like we do in Portability, turning them into bare mutable pointers that have to be unsafely(!) dereferenced at every spot... so it's not like we can have even a minimum type safety in the backend implementations.
That last point is tipping my scales towards the last solution, ever slightly. I'll keep chewing through those in the meantime and appreciate more input. Overall, I'm glad we haven't rushed with HAL after all :)
After turning it over for a while I do think the “portability” option seems like a good one... in addition, Vulkano and/or Ash could probably be written on top of this “new HAL”/portability lib and get the benefits of the cross-platform support while paying little to zero overhead when actual Vulkan is available. Also, documentation and knowledge base/user base would be able to be shared with Vulkan basically directly so that it’s not just “if you know Vulkan you can pick it up pretty easily,” but rather “if you know Vulkan then you know this.”
Related github issue in Vulkano: https://github.com/vulkano-rs/vulkano/issues/525
I think I got the "User base" metric wrong. Since the portability layer is always going to be there, Vulkano and others can just use that (either the Rust library or straight as a dynamic system lib) anyway (and they aren't really blocked to do so even now). So having us to use it as a basis for backend implementations doesn't change anything for those users. Please correct me if I'm missing something... With this out of the way, let's focus on the remaining differences between Ash and Portability paths.
Ash:
Handle
-like structure (currently used in Portability)Portability:
Here is a more detailed transition plan for Ash that I think is reasonable:
unimplemented!()
- to be discussed).ash
crate into gfx-rs organization, have it defining those function pointers and implementing the traits through them. Perhaps, it should be a part of the same repository?..Handle
) need to be moved over (into some common
/auxil
module). I'm also wondering... given what's going to be left of Ash and Portability (if following this plan) is mostly dealing with function pointers of Vulkan, maybe those two at least need to be together in a repo?Ideas, feedback, and alternatives plans of actions are welcome!
It seems like the current discussion is about ash vs raw-c for the implementation of the backends. Independent of the chosen way we will provide an Ash and a portability interface to the users. So for me the main points are the ease of implementation for backends + layers and maintenance in form of Vulkan 1.1+ support.
Therefore the 2 paths would be backend -> Ash -> RawC
and backend -> RawC -> Ash
:
backend->Ash
vs backend->RawC
Ash->RawC
vs RawC->Ash
Mostly a matter of how good we can autogenerate these layers. Not aware of the Ash internals here. cc @MaikKlein
I'm not really favoring on approach over the other atm as it's hard for me to estimate the work required at the ash side and autogeneration. Both interface implementations (Ash and RawC) should at the live in the same repository at the end.
The traits in Ash all have a default implementation based on the function pointers. If you go the RawC way I think I could abstract over the "function pointers".
trait InstanceRawC {..}
trait DeviceRawC {..}
//...
impl InstanceRawC for FunctionPointers {..} // This are the function pointer in ash
impl DeviceRawC for FunctionPointers {..}
pub struct Gfx { .. }
impl InstanceRawC for Gfx {..}
impl DeviceRawC for Gfx {..}
Where it sort of could look like this
// This is the function pointer struct in ash
pub struct InstanceFnV1_0 {
destroy_instance:
extern "system" fn(instance: Instance, p_allocator: *const AllocationCallbacks) -> c_void,
//...
}
impl InstanceRawC for InstanceFnV1_0 {
unsafe fn destroy_instance(
&self,
instance: Instance,
p_allocator: *const AllocationCallbacks,
) -> c_void {
(self.destroy_instance)(instance, p_allocator)
}
//...
}
pub struct Gfx {
// Potentially empty
}
impl InstanceRawC for Gfx {
unsafe fn destroy_instance(
&self,
instance: Instance,
p_allocator: *const AllocationCallbacks,
) -> c_void {
gfx::destroy_instance(instance, p_allocator)
}
//...
}
I possibly have to create multple traits for each Vulkan version but that shouldn't be a problem. I think generating something like this should be trivial.
The arguments possibly have to be transmuted because the portability lib currently defines their own types.
I haven't put too much thought into this, but I think something like this should work. I would need to see how you would implement the other way, maybe with some pseudo code, then I can judge how easy/hard it is to generate the implementation.
I am also not too familiar with gfx to understand what exactly you want to do, if I got the RawC -> Ash
implementation wrong, just tell me.
RawC is a fixed API, how for is Ash away from stabilizing (error handling, extensions, completeness)?
Now that I have the generator branch almost done, I also wanted to write a small script that checks which parts of vk hasn't been exposed in the higher level layer. I am also thinking of generating the high level layer, instead of manually wrapping all the functions. I am just not sure yet how much work that would be, especially with all the corner cases.
I've just redone bitflags and enums, you can also follow my process of transitioning to 1.1 here https://github.com/MaikKlein/ash/issues/52#issuecomment-403392116. I am fairly open about all the changes.
As for extensions, the generator branch exposes everything, but many higher level extensions haven't been written yet.
As for the error handling rework, I assume you are referring to https://github.com/MaikKlein/ash/issues/41. I haven't had the time to work on it, but any help is appreciated. I am also toying with the idea of custom errors for every function.
Also the dynamic library loading needs to be reworked. Ash currently uses a yanked libloading library and this is one of the problems why the Vulkan backend of the portability lib stack overflows.
Btw if you would like to see any bigger changes to Ash, now is the time to voice them.
Thanks for the detailed response! Looks promising and nice to see that you are working on generating things from xml already.
It seems for the RawC path the best way would be something like:
// dx12/lib.rs
pub fn create_instance(..) { .. }
// ash_wrapper.rs - autogenerated
impl InstanceRawC for Gfx<BackendDx12> {
unsafe fn create_instance(&self, ...) { dx12::create_instance(..) }
}
// portability.rs
// export the required 3(?) functions
// portability_static.rs - autogenerated, optional static library for best performance
#[no_mangle]
pub extern "C" fn vkCreateInstance(..) {
back::create_instance(..)
}
If I understand it correctly the backend->Ash route would instead target the current Instance
traits on the higher level in contrast the the backend->RawC route implementing InstanceRawC
?
@MaikKlein is InstanceRawC
only needed because you can't auto-generate the implementations of the actual InstanceV1_0
and such?
I feel like the RawC (what I called Portability) and Ash paths aren't really that different. If we consider the total work that needs to be done by {gfx-rs, gfx-portability, ash}
projects, than the chosen path only re-shuffles the work by doesn't add or subtract any. And we'd like to piggy back on the same generator infrastructure that @MaikKlein is rolling out, so taking Ash work into consideration makes sense to me (regardless of the chosen path).
If we go RawC, then:
XxxRawC
-> gfx path is auto-generatedXxx
-> XxxRawC
still needs to be written by handIf we go Ash, then:
gfxSomeFunction
things (like we have now), and then static/dynamic VK bindings auto-generated (like in RawC path).XxxRawC
are not neededSo the amount of work seems to be roughly the same. The Ash path doesn't need XxxRawC
traits, which is a win, not to mention nicer backend implementations for the slices in the API.
After wondering about why Ash specifically only has traits for the instance and device, I figured we need to place our choices on a linear scale according to the amount of type sugar they add over raw Vulkan:
pub extern "C" fn gfxCreateRenderPass(VkDevice, *mut VkAttachment, u32)
. We already have those in portability, and this is precisely what "RawC" path (discussed above) is about.pub fn create_render_pass(VkDevice, &[VkAttachment])
. We are unlikely going to be able to generate RawC -> this code, so from here and below we'd need some hand-written wrapping to/from RawC.Instance
, PhysicalDevice
, Device
, Queue
, and CommandBuffer
, e.g. fn create_render_pass(&self, &[VkAttachment])
. All objects (including the dispatch-able ones) are still passed around by their (copyable) handles. Dispatch-able objects can enforce the Send/Sync bounds.Backend
trait, passed by reference (or Borrow
) everywhere. All objects enforce Send/Sync.The current HAL (or rather, it's raw layer that backends care about) is something akin 4.5: it goes an extra mile by introducing more traits (for things like a swapchain or descriptor pool) and accepting generic iterators instead of slices. Naturally, we could consider going strictly to 4 as one of the ways to proceed. It's probably the least disrupting one, which is a small plus. The problem with it is some overhead due to the handle wrapping happening on the portability side: it needs to collect the reference lists somewhere before passing to HAL.
Interestingly, Ash is around 2.5: it has only two (out of 5 specified) dispatch-able objects. I find this a bit unsound (for our needs), and the reason for this design choice seems to come from the fact Vulkan function pointers are obtained for/by the device and instance. This is a fine argument for Ash, but it's not a convincing argument for us in terms of Ash traits since we specifically want to disconnect the traits from function pointer logic.
With this in mind, the only really sound approach, involving no overhead on Vulkan or portability by itself, is 1 (RawC). I am wondering though, could it be possible to auto-generate the traits for 3? In this case, we'd get the benefits of direct mapping as well as nicer types.
How will the proposed changes affect the future of the opengl backend? Will the proposed changes require opengl backend to map better to ash stuff? Will window/surface/instance creation HAVE to be merged between the opengl backend and non-opengl, or will it stay as a nice to have (and a requirement for using it in portability, of course)?
@kvark You can also use Ash as 1. As raw C functions are exposed.
@ZeGentzy I think it will still be fine for the GL backend to provide specific initialization routines, if we don't find a way to fit it into Vulkan API completely.
@omni-viral In this case, Ash provides nothing to us - we already have those functions in portability.
With this in mind, the only really sound approach, involving no overhead on Vulkan or portability by itself, is 1 (RawC). I am wondering though, could it be possible to auto-generate the traits for 3? In this case, we'd get the benefits of direct mapping as well as nicer types.
Is it possible to code-gen both the traits and C externs (and implementations of these that call our traits)? Then we just have to be concerned with aligning HAL and Vulkan, and most of the existing portability is simply code-gen'd.
This is a fine argument for Ash, but it's not a convincing argument for us in terms of Ash traits since we specifically want to disconnect the traits from function pointer logic.
I am fine with disconnecting it. I just implemented it that way because it was easy. I would just do a generic impl for Ashs function pointers.
Interestingly, Ash is around 2.5: it has only two (out of 5 specified) dispatch-able objects. I find this a bit unsound (for our needs), and the reason for this design choice seems to come from the fact Vulkan function pointers are obtained for/by the device and instance.
I am open for ideas but I am not sure how nicely this would map to Ash. I'll have a look at the HAL traits to get an idea how you are using these dispatchable objects. I assume this is not a problem for the other backends because all the function pointers are loaded globally? We could do something similar in Ash, if we get function pointers with "runtime dispatch". (Retrieve device level fp from the instance). Otherwise we would need something like an Arc
in those objects. Currently instance and device own their function pointers.
I find this a bit unsound (for our needs),
Could you explain this statement? What exactly would make it unsound? I assume you might need to abstract over those "dispatchable" objects? Like impl CommandBuffer for MetalCommandBuffer
?
With this in mind, the only really sound approach, involving no overhead on Vulkan or portability by itself, is 1 (RawC). I am wondering though, could it be possible to auto-generate the traits for 3? In this case, we'd get the benefits of direct mapping as well as nicer types.
As you mentioned 2/5 traits are already in Ash, but they are all written manually. I think it should be possible to generate them but it is a bit painful as you don't really have any type information. I am also not sure how many edge cases there would be.
We had a call today, discussing the positions on this issue. I was about to write up about our conclusions and close the issue, but then something unsettling got in - the feeling we haven't captured all the right arguments for the chosen paths yet.
The first misunderstanding that I got is about Ash rustiness: "slices are good". Ash only has slices at the top level, and anything deeper (e.g. structure fields) are still raw pointers. So Ash doesn't get us much further on the safety/convenience scale than RawC.
The next misunderstanding, and it may be a crucial one, is that making the backends operate on raw handles would be just shuffling the work around and not have any impact on the end users in terms of performance. This is true for the portability - either the handle dispatches happen before the backend boundary or after - doesn't matter. But this is certainly not true for any Rust user code. Ability to move the actual structs around and wrap them into bigger logical constructs (e.g. WebGPU implementation) can be saving quite a bit of indirection. I don't want to penalize Rust applications running on Metal/D3D12, and I think the existing move semantics in HAL is both convenient and useful. Perhaps, the unsoundness of some of the pieces could be patched instead of throwing out the whole thing?
So here is an updated proposal for the (current) HAL way:
Backend
typesHandle
wrapping. It may do so more efficiently, and it make take full responsibility to use the allocator callbacks to place those concrete types in user-defined memory.I think the decision largely depends on one metric - the ratio of users using our Vulkan backend, and their performance expectations. Note that this only applies to Rust users - portability layer doesn't care about the Vulkan backend as much. If our users want to target native Vulkan and consider everything else a fallback on those walled garden ecosystems - then going with RawC makes sense. If the users want to choose the popular platforms and run their preferred APIs, then the current HAL makes more sense.
If our users want to target native Vulkan and consider everything else a fallback on those walled garden ecosystems
This match my perspective pretty well, but I'm just a single user and have yet to start using gfx-rs in production.
@anderejd my main case of consideration is the browser tech.
WebRender is written in Rust and is much higher level than gfx-hal. It will enjoy the fewer indirections when accessing any resources, and the performance difference here could be very important. The main target platform of WR today is Windows, and I expect our D3D12 and D3D11 backends to be preferred there. Vulkan doesn't have as wide of a support, and it's worse in terms of interoperation with other Windows-specific systems, e.g. hard-ware media decoding (video playback). So for WR the Vulkan backend would be a solution on Linux and an alternative on some Windows systems.
WebGPU is a higher level API that is tempting to implement and validate in Rust. It would have bigger/fatter structures, so it would benefit from fewer indirections. Ultimately, it would want to use the same native API backend as the browser, which means D3D again on Windows.
Thanks for the detailed reply, I don't have enough experience in this domain to have an opinion on the way forward, just adding my perspective.
Regarding WebRender though, what about Android and iOS? Optimizing for D3D seems illogical to me, in that context.
Android would be Vulkan-as-first-class, but iOS would of course be Metal, and optimizing for D3D would also mean better optimizing for Metal.
On Jul 12, 2018, at 2:15 PM, anderejd notifications@github.com wrote:
Thanks for the detailed reply, I don't have enough experience in this domain to have an opinion on the way forward, just adding my perspective.
Regarding WebRender though, what about Android and iOS? Optimizing for D3D doesn't seem logical to me, in that context.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/gfx-rs/gfx/issues/2206#issuecomment-404653188, or mute the thread https://github.com/notifications/unsubscribe-auth/AA2Ln0vJxZZHpgVjyny6RM2iQzJellXmks5uF7x2gaJpZM4VCsJC.
Here is some news: HAL is here to stay. The justification has been mostly provided above ^, plus new ideas by @msiglreith to be expressed and discussed in a separate RFC (codename "BAL". Edit: #2249). Whether or not HAL can some day become stable is also still on the table, but largely outside of the scope of this issue. As for my current complaints about unsound API of HAL, I filed issues #2247 + #2248 and I hope we can fix them.
Ash is the low-level Vulkan binding that we use in the backend: https://github.com/MaikKlein/ash gfx-rs has seen a lot of evolutionary development, a bit too much even. It went from a D3D11-like pre-ll to Vulkan-like HAL, and it did so gradually. Bit by bit we are rewiring HAL to serve the needs of portability, which makes it less safe and looking more like Ash. As we do this, the value of our type sugar drops as well, since it's not backed by the safety guarantees much. Vulkan backend becomes more and more just a straight translation to Ash.
It also becomes fairly clear that HAL is not usable by small teams who don't focus on performance. This is something that wasn't obvious until we made a full turn to VkPortability (and learned about how complex Vulkan really is). Thus, the type sugar we have is going to be hidden/wrapped by a higher-level library, such as Amethyst, GGEZ, or WebGPU in the future.
With that in mind, I'd like to propose a crazy idea of making HAL a drop-in replacement for Ash. Here are the expected results from this grand move:
Concerns required for resolving:
Backend
trait)? Ash doesn't have generic-based APIThoughts? Complaints? This is just an idea on the table so far ;) cc @MaikKlein @msiglreith , and everyone else is welcome to join the discussion!