enum/error types are ambiguous

mozilla / uniffi-rs

a multi-language bindings generator for rust

https://mozilla.github.io/uniffi-rs/

Mozilla Public License 2.0

2.68k stars 219 forks source link

enum/error types are ambiguous #2034

Closed arg0d closed 4 months ago

arg0d commented 5 months ago

https://github.com/mozilla/uniffi-rs/pull/1590 puts code generators into an awkard position, https://github.com/NordSecurity/uniffi-bindgen-go/pull/48. error type is removed, but its not clear what the final use cases for enum-error-object should be. Can errors be referenced as regular types, i.e. function arguments, return values, struct members? If error type does not exist, then a reference to error may be ambiguous depending on bindings implementation, and the aforementioned use case is not possible.

If we look at generated Go code, error and enum types are not interchangeable. enum is represented in a straightfoward enum manner, whereas error is implemented to support various Go error paradigms.

UDL

enum Error {
  "Network",
  "Disk",
};

Go enum

type RequestError uint

const (
    RequestErrorNetwork RequestError = 1
    RequestErrorDisk    RequestError = 2
)

Go error

type RequestError struct
func (err RequestError) Error() string
func (err RequestError) Unwrap() error

type RequestErrorNetwork struct { message string }
func (err RequestErrorNetwork) Error() string
func (self RequestErrorNetwork) Is(target error) bool

type RequestErrorDisk struct
func (err RequestErrorDisk) Error() string
func (self RequestErrorDisk) Is(target error) bool

Given these 2 distinct generate types, a reference to RequestError becomes ambiguous. Does RequestError refer to "error" or "enum" implementation?

UDL

namespace example {
    void do_something_with_error(RequestError error);
};

Go

func DoSomethingWithError(RequestError error);

mhammond commented 5 months ago

The builtin bindings all generate subtly different code for enums and errors. Whether a name is treated as an error or an enum is determined by whether it's actually used as an error. Kotlin, for example, uses a different base class in each scenario.

Given these 2 distinct generate types, a reference to RequestError becomes ambiguous. Does RequestError refer to "error" or "enum" implementation?

For the existing bindings, this would not be an error - it's just a param. For it to be considered an error, it would need to be used as a throws. Kotlin has the same basic problem - if that was the entirety of the UDL, the RequestError generated by Kotlin would not derive from Exception - but if there was a single function which used it as an error it would.

How often does it happen that you have an object you want treated as an error even though it never appears as an error?

arg0d commented 5 months ago

Whether a name is treated as an error or an enum is determined by whether it's actually used as an error

I understand this part. What I'm wondering is if there are plans to support using an enum both as "enum" and "error" in the same uniffi package.

mhammond commented 5 months ago

What I'm wondering is if there are plans to support using an enum both as "enum" and "error" in the same uniffi package.

Well it kinda is - these types are represented as enums, but are used as errors. Even before #1590, errors were represented as enums plus a kind of flag to indicate whether it was used as an error or not - which is roughly the same situation as now, except that flag is inferred.

Maybe I'm confused by the reference to #1590 - I guess I don't understand how that changed the fundamental properties here other than the possibility that an error isn't automatically flagged.

mhammond commented 5 months ago

To follow up further, errors can now also be interfaces, etc - so an object is capable of being both an "object" and "error" in the same package, in the same way enums are. In other words, the fact a type is an error is in addition to the type of the object. Many bindings need to use this flag to change the fundamental characteristics of the underlying type (ie, using a different base class, for example), but the objects are still generally able to be used as both.

arg0d commented 5 months ago

My confusion rises from your saying that errors are implemented as enums. In Kotlin, "flat" enums are implemented as enum class, but "flat" errors are implemented as class hierarchies. So clearly typeof enum != typeof error. Its the same case as with Go. Are there plans to support distinct enum and error types in generated bindings in the same uniffi package, or is the current implenenation of having either an enum or an error type gonna stay?

mhammond commented 5 months ago

My confusion rises from your saying that errors are implemented as enums.

Right - what I mean is that in the "component interface", most errors are represented as Type::Enum { .. } - this was effectively true even when Type::Error existed - they were serialized in exactly the same way, have the same "variant"s layout, etc.

However, for all bindings, errors are implemented differently than enums - ie, all bindings have EnumTemplate and ErrorTemplate - so each binding is free to generate a different type when an enum is used as an error, as you note above.

Part of the justification for #1590 was to allow errors to also be represented as Type::Object which now is true. There's no reason errors can't be any other type (eg, a plain string, an integer, a record, etc) even though there are no current plans to actually do that.

Are there plans to support distinct enum and error types in generated bindings in the same uniffi package, or is the current implenenation of having either an enum or an error type gonna stay?

I'm still struggling with that distinction, because IIUC all bindings generate different things for enums vs errors - as you note above, Kotlin generates different types for errors vs enums, so it effectively treats them as different types.

But the fact errors can now also be interfaces means the current implementation is likely to remain, but we are obviously open to making the lives easier for bindings where we can.

Given your example of Kotlin above, maybe it might help if we talk in terms of Kotlin? What would it mean to 'support using an enum both as "enum" and "error" in the same uniffi package.' in Kotlin terms?

arg0d commented 5 months ago

#[derive(Debug, thiserror::Error, uniffi::Error)]
pub enum RequestError {
    #[error("Network")]
    Network,
    #[error("Disk")]
    Disk,
}

#[uniffi::export]
pub fn produce_error() -> Result<(), RequestError> {
    Ok(())
}

#[uniffi::export]
pub fn consume_error(error: RequestError) {}

Right now the above proc macros generate the following bindings.

sealed class RequestException: Exception() {
    class Network() : RequestException() {
        override val message
            get() = ""
    }

    class Disk() : RequestException() {
        override val message
            get() = ""
    }

    companion object ErrorHandler : CallStatusErrorHandler<RequestException> {
        override fun lift(error_buf: RustBuffer.ByValue): RequestException = FfiConverterTypeRequestError.lift(error_buf)
    }
}

fun `consumeError`(`error`: RequestException) = ..

@Throws(RequestException::class)
fun `produceError`() = ..

When I'm talking about distinct types, I mean that RequestError enum would be used as a regular enum for function parameters, but at the same time used as an exception for error handling.

enum class RequestError {  
    NETWORK,
    DISK;
    companion object
}

sealed class RequestException: Exception() {
    class Network() : RequestException() {
        override val message
            get() = ""
    }

    class Disk() : RequestException() {
        override val message
            get() = ""
    }

    companion object ErrorHandler : CallStatusErrorHandler<RequestException> {
        override fun lift(error_buf: RustBuffer.ByValue): RequestException = FfiConverterTypeRequestError.lift(error_buf)
    }
}

fun `consumeError`(`error`: RequestError) = ..

@Throws(RequestException::class)
fun `produceError`() = ..

arg0d commented 5 months ago

I guess the actual reason I'm bringing up this discussion, is because the removal of dedicated Error type from UDL syntax tree creates a lot confusion. From my understanding, enum and error types are still distinct types in proc macros and in UDL. An enum defined in proc macros and UDL is always either an enum, or an error. So I'm quite confused why UDL syntax tree coalesces both enum and error types into singular type enum. With this loss of information, it becomes awkward to refer to enum. For every usage of enum, the code must be ammended to additionally check for ci.is_name_used_as_error(..). For example, if we take a look at Go, enum and error have different type signatures. Referring to an enum should be done using RequestError, but referring to an error should be done using *RequestError. This means that its not possible to emit type signature using type information alone. ComponentInterface must be consulted whenever emitting enum type.

mhammond commented 5 months ago

For every usage of enum, the code must be ammended to additionally check for ci.is_name_used_as_error(..)

I do see how this isn't ideal - but apart from convenience I still don't see how this is fundamentally different than before #1590. For example, part of the inconvenience could be addressed by changing the CI - eg, pub fn enum_definitions(&self) could change to only return items not used as an error, and new pub fn error_enum_definitions(&self)/pub fn error_object_definitions(&self) returns things that are.

The other inconvenience of checking is_name_used_as_error() when dealing with params etc isn't as clear - previously your templates or Rust code would have had something like:

{% when Type::Enum ... %} // or if in rust a plain old match
...
{% when Type::Error ... %}
...

whereas now you need:

{% when Type::Enum ... %}
{% if ci.is_name_used_as_error(name) %}
...
{% else %}
...

but it still seems like the fundamental capabilities are the same, just how the checks are made are different. When taking "interfaces as errors" into account, every Type::Error object would still need an additional check for whether the error is an enum or an object. So while I can see how you might prefer one way of spelling things over the other (and indeed, we might be able to come up with convenience helpers so you can choose), the number of checks that need to be made still seem identical in both cases.

In other words, I still fail to see what fundamental differences there are between now and #1590 - I see how the checks that need to be made are different, but the total number of final end-states seems the same to me. And assuming "interfaces as errors" is a good thing, something had to change here - you'd either need to say "here's an error object, is it an interface or enum?" or "here's an enum/interface object, is it an error?"

I'm sorry for being slow and I hope it's clear I'm not disagreeing with you, I'm just struggling to understand the actual problem (other than the fact that errors can now be a number of types means dealing with errors is less convenient)

arg0d commented 4 months ago

Thanks for the discussion. If enums/errors in UDL (and proc macros) remains as "distinct" types, i.e. an enum is either an enum or an error, but not both, then there is no ambiguity. Closing this issue.