Closed nicholasbishop closed 2 years ago
I've read your whole proposal, and overall I agree with this change (although I also understood the desire for correctness which motivated the initial introduction of Completion). The only big issue I see is with how the downstream users of the crate will adapt - while a CHANGELOG file would be a nice start, it will still be a breaking change for almost everyone.
Furthermore, what I'm not sure of is this:
I don't think there's much we can do to anticipate that, but I could imagine getting a vendor-specific bug or two about this.
I don't know how many firmware vendors implement custom status codes, but if I recall correctly there occasionally were people who asked for escape hatches for the Result/Completion type, precisely because they were using uefi-rs with some really weird specific protocols or low-level UEFI code.
I'd love to get some feedback from the osdev community on such a change, which is why I'd advise waiting for a few days (maybe a week) and seeing if we get any feedback. I'll pin the issue in the meantime.
(although I also understood the desire for correctness which motivated the initial introduction of Completion).
Totally agree that the introduction of Completion made sense! I don't at all think it was a bad choice, just one of those things where the tradeoffs may look different after some time in use.
The only big issue I see is with how the downstream users of the crate will adapt - while a CHANGELOG file would be a nice start, it will still be a breaking change for almost everyone.
Perhaps we could temporarily add something near the top of the readme (so it would also show up on crates.io in the next release), e.g. "Important breaking API change when upgrading from version 0.14.0: [...]".
It would be nice to have a smoother upgrade path like when something is just deprecated with a warning, but not sure that would be possible for an invasive change like this.
Furthermore, what I'm not sure of is this:
I don't think there's much we can do to anticipate that, but I could imagine getting a vendor-specific bug or two about this.
I don't know how many firmware vendors implement custom status codes, but if I recall correctly there occasionally were people who asked for escape hatches for the Result/Completion type, precisely because they were using uefi-rs with some really weird specific protocols or low-level UEFI code.
Ah interesting. I searched through issues mentioning error/warning/completion/result, but didn't spot anything specific like this. I guess this would primarily be an issue for wrapper functions that make multiple UEFI calls, since anything that just calls a single UEFI call and returns the status as a Result could presumably be handled on the application side.
I'd love to get some feedback from the osdev community on such a change, which is why I'd advise waiting for a few days (maybe a week) and seeing if we get any feedback. I'll pin the issue in the meantime.
That sounds good, I'd be happy to wait longer than a week too, since I realize there's a good chance someone with relevant input won't happen to see this issue in a short time span. Maybe we could mention it in the next This Month in Rust OSDev to give more people a chance to see it?
I don't have time for a lot of OSDev anymore, but Completion was one of the only non-intuitive bits of uefi-rs I came across when writing my UEFI bootloader. So I'm very much in favour of this change :)
At a high level, my suggestion is that by default we should treat all warnings as errors, and only consider special handling of warnings on a case-by-case basis in functions that wrap UEFI functions where the UEFI spec explicitly mentions a warning might be returned.
This sounds like a really nice way of handling it without compromising correctness, +1
I think this sounds great. I think most of the ergonomics questions then boil down to how ResultExt will look, and how we make it easy to do the "right" thing.
Old definition:
pub trait ResultExt<Output, ErrData: Debug> {
    fn status(&self) -> Status;
    fn log_warning(self) -> core::result::Result<Output, Error<ErrData>>;
    fn unwrap_success(self) -> Output;
    fn expect_success(self, msg: &str) -> Output;
    fn expect_error(self, msg: &str) -> Error<ErrData>;
    fn map_inner<Mapped>(self, f: impl FnOnce(Output) -> Mapped) -> Result<Mapped, ErrData>;
    fn discard_errdata(self) -> Result<Output>;
    fn warning_as_error(self) -> core::result::Result<Output, Error<ErrData>>
    where
        ErrData: Default;
}
New Definition:
pub trait ResultExt<Output, ErrData> {
    fn status(&self) -> Status;
    fn log_warning(self) -> Result<Output, ErrData>
    where
        ErrData: Into<Output>;
    fn ignore_warning(self) -> Result<Output, ErrData>
    where
        ErrData: Into<Output>;
    fn discard_errdata(self) -> Result<Output>;
}
warning_as_error is now the default, so is no longer needed. map_inner, unwrap_success, expect_success, and expect_error can now just be the normal map, unwrap, expect, and expect_err methods for core::result::Result. We can also get rid of the Debug constraint.
I'm not sure how annoying the ErrData: Into<Output> constraint would be in practice. Seems fine as our usual recommendation would be "you almost certainly don't need these methods". We could add a handle_warning method if we deem it necessary.
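As a rough illustration of the point about map_inner, unwrap_success and friends, here is a minimal sketch using only core::result::Result. The Error struct, the Result alias, and the block_size / sectors_to_bytes functions are hypothetical stand-ins invented for this example, not the uefi-rs API.

```rust
/// Hypothetical, simplified error type standing in for uefi::Error.
#[derive(Debug, PartialEq, Eq)]
pub struct Error {
    pub status: usize,
}

/// With no Completion wrapper, the alias is a plain core Result.
pub type Result<T = ()> = core::result::Result<T, Error>;

/// Hypothetical wrapper around some firmware call.
pub fn block_size(simulate_failure: bool) -> Result<u32> {
    if simulate_failure {
        Err(Error { status: 2 })
    } else {
        Ok(512)
    }
}

/// The standard `map` does the job that `map_inner` did before.
pub fn sectors_to_bytes(sectors: u32, simulate_failure: bool) -> Result<u32> {
    block_size(simulate_failure).map(|size| size * sectors)
}
```

Callers would then use the ordinary unwrap, expect, and expect_err methods directly on the returned value.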
Thanks for the feedback!
Re Result, how about this:
pub trait ResultExt<Output, ErrData> {
    fn status(&self) -> Status;
    fn discard_errdata(self) -> Result<Output>;
    fn warning(&self) -> Option<ErrData>;
}
Instead of log_warning / ignore_warning we just have warning, in the same spirit as Result::ok and Result::err. This avoids needing an ErrData: Into<Output> constraint. Hopefully it's rare anyway for an application to want to handle a warning, but if they do they can do something like:
if let Some(err_data) = result.warning() {
    // Application can decide to log, match on specific warning in `err_data.status()`, etc.
}
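A self-contained sketch of how such a warning() accessor could behave, assuming warnings are carried in the Err variant. Status, Error, the Result alias, and from_status are simplified stand-ins written for this example, and the method returns Option<&Error> rather than Option<ErrData> so that no Clone bound is needed; that is a choice of this sketch, not part of the proposal.

```rust
/// Simplified stand-in for a UEFI status code (not the uefi-rs type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Status(pub usize);

impl Status {
    /// Error codes have the highest bit set; warnings are non-zero without it.
    const ERROR_BIT: usize = 1 << (usize::BITS - 1);
    pub const SUCCESS: Status = Status(0);
    pub const WARN_RESET_REQUIRED: Status = Status(7);
    pub const INVALID_PARAMETER: Status = Status(Self::ERROR_BIT | 2);

    pub fn is_warning(self) -> bool {
        self.0 != 0 && (self.0 & Self::ERROR_BIT) == 0
    }
}

/// Simplified stand-in for uefi::Error, without extra error data.
#[derive(Debug, PartialEq, Eq)]
pub struct Error {
    status: Status,
}

impl Error {
    pub fn status(&self) -> Status {
        self.status
    }
}

pub type Result<T = ()> = core::result::Result<T, Error>;

/// Hypothetical conversion: only SUCCESS maps to Ok.
pub fn from_status(status: Status) -> Result {
    if status == Status::SUCCESS {
        Ok(())
    } else {
        Err(Error { status })
    }
}

pub trait ResultExt<T> {
    fn status(&self) -> Status;
    /// Some(err) only when the result is an Err that holds a warning status.
    fn warning(&self) -> Option<&Error>;
}

impl<T> ResultExt<T> for Result<T> {
    fn status(&self) -> Status {
        match self {
            Ok(_) => Status::SUCCESS,
            Err(err) => err.status(),
        }
    }

    fn warning(&self) -> Option<&Error> {
        match self {
            Err(err) if err.status().is_warning() => Some(err),
            _ => None,
        }
    }
}
```

In this model a genuine error such as INVALID_PARAMETER yields None from warning(), so the application still has to handle the Err separately.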
w.r.t. the warning() method, would it make more sense to have a handle_warning method?
pub trait ResultExt<Output, ErrData> {
    fn handle_warning<F>(self, f: F) -> Result<Output, ErrData>
    where
        F: FnOnce(Error<ErrData>) -> Result<Output, ErrData>;
}
where, if the result contains a warning, f determines whether we get Ok or Err in that case.
For example if we had the following function:
pub fn set_variable(data: &[u8]) -> uefi::Result
and we wanted to log but continue if we got a specific warning, we could then:
// Inside some function that returns a Result
set_variable(&data).handle_warning(|err| {
    if err.status() != Status::WARN_RESET_REQUIRED {
        return Err(err); // Propagate error
    }
    // log something about the warning
    Ok(()) // Don't return an error
})?;
and if we wanted to ignore all warnings we could:
set_variable(&data).handle_warning(|_| Ok(()))?;
basically this would be a warning-specific version of Result::or_else.
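To make the or_else comparison concrete, here is a minimal sketch of one way handle_warning could be implemented. It assumes warnings arrive in the Err variant; Status, Error, the Result alias, and the set_variable / store functions (which take a simulated status instead of real data) are hypothetical stand-ins for this example only.

```rust
/// Simplified stand-in for a UEFI status code (not the uefi-rs type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Status(pub usize);

impl Status {
    /// Error codes have the highest bit set; warnings are non-zero without it.
    const ERROR_BIT: usize = 1 << (usize::BITS - 1);
    pub const SUCCESS: Status = Status(0);
    pub const WARN_DELETE_FAILURE: Status = Status(2);
    pub const WARN_RESET_REQUIRED: Status = Status(7);
    pub const INVALID_PARAMETER: Status = Status(Self::ERROR_BIT | 2);

    pub fn is_warning(self) -> bool {
        self.0 != 0 && (self.0 & Self::ERROR_BIT) == 0
    }
}

/// Simplified stand-in for uefi::Error, without extra error data.
#[derive(Debug, PartialEq, Eq)]
pub struct Error {
    status: Status,
}

impl Error {
    pub fn status(&self) -> Status {
        self.status
    }
}

pub type Result<T = ()> = core::result::Result<T, Error>;

pub trait ResultExt<T> {
    /// Like `Result::or_else`, but `f` only runs when the Err holds a warning.
    fn handle_warning<F>(self, f: F) -> Result<T>
    where
        F: FnOnce(Error) -> Result<T>;
}

impl<T> ResultExt<T> for Result<T> {
    fn handle_warning<F>(self, f: F) -> Result<T>
    where
        F: FnOnce(Error) -> Result<T>,
    {
        match self {
            Err(err) if err.status().is_warning() => f(err),
            // Ok values and genuine errors pass through untouched.
            other => other,
        }
    }
}

/// Hypothetical call that reports whatever status the test simulates.
pub fn set_variable(simulated: Status) -> Result {
    if simulated == Status::SUCCESS {
        Ok(())
    } else {
        Err(Error { status: simulated })
    }
}

/// Tolerates WARN_RESET_REQUIRED and propagates everything else.
pub fn store(simulated: Status) -> Result {
    set_variable(simulated).handle_warning(|err| {
        if err.status() != Status::WARN_RESET_REQUIRED {
            return Err(err);
        }
        Ok(())
    })?;
    Ok(())
}
```

Taking self by value lets the closure receive the Error and hand it back unchanged when it decides to propagate.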
Also, I noticed that the uefi::Error type is not public for this crate. It's public in the result module, but that module is private and only exports {Completion, Result, ResultExt, Status};.
Is this intentional or a bug (or am I missing something)?
I think handle_warning sounds like a good idea.
Re. Error, yeah I think that's an oversight. I'll put up a PR to make that public.
Closing this since we've agreed to accept this change and it got implemented in #361.
Summary
Remove the Completion type and change uefi::Result's type as follows:
Background
The UEFI spec uses the EFI_STATUS enum as the return value for most functions. A status can represent one of three things: success, an error, or a warning. The spec currently defines seven warnings. Two more warnings are defined in the platform initialization specification which should not be directly relevant to uefi-rs, and additional warnings could be implemented by OEMs, though I am not sure if any actually do.
Here are the currently-defined UEFI warnings from Appendix D:
Details
The current implementation of uefi::Result faithfully translates EFI_STATUS into a very explicit Rust API. It makes rigorous use of the Rust type system to allow very explicit handling of errors and warnings. However, in practice I think this API is difficult to use well. It's different enough from normal Rust error handling that I always find myself stumbling a bit when figuring out how to use it internally in uefi-rs while implementing wrappers for UEFI functions. And when using the library from an application I tend to just call log_warning()? everywhere, which means I'm not doing anything to really handle warnings.
Since uefi::Result is used pretty much everywhere, it would be great if we could simplify its usage, and I think we can do that while actually making the library more robust.
It turns out that warnings are not used very often in the spec. That means while we are paying the mental cost of handling warnings everywhere, they are not actually expected to ever occur outside of a very limited set of functions. The current API theoretically encourages applications to check at each callsite whether they care about warnings, but in reality the answer is almost always "no" simply by virtue of the fact that most functions never return warnings.
At a high level, my suggestion is that by default we should treat all warnings as errors, and only consider special handling of warnings on a case-by-case basis in functions that wrap UEFI functions where the UEFI spec explicitly mentions a warning might be returned. That means that uefi::Result::Ok can usually be treated as EFI_SUCCESS and hence the Completion type can be dropped.
Here's a breakdown of each individual warning showing where it's explicitly referenced in the spec and how I think we should handle it:
1. EFI_WARN_UNKNOWN_GLYPH is used in two places:
   a. EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.OutputString(): I think this is the rare case where a warning truly is just a warning. It's akin to String::from_utf8_lossy, and I think we could handle it in a similar way: provide both output_string and output_string_lossy methods for Output.
   b. EFI_HII_FONT_PROTOCOL.GetGlyph(): If this function returns the glyph warning it indicates that the glyph isn't known and info about the 0xFFFD Unicode character has been returned instead. This would be better represented as an error.
2. EFI_WARN_DELETE_FAILURE is used by EFI_FILE_PROTOCOL.Delete() to indicate that the file wasn't actually deleted, instead the handle was just closed. Yikes! In any normal Rust API (really any normal C API too) that case should be treated as an error.
3. EFI_WARN_WRITE_FAILURE is not actually referenced anywhere in the spec, but from the description it sure sounds like an opportunity for accidental data loss, so it should be treated as an error.
4. EFI_WARN_BUFFER_TOO_SMALL is confusingly not the same as EFI_BUFFER_TOO_SMALL. The warning is used by EFI_STORAGE_SECURITY_COMMAND_PROTOCOL.ReceiveData() and indicates that some data was written to the buffer, but it wasn't big enough for all of it. That should be treated as an error.
5. EFI_WARN_STALE_DATA is used by the key management service protocol. I'm not very familiar with this part of the spec, but it seems to indicate that a key was used successfully but actually should be replaced. Yikes, that sure sounds like it should be treated as an error.
6. EFI_WARN_FILE_SYSTEM is used by EFI_LOAD_FILE_PROTOCOL.LoadFile() specifically in the case of a network boot of an image containing a UEFI file system instead of a UEFI executable. This seems reasonable to treat as a warning rather than an error, so the Ok portion of the result could include a boolean or enum to indicate this situation.
7. EFI_WARN_RESET_REQUIRED is used by SetVariable() to indicate that the secure boot setting is transitioning to a less restrictive mode and the firmware requires a reset. This is a very specific and rare case, and it would probably be reasonable just to document that this error can occur and expect that an application dealing with secure boot transitions will handle it appropriately.
Concerns
Unknown warnings
What if there are more warnings than we expect? Firmware vendors could do all sorts of weird things, so in theory an existing application that currently treats warnings as non-fatal could start failing on some devices with this change. I don't think there's much we can do to anticipate that, but I could imagine getting a vendor-specific bug or two about this.
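A minimal sketch of why an unknown vendor warning would start failing under this proposal. It relies on the UEFI convention that error codes have the highest bit set, so any other non-zero value is a warning. The Status and Kind types and the into_result method are stand-ins written for this example, and the value 0x1234 is a made-up vendor code.

```rust
/// Simplified stand-in for a UEFI status code (not the uefi-rs type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Status(pub usize);

#[derive(Debug, PartialEq, Eq)]
pub enum Kind {
    Success,
    Warning,
    Error,
}

impl Status {
    /// Error codes have the highest bit set.
    const ERROR_BIT: usize = 1 << (usize::BITS - 1);
    pub const SUCCESS: Status = Status(0);
    pub const WARN_UNKNOWN_GLYPH: Status = Status(1);
    pub const BUFFER_TOO_SMALL: Status = Status(Self::ERROR_BIT | 5);

    /// Classifies any value, including ones the library has never heard of.
    pub fn kind(self) -> Kind {
        if self.0 == 0 {
            Kind::Success
        } else if (self.0 & Self::ERROR_BIT) != 0 {
            Kind::Error
        } else {
            Kind::Warning
        }
    }

    /// Proposed default in this sketch: everything except SUCCESS is an Err.
    pub fn into_result(self) -> Result<(), Status> {
        if self == Status::SUCCESS {
            Ok(())
        } else {
            Err(self)
        }
    }
}
```

With this default, Status(0x1234) classifies as a warning but still converts to Err, which is exactly the case where an application that previously tolerated warnings could begin to fail on that vendor's firmware.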
Churn
This would be a big change to the API that will almost certainly affect every user of the library. If we make this change we should be sure to document it well, perhaps by starting to maintain a CHANGELOG.md file.