Closed mina86 closed 12 months ago
The crux of the change is in borsh/src/lib.rs. The rest is just s/__private::maybestd/__private/
.
We don't want to turn borsh-rs into maybestd provider.
I mean, borsh devs should have thought about that when they decided to use io::Read, io::Write and io::Error as part of borsh’s public API. For comparison, serde does offer StdError which switches between std::io::Error and its own trait depending on Cargo features. If you don’t want to offer maybestd you have an option of switching the interface to not use std.
Not offering maybestd is not only hostile to users of borsh but also to all users of no_std-compatible libraries which use borsh.
consider using use borsh::__private::maybestd; - it is still public, but from a borsh crate point of view, it is a private implementation that we don't want to provide backward compatibility guaranties.
But you are providing compatibility guarantees to borsh::nostd_io
, correct? So you’re already paying 99% of the cost of supporting borsh::maybestd
. Adding borsh::maybestd::io
as re-export of borsh::nostd_io
or parts of std::io
is a rather tame requirement.
But you are providing compatibility guarantees to borsh::nostd_io, correct?
That scope is small and self-contained in a single file, and that is indeed part of the public interface, as you stated. maybestd
does not need to be part of the public interface, hence the difference.
Not offering maybestd is not only hostile to users of borsh but also to all users of no_std-compatible libraries which use borsh.
no_std-compatible libraries have two valid options:
borsh::__private::maybestd
wherever applicable (borsh-derive does that), knowing that they need to pay attention to __private
changesmaybestd
implementation themselves, as you said, it will be just a re-exportWe optimize borsh for application developers, and we don't want them to abuse borsh::maybestd
as they will later come and request more re-exports there.
That scope is small and self-contained in a single file,
Just like what this PR proposes:
pub mod maybestd {
pub mod io {
#[cfg(feature = "std")]
pub use std::io::{Error, ErrorKind, Read, Result, Write};
#[cfg(not(feature = "std"))]
pub use crate::nostd_io::*;
}
}
Literally five symbols which are part of borsh’s public API. If you don’t like the name maybestd
I’d also be happy with:
#[cfg(feature = "std")]
pub mod io {
pub use std::io::{Error, ErrorKind, Read, Result, Write};
}
#[cfg(not(feature = "std"))]
pub use crate::nostd_io as io;
That would offer types borsh::io::Read
etc.
Treat themselves as "friend" modules (in C++ terms), and use borsh::private::maybestd wherever applicable (borsh-derive does that), knowing that they need to pay attention to private changes
What’s the point of borsh::nostd_io
then? At the moment it’s a foot-gun breaking people’s builds. Consider the following pascal-str
crate:
[package]
name = "pascal-str"
version = "0.1.0"
edition = "2021"
[dependencies]
borsh = { version = "1.0.0-alpha.4", default-features = false }
#![no_std]
extern crate alloc;
struct PascalString(alloc::vec::Vec<u8>);
impl borsh::BorshSerialize for PascalString {
fn serialize<W: borsh::nostd_io::Write>(&self, writer: &mut W) -> borsh::nostd_io::Result<()> {
writer.write_all(&[self.0.len() as u8])?;
writer.write_all(&self.0)
}
}
It compiles and works fine. But let’s now say someone uses it in std application like so:
[package]
name = "foo"
version = "0.1.0"
edition = "2021"
[dependencies]
borsh = "1.0.0-alpha.4"
pascal-str = { path = "../pascal-str" }
And suddenly the code breaks:
error[E0433]: failed to resolve: could not find `nostd_io` in `borsh`
--> /tmp/b/pascal-str/src/lib.rs:7:28
|
7 | fn serialize<W: borsh::nostd_io::Write>(&self, writer: &mut W) -> borsh::nostd_io::Result<()> {
| ^^^^^^^^ could not find `nostd_io` in `borsh`
error[E0433]: failed to resolve: could not find `nostd_io` in `borsh`
--> /tmp/b/pascal-str/src/lib.rs:7:78
|
7 | fn serialize<W: borsh::nostd_io::Write>(&self, writer: &mut W) -> borsh::nostd_io::Result<()> {
| ^^^^^^^^ could not find `nostd_io` in `borsh`
Enabling a cargo feature shouldn’t break builds.
We optimize borsh for application developer
Except you don’t. Because now author of pascal-str
has to introduce a std
feature to their crate and application developer has to enable it. This is an objectively worse experience for application developer than what they had with borsh 0.10.
Write a minimal maybestd implementation themselves, as you said, it will be just a re-export
But as shown, it’s not just a matter of a simple maybestd
module with re-exports. I have no way to refer to Write
, Read
, Error
and ErrorKind
symbols used by borsh. I have to have user of my crate tell me how borsh was compiled.
and we don't want them to abuse borsh::maybestd as they will later come and request more re-exports there.
If they do, tell them to GTFO. However, if something is part of borsh’s public API that should be provided by borsh in a way which doesn’t depend on whether std
feature is enabled or not.
Because now author of
pascal-str
has to introduce astd
feature to their crate and application developer has to enable it. This is an objectively worse experience for application developer than what they had with borsh 0.10.
Well, pascal-str
might want to have std
feature anyway to enable it in borsh
, but I see your point now.
But as shown, it’s not just a matter of a simple
maybestd
module with re-exports. I have no way to refer toWrite
,Read
,Error
andErrorKind
symbols used by borsh. I have to have user of my crate tell me how borsh was compiled.
This is a fair point. @mina86 @dj8yfo @matklad I would like to consider having borsh::maybestd_io
or borsh::io
module to be part of the public API. Still, I don't want to expose the whole maybestd
. What do you think?
Well, pascal-str might want to have std feature anyway to enable it in borsh...
Agree.
I have no way to refer to Write, Read, Error and ErrorKind symbols used by borsh.
Using this recipe https://github.com/near/borsh-rs/blob/09c9295000ced1933e067d094a3cd8fe5507044d/borsh/tests/test_ser_de_with.rs#L22-L26 should help. Doing the dispatch on the client lib side is more explicit about what's happening and i like it more.
like to consider having borsh::maybestd_io or borsh::io module to be part of the public API
This might result in people mindlessly using borsh::io
instead of std::io
even in the spots completely unrelated to borsh (IDE completion somehow adversely affects minds of folks).
Here's an example https://github.com/near/nearcore/pull/9432/files#diff-5610719ce9d54675a9153a04627f8988705312f7dec97375fa702283ed796c1aL4 and there're a few more.
This pr has most likely protected from cases when items unrelated to borsh::io
might be used by only reexporting the relevant ones https://github.com/mina86/borsh-rs/blob/2e942c8c88611804db699237d6b0d976da40d3c4/borsh/src/lib.rs#L127-L130 .
But it doesn't protect from cases when borsh::io
is used in wrong context, where std::io
was meant instead.
Supposedly, the recipe above addresses this too.
We've already adjusted the plan of #51 with making schema
-> unstable__schema
due to additions by @mina86.
Hopefully, he can sustain some backpressure in client libraries too.
std
feature is toggled. This is a bug.Read
, Write
, Result
, Error
and ErrorKind
types used by borsh thus making life of library developers harder. This is bad API design and user-hostile.std
feature which makes life of application developers harder since they now have to enable that feature on libraries which otherwise wouldn’t have it. This is user-hostile and contradicts frol’s statement that borsh is optimised for application developers.The fact that pascal-str
might want a std
feature anyway is not an argument. First of all, there are many possible crates which won’t have std
feature. Specifically the hypothetical pascal-str
crate would probably not have that feature since it would be a wrapper around alloc::string::String
. Second of all, there’s a bug in borsh. I’ve presented steps to reproduce it. And your answer to it: well, just don’t do it. That’s not a valid response to a bug report.
Unless there’s some new convincing argument, I must conclude that borsh 1.x is in state of perpetual brokenness poorly suited for no_std environments and that I should stick to borsh 0.10.
Well, i've found an example pascal_string crate/repo. It doesn't have no_std
at all.
Can you post a link to the problematic repo?
Your example
#![no_std]
extern crate alloc;
struct PascalString(alloc::vec::Vec<u8>);
impl borsh::BorshSerialize for PascalString {
fn serialize<W: borsh::nostd_io::Write>(&self, writer: &mut W) -> borsh::nostd_io::Result<()> {
writer.write_all(&[self.0.len() as u8])?;
writer.write_all(&self.0)
}
}
attempts to assert, that the pascal-str
doesn't use anything out of std
, that's why it can work in both std
and no_std
.
But let’s now say someone uses it in std application like so
Is it supposed to still be using borsh::nostd_io
in an std
application?
If so, it can be helped by removing just this line https://github.com/near/borsh-rs/blob/09c9295000ced1933e067d094a3cd8fe5507044d/borsh/src/lib.rs#L100 , then borsh
will still use either std::io
or self::nostd_io
depending on flag, but provide borsh::nostd_io
outside in both situations.
But then, what is the reason to support this, if pascal-str
won't be compatible with other types from borsh
in an std
application?
Imo, this is what we need (haven't checked this specific code, might need some adjustments):
#[cfg(feature = "std")] use std::io as io_impl;
#[cfg(not(feature = "std"))] mod nostd_io;
#[cfg(not(feature = "std"))] use nostd_io as io_impl;
pub mod io {
pub use crate::io_impl::{Error, ErrorKind, Read, Result, Write};
}
That is:
borsh::io
feels like the right name here. This is not the general "maybestd", io::Error
is the only thing we need here (and we should commit to not needing anything else from the std)pub use
feels like a good way to do that.borsh::io
should be absolutely minimal and strictly restricted to what's needed by borsh, or whether it should be a duck-type replacement for std::io
. Eg, it unclear whether borsh::io
should contain a version of std::io::copy
-- some user code might want to call borsh::io::copy
and have that work in std and non-std. I think it makes sense to expose what's already exposed. That is, neither to be absolutely minimal, nor to try to expose everything. core::io
feels like a common problem which should have solutions in the ecosystem, but quick googling didn't reveal any existing patterns. I think it is plausible that in the distant future, std::io::Error
is moved to core, or some canonical core_io
crate emerges. We should be able to take advantage of those by adding a new feature flag, so there's no compatibility hazard here.
In terms of diff for the current PR, rename&flatten borsh::maybestd::io
to just borsh::io
.
The only downside of using borsh::io
is that inattentive folks can randomly use items from this completion instead of std::io
.
@mina86 has so far ignored the question to the pascal-str
example, which was the cornerstone reason, backing the push for the change done by this pr:
@mina86 has so far ignored the https://github.com/near/borsh-rs/pull/212#issuecomment-1712513408 to the pascal-str example, which was the cornerstone reason, backing the push for the change done by this pr:
Because it wasn’t a cornerstone reason. It was a quick example I came up with on the spot. The cornerstone reason was that there was no consistent way to name types used in the API which in turn resulted in build breaking when std
feature was enabled.
I don’t care if it’s borsh::io
, borsh::maybestd::io
or even borsh::byćmożestd::io
. So long as the same path works regardless of what features are enabled. And once that name is provided, pascal-str
is supposed to use that.
@mina86 well, alternatively, u can use 4 lines to dispatch on an std
feature, or byćmożestd
feature or whatever.
#[cfg(feature = "std")]
use std::io;
#[cfg(not(feature = "std"))]
use borsh::nostd_io as io;
It's explicit and prevents you from importing borsh::io
all over the place, whether you need it or not.
std
is important enough to be worth a line in Cargo.toml
, and, as far as i understood, the pascal-str
did intend to use two different providers of io
, while using borsh
, depending on whether it's std
or not.
5 lines in client with explicit stating of what's used when and what's not vs 500 lines change in the subj repo is a good tradeoff imo.
std is important enough to be worth a line in Cargo.toml
No, it’s not. I’m writing a no_std library. I don’t care about std. I wrote code which works with borsh with default features disabled. And then, when someone uses my library, the code breaks because borsh’s std feature was enabled. That a bug in borsh.
as far as i understood, the pascal-str did intend to use two different providers of io, while using borsh, depending on whether it's std or not.
Yes, depending on whether borsh
is std or not. And since borsh changes types depending on whether it’s std or not, it’s borsh’ responsibility to provide a stable API for types it uses.
Imagine borsh had a uk
feature and when it’s enabled rather than BorshSerialise
and BorshDeserialise
it would define BorshSerialize
and BorshDeserialize
. That would be extremely bad API. Situation with Read
, Write
etc. is the same.
5 lines in client with explicit stating of what's used when and what's not vs 500 lines change in the subj repo is a good tradeoff imo.
I can make this a ‘5 line’ change in borsh if you prefer:
+#[cfg(feature = "std")]
+pub mod io {
+ pub use std::io::{Read, Write, Error, ErrorKind, Result};
+}
+
+#[cfg(not(feature = "std"))]
+pub use nostd_io as io;
I made it 500 to simplify other things in the code. This isn’t a valid argument.
Not to mention that even if this is a 5 lines in client library, it’s 5 lines in every library and additional changes in every crate that transitively depends on it.
When built with
std
feature, borsh uses a handful of std::io types in its public API (e.g. std::io::Write is used in BorshSerialize trait). Whenstd
feature is disabled, borsh switches to its own types which mimics behaviour of standard types offered throughborsh::nostd_io
.Problem is that this approach results in no consistent way to name
Write
,Read
etc. symbols used in the public API. This creates a problem for authors and users of no_std libraries. A no_std library might have code such as:So long as borsh is built with default features disabled it will work correctly. However, if author of a std application enables borsh’s std feature, the aforementioned example crate will fail to build since borsh::nostd_io will no longer be offered.
Address this by introducing
borsh::io
module which exports Error, ErrorKind, Read, Result and Write symbols, i.e. the types which are used in borsh’s public API.With this change, author of a no_std library may write:
and their code will work without problems when used in std applications.