Closed xamgore closed 4 months ago
This is neat, but I'm not sure it's appropriate for this crate as the translation isn't zero cost. It's also unclear to me why we'd assume the bytes are a utf8 string.
How about splitting into safe and unsafe versions? Or give it a clear name like FmtToUtf8IoAdapter.
Ok, I've thought about it some more and I believe this approach is unfortunately fundamentally broken. I assume the use case (but please correct me if I'm wrong) is reading from a file of some kind and sending that data over to fmt, maybe like this: io::copy(read_stream, fmt_adapter). For ASCII this will work fine, as you can splice the bytes however you like. With UTF-8, however, if the write was chopped in the middle of a character, then suddenly your string is no longer valid even though the source was. In general, io::Write is a very poor interface for structured data like this; you need buffering to be correct.
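The chopping problem can be reproduced in a few lines. This is only an illustrative sketch: chunk_validity is a hypothetical helper (not part of this crate) that splits a string's bytes into fixed-size chunks, roughly the way a copy loop might hand them to write, and checks each chunk with str::from_utf8 on its own.

```rust
use std::str;

/// Hypothetical helper: splits `input` into `chunk_size`-byte chunks and
/// reports whether each chunk is valid UTF-8 when looked at in isolation.
fn chunk_validity(input: &str, chunk_size: usize) -> Vec<bool> {
    input
        .as_bytes()
        .chunks(chunk_size)
        .map(|chunk| str::from_utf8(chunk).is_ok())
        .collect()
}

fn main() {
    // "é" is two bytes (0xC3 0xA9), so "aé" is three bytes in total.
    let text = "aé";

    // The whole string in one chunk is valid.
    assert_eq!(chunk_validity(text, 3), vec![true]);

    // Split after two bytes: the first chunk ends mid-character and the
    // second starts with a continuation byte, so neither is valid alone.
    assert_eq!(chunk_validity(text, 2), vec![false, false]);

    // Pure ASCII can be split anywhere.
    assert_eq!(chunk_validity("abc", 2), vec![true, true]);
}
```

The source string is valid throughout; only the chunk boundaries make the individual pieces invalid.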
Well, actually I am working with a 3rd-party XML library. It handles only UTF-8 strings and has two functions for extracting an XML string from an intermediate tree representation. Since I want to print the tree (Debug or Display) without allocating a String, the only option I have is to write an adapter that takes std::io::Write's input and sends it to a std::fmt::Formatter.
pub trait XmlWrite {
    fn to_writer<W: Write>(&self, writer: &mut XmlWriter<W>) -> XmlResult<()>;
    fn to_string(&self) -> XmlResult<String>;
}

pub struct XmlWriter<W: std::io::Write> {
    pub inner: W,
}
So the case is: you have some 3rd-party library with a not-so-good interface, and you want to build something with it like with LEGO blocks, without thinking too much or sending PRs and waiting for an eternity.
Got it, makes sense. That's a bummer about the library you're using, but one bad interface shouldn't cause us to add another bad one. :) As mentioned above, you'll need a buffer to handle characters that have been chopped in half which isn't appropriate for this crate. Thanks for the suggestion though!
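For reference, the buffering mentioned above could look roughly like the sketch below. It is not code from this crate or from the PR: Utf8Adapter, emit, and invalid are hypothetical names, and the sketch assumes the input is meant to be UTF-8 and treats anything else as an InvalidData error. It holds back an incomplete trailing character (at most 3 bytes) until the next write completes it.

```rust
use std::fmt;
use std::io;
use std::str;

/// Hypothetical adapter: accepts bytes via `io::Write` and forwards them as
/// text to any `fmt::Write`, holding back an incomplete trailing character.
pub struct Utf8Adapter<W: fmt::Write> {
    inner: W,
    pending: [u8; 4], // a UTF-8 character is at most 4 bytes long
    pending_len: usize,
}

fn invalid() -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, "invalid UTF-8")
}

impl<W: fmt::Write> Utf8Adapter<W> {
    pub fn new(inner: W) -> Self {
        Utf8Adapter { inner, pending: [0; 4], pending_len: 0 }
    }

    pub fn into_inner(self) -> W {
        self.inner
    }

    fn emit(&mut self, s: &str) -> io::Result<()> {
        fmt::Write::write_str(&mut self.inner, s)
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "formatter error"))
    }
}

impl<W: fmt::Write> io::Write for Utf8Adapter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let mut rest = buf;
        // First, finish a character left incomplete by the previous write.
        while self.pending_len > 0 && !rest.is_empty() {
            self.pending[self.pending_len] = rest[0];
            self.pending_len += 1;
            rest = &rest[1..];
            let (pending, len) = (self.pending, self.pending_len);
            match str::from_utf8(&pending[..len]) {
                Ok(s) => {
                    self.emit(s)?;
                    self.pending_len = 0;
                }
                // `error_len() == None` means "input ended mid-character".
                Err(e) if e.error_len().is_none() => {}
                Err(_) => return Err(invalid()),
            }
        }
        // Then forward the valid prefix and save an incomplete tail, if any.
        match str::from_utf8(rest) {
            Ok(s) => self.emit(s)?,
            Err(e) if e.error_len().is_none() => {
                let (valid, tail) = rest.split_at(e.valid_up_to());
                let s = str::from_utf8(valid).map_err(|_| invalid())?;
                self.emit(s)?;
                self.pending[..tail.len()].copy_from_slice(tail);
                self.pending_len = tail.len();
            }
            Err(_) => return Err(invalid()),
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing to flush; an incomplete tail stays buffered.
        Ok(())
    }
}

fn main() -> io::Result<()> {
    use std::io::Write;
    let mut adapter = Utf8Adapter::new(String::new());
    // "é" (0xC3 0xA9) is split across two writes.
    adapter.write_all(&[b'a', 0xC3])?;
    adapter.write_all(&[0xA9, b'b'])?;
    assert_eq!(adapter.into_inner(), "aéb");
    Ok(())
}
```

Even in this small form the adapter carries state, re-validates bytes, and has to decide what to do with a tail that is never completed, which illustrates why it is more than a thin wrapper.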
I've checked the std::io::copy implementation and got the chopping idea. If I rewrite the adapter with a 4-byte buffer, would that be an acceptable solution, or is it still too much? Something like
self.formatter.write_char(char::from_u32_unchecked(/* 4-byte buffer */))
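let s = match std::str::from_utf8(buf) {
    Ok(s) => s,
    // error_len() == None means the input ended in the middle of a character
    Err(e) if e.error_len().is_none() => {
        let (valid, rest) = buf.split_at(e.valid_up_to());
        /* save `rest`, which is always 1-3 bytes, for the next write */
        unsafe { std::str::from_utf8_unchecked(valid) }
    }
    // genuinely invalid UTF-8
    Err(_) => panic!(),
};
self.formatter.write_str(s).map_err(|_| io::ErrorKind::Other)?;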
PS. Seems like little/big endianness interferes. OK then, drop it 😄
Yeah, this just requires too many specific conditions to be met.
Hi, nice crate! How about adding the reverse transformation to complete the lib? Probably something like this: