SUPERCILEX / io-adapters

Adapters to convert between different writable APIs.
https://crates.io/crates/io-adapters
Apache License 2.0

fmt::Write <- io::Write adapter #1

Closed xamgore closed 4 months ago

xamgore commented 5 months ago

Hi, nice crate! How about the reverse transformation to complete the lib? Probably something like this:

use std::fmt::Formatter;
use std::io::{Error, ErrorKind, Result, Write};
use std::str::from_utf8;

/// Wraps a `Formatter` so it can be used where an `io::Write` is expected.
pub struct FmtToIoAdapter<'a, 'b> {
  formatter: &'a mut Formatter<'b>,
}

impl<'a, 'b> From<&'a mut Formatter<'b>> for FmtToIoAdapter<'a, 'b> {
  fn from(formatter: &'a mut Formatter<'b>) -> Self {
    FmtToIoAdapter { formatter }
  }
}

impl Write for FmtToIoAdapter<'_, '_> {
  fn write(&mut self, buf: &[u8]) -> Result<usize> {
    // Fails with `InvalidData` if `buf` is not valid UTF-8 on its own.
    let s = from_utf8(buf).map_err(|e| Error::new(ErrorKind::InvalidData, e))?;
    self.formatter.write_str(s).map_err(|e| Error::new(ErrorKind::Other, e))?;
    Ok(buf.len())
  }

  fn flush(&mut self) -> Result<()> {
    // Nothing is buffered, so there is nothing to flush.
    Ok(())
  }
}

SUPERCILEX commented 5 months ago

This is neat, but I'm not sure it's appropriate for this crate as the translation isn't zero cost. It's also unclear to me why we'd assume the bytes are a UTF-8 string.

xamgore commented 5 months ago

How about splitting into safe and unsafe versions? Or give it a clear name like FmtToUtf8IoAdapter.

SUPERCILEX commented 4 months ago

Ok, I've thought about it some more and I believe this approach is unfortunately fundamentally broken. I assume the use case (but please correct me if I'm wrong) is reading from a file of some kind and sending that data over to fmt, maybe like this: io::copy(read_stream, fmt_adapter). For ASCII this will work fine, as you can split the bytes anywhere you like. With UTF-8, however, if a write is cut off in the middle of a character, the chunk you receive is no longer a valid string even though the source was. In general, io::Write is a very poor interface for structured data like this: you need buffering to be correct.
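
To illustrate the point about chopped characters, here is a minimal sketch (the helper `halves_valid` is a made-up name for this example, not part of the crate). It splits a valid UTF-8 string at a byte index and checks whether each half still validates on its own:

```rust
use std::str::from_utf8;

/// Returns whether each half of `s`, split at byte index `at`,
/// is valid UTF-8 on its own. Hypothetical helper for illustration.
fn halves_valid(s: &str, at: usize) -> (bool, bool) {
    let (head, tail) = s.as_bytes().split_at(at);
    (from_utf8(head).is_ok(), from_utf8(tail).is_ok())
}

fn main() {
    // "é" is encoded as two bytes (0xC3 0xA9), so "héllo" is six bytes long.
    let s = "héllo";

    // Splitting on a character boundary keeps both halves valid.
    assert_eq!(halves_valid(s, 1), (true, true));

    // Splitting inside "é" invalidates both halves, although `s` itself is valid.
    assert_eq!(halves_valid(s, 2), (false, false));
}
```

An adapter that validates each `write` call in isolation would reject the second case, which is exactly what a chunked copy can produce.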

xamgore commented 4 months ago

Well, actually I am working with a 3rd-party XML library. It handles only UTF-8 strings, and has two functions for extracting an XML string from an intermediate tree representation. As I want to print the tree (Debug or Display) without allocating a String, the only option I have is to write an adapter that takes std::io::Write's input and sends it to a std::fmt::Formatter.

pub trait XmlWrite {
    fn to_writer<W: Write>(&self, writer: &mut XmlWriter<W>) -> XmlResult<()>;
    fn to_string(&self) -> XmlResult<String>;
}

pub struct XmlWriter<W: std::io::Write> {
    pub inner: W,
}

So the case is: you have some 3rd-party library with a not-so-good interface, and you want to build something with it just like with LEGO blocks, without thinking too much or sending PRs and waiting for an eternity.
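
For context, the use case described here might look roughly like the sketch below. `Node` and its `to_writer` are hypothetical stand-ins for the XML library's types (the real `XmlWrite` signature differs), and the adapter is a trimmed copy of the one proposed at the top of the thread. It only works as long as every `write` call receives whole characters:

```rust
use std::fmt;
use std::io::{self, Write};
use std::str::from_utf8;

/// Trimmed copy of the adapter proposed earlier in this thread.
struct FmtToIoAdapter<'a, 'b> {
    formatter: &'a mut fmt::Formatter<'b>,
}

impl Write for FmtToIoAdapter<'_, '_> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let s = from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        self.formatter
            .write_str(s)
            .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Hypothetical stand-in for a tree node from the XML library.
struct Node {
    name: &'static str,
}

impl Node {
    /// Hypothetical stand-in for the library's `to_writer`.
    fn to_writer<W: Write>(&self, writer: &mut W) -> io::Result<()> {
        write!(writer, "<{}/>", self.name)
    }
}

impl fmt::Display for Node {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Route the library's io::Write output into the formatter.
        let mut adapter = FmtToIoAdapter { formatter: f };
        self.to_writer(&mut adapter).map_err(|_| fmt::Error)
    }
}

fn main() {
    let node = Node { name: "root" };
    assert_eq!(node.to_string(), "<root/>");
}
```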

SUPERCILEX commented 4 months ago

Got it, makes sense. That's a bummer about the library you're using, but one bad interface shouldn't cause us to add another bad one. :) As mentioned above, you'll need a buffer to handle characters that have been chopped in half, which isn't appropriate for this crate. Thanks for the suggestion though!

xamgore commented 4 months ago

I've checked the std::io::copy implementation and got the chopping idea. If I rewrite the adapter with a 4-byte buffer, would that be an acceptable solution, or is it still too much? Something like:

// flush the character completed from the previous call (pseudocode)
self.formatter.write_char(char::from_u32_unchecked(/* 4-byte buffer */));
let s = match from_utf8(buf) {
    Ok(s) => s,
    // input ends mid-character: emit only the valid prefix
    Err(e) if e.error_len().is_none() => unsafe { from_utf8_unchecked(&buf[..e.valid_up_to()]) },
                                 /* save buffer buf[e.valid_up_to()..] which is always 1-3 bytes */
    Err(_) => panic!(),
};
self.formatter.write_str(s);

PS. Seems like little/big endianness interferes. OK then, drop it 😄
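
For reference, the carry-over buffer idea can be written without `unsafe` and without `char::from_u32_unchecked`. The following is only a sketch under stated assumptions: `BufferedFmtAdapter` and its fields are hypothetical names, it is generic over any `fmt::Write` sink (which includes `&mut Formatter`), and it validates the accepted prefix twice for simplicity. The saved bytes are kept as raw UTF-8 and re-validated once the next call completes them, so byte order does not come into play:

```rust
use std::fmt;
use std::io::{self, Write};
use std::str::from_utf8;

/// Hypothetical adapter: forwards bytes to a `fmt::Write` sink and keeps an
/// incomplete trailing character in a fixed 4-byte buffer between calls.
struct BufferedFmtAdapter<W: fmt::Write> {
    inner: W,
    pending: [u8; 4],
    pending_len: usize,
}

impl<W: fmt::Write> BufferedFmtAdapter<W> {
    fn new(inner: W) -> Self {
        Self { inner, pending: [0; 4], pending_len: 0 }
    }

    fn emit(inner: &mut W, s: &str) -> io::Result<()> {
        fmt::Write::write_str(inner, s).map_err(|e| io::Error::new(io::ErrorKind::Other, e))
    }
}

impl<W: fmt::Write> Write for BufferedFmtAdapter<W> {
    fn write(&mut self, mut buf: &[u8]) -> io::Result<usize> {
        let total = buf.len();

        // Complete a character left over from the previous call, byte by byte.
        while self.pending_len > 0 && !buf.is_empty() {
            self.pending[self.pending_len] = buf[0];
            self.pending_len += 1;
            buf = &buf[1..];
            match from_utf8(&self.pending[..self.pending_len]) {
                Ok(s) => {
                    Self::emit(&mut self.inner, s)?;
                    self.pending_len = 0;
                }
                // `error_len() == None`: still incomplete, keep collecting.
                Err(e) if e.error_len().is_none() => {}
                Err(e) => return Err(io::Error::new(io::ErrorKind::InvalidData, e)),
            }
        }
        if buf.is_empty() {
            return Ok(total);
        }

        // Forward the longest valid prefix; tolerate only a truncated tail.
        let valid_len = match from_utf8(buf) {
            Ok(_) => buf.len(),
            Err(e) if e.error_len().is_none() => e.valid_up_to(),
            Err(e) => return Err(io::Error::new(io::ErrorKind::InvalidData, e)),
        };
        let (valid, rest) = buf.split_at(valid_len);
        let s = from_utf8(valid).expect("prefix was validated above");
        Self::emit(&mut self.inner, s)?;

        // `rest` is an incomplete character, so it is at most 3 bytes long.
        self.pending[..rest.len()].copy_from_slice(rest);
        self.pending_len = rest.len();
        Ok(total)
    }

    fn flush(&mut self) -> io::Result<()> {
        // Leftover bytes at flush time mean the stream ended mid-character.
        if self.pending_len == 0 {
            Ok(())
        } else {
            Err(io::Error::new(io::ErrorKind::InvalidData, "incomplete UTF-8 sequence"))
        }
    }
}

fn main() {
    let mut adapter = BufferedFmtAdapter::new(String::new());
    // Feed one byte at a time so every multi-byte character gets split.
    for b in "héllo €".as_bytes() {
        adapter.write_all(std::slice::from_ref(b)).unwrap();
    }
    adapter.flush().unwrap();
    assert_eq!(adapter.inner, "héllo €");
}
```

This still carries extra state and a second validation pass, so it is not a zero-cost translation.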

SUPERCILEX commented 4 months ago

Yeah, this just requires too many specific conditions to be met.