BurntSushi / encoding_rs_io

Streaming I/O adapters for the encoding_rs crate.
Other
25 stars 6 forks source link

Implement BufRead #8

Closed dralley closed 2 years ago

dralley commented 2 years ago

I might be missing something, but it seems like DecodeReaderBytes contains an internal buffer for transcoded bytes

    /// The internal buffer to store transcoded bytes before they are read by callers.
    buf: B,

as well as a position in that buffer where reads start from, as well as an end of the transcoded bytes

    /// The current position in `buf`. Subsequent reads start here.
    pos: usize,
    /// The number of transcoded bytes in `buf`. Subsequent reads end here.
    buflen: usize,

Shouldn't it be possible to trivially implement BufRead on top of that without a needing external buffer (and hence, extra copying)?

dralley commented 2 years ago

I think those comments are incorrect. It appears that the buf within the struct itself does not, in fact, store transcoded bytes, it just stores bytes straight from the file. Decoding / transcoding happens in the Read implementation when data is copied into the user-provided buffer.

However, despite not providing a BufRead interface, am I correct in saying that it fulfills one of the primary purposes of using a BufReader, which is avoiding the syscall overhead of repeated reads? So if you don't need the BufRead interface, you don't need to wrap a BufReader around the file?

BurntSushi commented 2 years ago

You're correct that this cannot implement BufRead on its own.

I believe you are also correct that the way this library is implemented, in generally means you don't need to pass a buffered reader to a DecodeReaderBytes. You might want to wrap a DecodeReaderBytes itself in a BufReader though, to avoid the overhead of potentially calling the transcoding function a bunch of times on small inputs.

Some or all of this probably should be added to the docs.

dralley commented 2 years ago

Addressing the documentation in my open PR.