Open dimo414 opened 2 years ago
HI @dimo414! Thanks for making an issue! I think that the Example in the README does actually do what you want. It looks like .next_batch()
from linereader returns a slice that is guaranteed end with a newline. the LineBufferReader.fill()
method fills the a buffer and does the same thing. So LineBufferReader.buffer()
will return the same thing as next_buffer()
.
Does that answer the question?
(annotated example)
use grep_cli::stdout;
use ripline::{
line_buffer::{LineBufferBuilder, LineBufferReader},
lines::LineIter,
LineTerminator,
};
use std::{env, error::Error, fs::File, io::Write, path::PathBuf};
use termcolor::ColorChoice;
fn main() -> Result<(), Box<dyn Error>> {
let path = PathBuf::from(env::args().nth(1).expect("Failed to provide input file"));
let mut out = stdout(ColorChoice::Never);
let reader = File::open(&path)?;
let terminator = LineTerminator::byte(b'\n');
let mut line_buffer = LineBufferBuilder::new().build();
let mut lb_reader = LineBufferReader::new(reader, &mut line_buffer);
while lb_reader.fill()? {
// It's the lb_reader.buffer() here that returns &[u8]. LineIter is just an optimized zero-copy iterator for iterating over those lines. You could use the returned buffer in the same way you were using the buffer from `next_batch`
let lines = LineIter::new(terminator.as_byte(), lb_reader.buffer());
for line in lines {
out.write_all(line)?;
}
lb_reader.consume_all();
}
Ok(())
}
Thanks for the help! I've migrated over to ripline and LineBufferReader
is working as you describe :) I was confused because the example uses a LineTerminator
that is only referenced by LineIter
, so it wasn't clear that the LineBufferReader
was similarly EOL-aware.
Some thoughts:
LineTerminator
in the example is distracting, since it's basically just holding onto the b'\n'
until it gets passed directly to LineIter
(vs. passing in terminator
itself). It would be better to use it in both places or neither.
LineIter
instances should be provided by the LineBufferReader
so that they can share the same terminator, rather than risking them being out of sync.LineTerminator
type is minimally-used in the library itself, it only shows up in ripline::lines::without_terminator()
- maybe it would be cleaner to just get rid of that type?LineBufferReader.fill()
are somewhat confusing, since it sounds like false
is returned as soon as EOF is reached in the underlying source, but actually it returns false
only once the buffer is fully drained.fill()
and consume_all()
; linereader does both operations via next_batch()
/next_line()
.
I'd like to swap from linereader to this library due to the silently capped line lengths issue you mention in the README, but it's not obvious to me how to replicate that function using the APIs in ripline. It'd be helpful if you could add an example of processing batches of lines at a time. Thanks!