sstadick / ripline

Fast by-line reader from ripgrep
The Unlicense

Add example of swapping from linereader's next_batch() method #1

Open dimo414 opened 2 years ago

dimo414 commented 2 years ago

I'd like to swap from linereader to this library due to the silently capped line lengths issue you mention in the README, but it's not obvious to me how to replicate that function using the APIs in ripline. It'd be helpful if you could add an example of processing batches of lines at a time. Thanks!

sstadick commented 2 years ago

Hi @dimo414! Thanks for opening an issue! I think the example in the README actually does what you want. It looks like .next_batch() from linereader returns a slice that is guaranteed to end with a newline. The LineBufferReader.fill() method fills a buffer and does the same thing, so LineBufferReader.buffer() will return the same thing as next_batch().

Does that answer the question?

(annotated example)

use grep_cli::stdout;
use ripline::{
    line_buffer::{LineBufferBuilder, LineBufferReader},
    lines::LineIter,
    LineTerminator,
};
use std::{env, error::Error, fs::File, io::Write, path::PathBuf};
use termcolor::ColorChoice;

fn main() -> Result<(), Box<dyn Error>> {
    let path = PathBuf::from(env::args().nth(1).expect("Failed to provide input file"));

    let mut out = stdout(ColorChoice::Never);

    let reader = File::open(&path)?;
    let terminator = LineTerminator::byte(b'\n');
    let mut line_buffer = LineBufferBuilder::new().build();
    let mut lb_reader = LineBufferReader::new(reader, &mut line_buffer);

    while lb_reader.fill()? {
        // It's the lb_reader.buffer() call here that returns &[u8]. LineIter is
        // just an optimized zero-copy iterator over the lines in that buffer.
        // You could use the returned buffer the same way you were using the
        // buffer from `next_batch`.
        let lines = LineIter::new(terminator.as_byte(), lb_reader.buffer());
        for line in lines {
            out.write_all(line)?;
        }
        lb_reader.consume_all();
    }

    Ok(())
}

dimo414 commented 2 years ago

Thanks for the help! I've migrated over to ripline and LineBufferReader is working as you describe :) I was confused because the example uses a LineTerminator that is only referenced by LineIter, so it wasn't clear that the LineBufferReader was similarly EOL-aware.
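
To make the "EOL-aware" idea concrete, here is a minimal std-only sketch of what a line-aligned batch means: the longest prefix of a chunk that ends on the terminator, with the trailing partial line held back for the next read. The helper name `split_complete_lines` is hypothetical and this is not ripline's or linereader's implementation, only an illustration of the guarantee described above.

```rust
/// Hypothetical helper (not part of ripline or linereader).
/// Splits `buf` into `(batch, leftover)`, where `batch` is the longest prefix
/// that ends with `terminator` and `leftover` is the trailing partial line.
fn split_complete_lines(buf: &[u8], terminator: u8) -> (&[u8], &[u8]) {
    // Search from the end for the last terminator byte.
    match buf.iter().rposition(|&b| b == terminator) {
        // Include the terminator itself in the batch.
        Some(idx) => buf.split_at(idx + 1),
        // No terminator yet: nothing is complete, everything is leftover.
        None => (&buf[..0], buf),
    }
}

fn main() {
    // A chunk as it might arrive from a raw read: the last line is cut off.
    let chunk = b"alpha\nbeta\ngam";
    let (batch, leftover) = split_complete_lines(chunk, b'\n');

    // Every line in `batch` is complete, so it is safe to process as a unit.
    for line in batch.split_inclusive(|&b| b == b'\n') {
        print!("{}", String::from_utf8_lossy(line));
    }

    // The partial line would be carried over and completed by the next read.
    assert_eq!(leftover, b"gam");
}
```

In this sketch the caller is responsible for carrying `leftover` into the next chunk; a buffered reader that is itself line-aware does that bookkeeping internally, which is why the buffer it hands back can be treated as a whole batch of lines.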

Some thoughts: