BurntSushi / rust-csv

A CSV parser for Rust, with Serde support.
The Unlicense

csv::ReaderBuilder::from_reader() moves the file reader. #301

Open ruffianeo opened 1 year ago

ruffianeo commented 1 year ago

version = "1.2.0"

I have a mixed content scenario in my files:

let mut file_reader = std::io::BufReader::new(f);
let mut nfields_string = String::new();
let _ = file_reader
  .read_line(&mut nfields_string)
  .expect("rankings.txt - could not read first line.");
let mut csv_reader = csv::ReaderBuilder::new()
  .has_headers(false)
  .trim(csv::Trim::Fields)
  .from_reader(file_reader);
// ... no access to <more-other-stuff> because the file_reader has been moved!
// ...

This use case is not really supported the way the crate works right now.

The issue also pops up, for example, if you want to dump the remainder of the file into some error log and you use `Error::position()` to seek the file reader and read to the end. And given that I see sporadic CSV reader failures in my application (the CSV records are also written with the CSV writer of the same crate), it is hard to track down those failures in a live environment.

Include a complete program demonstrating a problem.

See above for a snippet, showing the inconvenience. The inconvenience is by design. So a working example of how the file_reader is moved is not really helpful.

What is the expected or desired behavior of the code above?

The expected behavior is that from_reader() does not move the file reader, or that there is an alternative so that mixed file content scenarios are better supported.

BurntSushi commented 1 year ago

I'm not clear what the issue is here. The csv crate does not require ownership of a reader. It only requires something that implements the Read trait. So you can pass &mut rdr instead of rdr for example. And even if you can't do that for some reason, you can use the into_inner API to get the underlying reader back out.
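The two options mentioned above (lending the reader with `&mut`, or handing it over and recovering it with `into_inner`) can be sketched with the standard library alone. `Consumer`, `take_bytes`, and `mixed_content` are hypothetical stand-ins invented for this illustration, not part of the csv crate; the only assumption carried over is that the constructor is generic over any `R: Read` and takes it by value. Unlike this stand-in, a real CSV reader may buffer ahead internally, so the sketch does not model where the underlying reader ends up after parsing.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

// Hypothetical stand-in for a parser whose constructor accepts any `R: Read` by value.
struct Consumer<R: Read> {
    inner: R,
}

impl<R: Read> Consumer<R> {
    fn from_reader(inner: R) -> Self {
        Consumer { inner }
    }

    // Read up to `n` bytes from the wrapped reader (assumes UTF-8 input).
    fn take_bytes(&mut self, n: u64) -> String {
        let mut out = String::new();
        self.inner
            .by_ref()
            .take(n)
            .read_to_string(&mut out)
            .expect("read failed");
        out
    }

    // Give the wrapped reader back to the caller.
    fn into_inner(self) -> R {
        self.inner
    }
}

fn mixed_content(data: &str) -> (String, String, String) {
    let mut file_reader = BufReader::new(Cursor::new(data.as_bytes().to_vec()));

    // Leading non-CSV line, read directly from the buffered reader.
    let mut first_line = String::new();
    file_reader
        .read_line(&mut first_line)
        .expect("could not read first line");

    // Option 1: lend the reader. `&mut BufReader<_>` implements `Read`,
    // so nothing is moved and `file_reader` is usable again afterwards.
    let middle = {
        let mut consumer = Consumer::from_reader(&mut file_reader);
        consumer.take_bytes(6)
    };

    // Option 2: move the reader in, then recover it with `into_inner`.
    let consumer = Consumer::from_reader(file_reader);
    let mut file_reader = consumer.into_inner();

    let mut rest = String::new();
    file_reader.read_to_string(&mut rest).expect("read failed");
    (first_line, middle, rest)
}

fn main() {
    let (first, middle, rest) = mixed_content("3\na,b,c\ntrailer\n");
    println!("{:?} {:?} {:?}", first, middle, rest);
}
```

In the snippet from the original report, the equivalent of option 1 would be passing `&mut file_reader` to `from_reader()` instead of `file_reader`.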

Also, any csv data written by this crate is definitely going to be parseable by this crate too. If you're getting errors, it's likely due to something else.

The mixed content use case is indeed quite tricky, but I think it's inherently tricky unless you roll your own parser. For example, it might be simpler to use the csv-core crate instead.

ruffianeo commented 1 year ago

Yes, I apologize. Using &mut file_reader actually works. Not the brightest of days I had when I wrote this issue... As for my sporadic corruptions: they showed up whenever the new version of the file was shorter than the previous one, because I also have to call .truncate(true) on the file I overwrite.
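The truncation problem described above can be reproduced with `std::fs::OpenOptions` alone. This is a minimal sketch rather than the reporter's actual code: the helper names `overwrite` and `demo`, the temp-file name, and the sample records are all made up for illustration.

```rust
use std::fs::{self, OpenOptions};
use std::io::Write;
use std::path::Path;

// Overwrite `path` with `contents`, optionally truncating it first,
// and return what is on disk afterwards.
fn overwrite(path: &Path, contents: &str, truncate: bool) -> std::io::Result<String> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(truncate)
        .open(path)?;
    file.write_all(contents.as_bytes())?;
    drop(file); // close the handle before reading the file back
    fs::read_to_string(path)
}

fn demo() -> std::io::Result<(String, String)> {
    // Hypothetical scratch file in the system temp directory.
    let path = std::env::temp_dir().join(format!("truncate_demo_{}.csv", std::process::id()));

    // Long record first, then a shorter one WITHOUT truncation:
    // the tail of the old record survives behind the new data.
    overwrite(&path, "alpha,beta,gamma\n", true)?;
    let stale = overwrite(&path, "a,b\n", false)?;

    // Same sequence WITH truncation: only the new record remains.
    overwrite(&path, "alpha,beta,gamma\n", true)?;
    let clean = overwrite(&path, "a,b\n", true)?;

    fs::remove_file(&path)?;
    Ok((stale, clean))
}

fn main() -> std::io::Result<()> {
    let (stale, clean) = demo()?;
    println!("without truncate: {:?}", stale);
    println!("with truncate:    {:?}", clean);
    Ok(())
}
```

Without truncation the file ends up holding the new record followed by leftover bytes of the old one, which a CSV parser then sees as a malformed or unexpected extra record.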