Closed samoylovfp closed 7 months ago
While a good step in the direction of a simpler version FritteX14's solution doesn't solve all the concerns in this issue, it is still using BufReader
. I'd also argue it's barely less efficient because it still avoids reading the contents of the file into memory. Further because it explicitly states it's more efficient it confuses some new users on Stack Overflow possible more. I'd say labeling the original version as "efficient" is kind of a mislabel here.
The current examples are still confusing for newcomers, especially because both version use io::BufReader::new(file).lines()
and have essentially the same efficiency. The comment This process is more efficient than creating a String in memory especially working with larger files
seems to no longer apply to the current examples. I think some changes would help:
.unwrap()
-style code is not the subject of these example and is just adding to confusion.Vec<String>
.AsRef<Path>
, BuffReader
, and returning an iterator.read_until()
and reusing the same line buffer, maybe even a Vec<u8>
buffer that only gets promoted to &str
based on some criteria.The comment
This process is more efficient than creating a String in memory especially working with larger files
seems to no longer apply to the current examples.
I think you are basing you feedback on the rust-by-example book, if you look at the source you see that comment still applies.
Use the same idiomatic error handling in all examples, as converting from
.unwrap()
-style code is not the subject of these example and is just adding to confusion.
Good point, a look around this repo show most file examples use match to handle errors for example:
let mut file = match File::create(&path) {
Err(why) => panic!("couldn't create {}: {}", display, why),
Ok(file) => file,
};
It might be better to match that style. However rust by example is not only for beginners replacing unwrap with match makes it harder to copy the code and build on it. Using ?
would be even better for copying and make it even less understandable for newcomers. Maybe keep unwrap in the more efficient case and add a note with a link to the explanation in the book?
Keep the second mostly example as-is, with comment-level explanations about using
AsRef<Path>
,BuffReader
, and returning an iterator.
Most of my non library code does not use AsRef
, as its usually not needed. It makes the example more complicated and might push readers to overuse generics. The other file examples do not make use of it.
Add a third example using
read_until()
and reusing the same line buffer, maybe even aVec<u8>
buffer that only gets promoted to&str
based on some criteria.
IMO if we add a third example centered around performance it should clearly note that this degrades readability and is usually the wrong approach. We might also want to proof it actually improves performance using a benchmark. At which point it get to complicated for rust-by-example
.
It would fit the rust performance book. A link to that might be nice here.
I think you are basing you feedback on the rust-by-example book, if you look at the source you see that comment still applies.
Ah indeed I was looking at doc.rust-lang.org, the git version makes a lot more sense. Is it a matter of waiting for the next deployment on d.r-l.o ?
It might be better to match that style. However rust by example is not only for beginners replacing unwrap with match makes it harder to copy the code and build on it. Using ? would be even better for copying and make it even less understandable for newcomers. Maybe keep unwrap in the more efficient case and add a note with a link to the explanation in the book?
Not sure what is the best style to adopt, there are good arguments for unwrap
, match
and ?
. I'd lean toward ?
to encourage best practices, but the main thing it to remain consistent, at least within this example and hopefully throughout the book.
Most of my non library code does not use AsRef, as its usually not needed. ... IMO if we add a third example centered around performance it should clearly note that this degrades readability and is usually the wrong approach.
I think both AsRef and read_until make sense in a third example with a foreword that they might be overkill and that the second example is often good enough. But they're both pretty classic tweaks for flexible API (AsRef) and speedier parsing (read_until).
Delaying the from_utf8()
call is indeed a lot of complexity for YMMV gains, so we can leave it out.
Hi, I'm closing this issue just to clean up the items that already exist. If you think this issue makes sense to stay open, please create a new issue with updated information and mention this one.
thanks!
Hello! First of all thanks for the amazing resource!
The read_lines example features an optimized way of reading lines from a file with a
BufReader
.The code is efficient but is pretty involved and makes a newcomer stumble, forcing them to read about
Result
s,BufReader
s,AsRef
and generics with awhere
clause.My new-to-rust friend ran into this when trying to solve challenges from Advent of Code.
Most challenges from AoC require reading a sequence of lines from an input file and I feel it would be good to also have a less efficient example that exposes less new concepts to a newcomer, because this is the first thing you run into when trying to solve an AoC challenge.
Example
or
I am still not happy about having to mention
&str
toString
conversion but I feel like even though these examples are longer they apply less cognitive pressure to someone not familiar with rust