EbTech / rust-algorithms

Common data structures and algorithms in Rust
MIT License
3.77k stars 220 forks source link

Reading lots of integers #11

Closed koutoftimer closed 4 years ago

koutoftimer commented 4 years ago

Reading 2e5 integers from the file.

Your solution with unsafe gives times:

> cargo run
0.380072
> cargo run
0.328795
> cargo run
0.320145

Handmade parser over byte stream:

> cargo run
0.14436
> cargo run
0.130388
> cargo run
0.13056
code with test case attached ```rust use std::fs; use std::io::{self, prelude::*}; use std::time::Instant; use std::error::Error; struct Parser { it: std::vec::IntoIter, } impl Parser { fn new(bytes: Vec) -> Parser { Parser { it: bytes.into_iter() } } } impl Iterator for Parser { type Item = i32; fn next(&mut self) -> Option { let mut ans = 0; let mut sign = 1; let mut started = false; while let Some(b) = self.it.next() { if b >= b'0' && b <= b'9' { started = true; ans = (b - b'0') as i32 + ans * 10; } else if b == b'-' { sign = -1; } else if started == true { break; } } if started { Some(ans * sign) } else { None } } } fn read() -> Result<(), Box> { let mut reader = io::BufReader::new(fs::File::open("stdin.txt")?); let mut bytes: Vec = Vec::new(); reader.read_to_end(&mut bytes)?; let mut parser = Parser::new(bytes); let n = parser.next().ok_or("N")? as usize; let mut v: Vec = Vec::with_capacity(n); let mut empty = 0; for token in parser { if token == -1 { empty += 1; } else { let token = token - 1; v.push(token); } } Ok(()) } fn main() { let instant = Instant::now(); if let Err(error) = read() { eprintln!("{:?}", error); } println!("{}", instant.elapsed().as_micros() as f64 / 1e6); } ``` [stdin.txt](https://github.com/EbTech/rust-algorithms/files/4782171/stdin.txt)

.parse() is an evil. Handmade solution faster in comparison to C++ with -O2 using std::scanf and reading into an array.

0.171025
0.15101
0.149998
c++ code ```c++ #include #include template struct measure { template static double execution(F&& func, Args&&... args) { auto start = std::chrono::steady_clock::now(); std::forward(func)(std::forward(args)...); auto duration = std::chrono::duration_cast< TimeT> (std::chrono::steady_clock::now() - start); return double(duration.count()) / 1e6; } }; void solve() { std::freopen("stdin", "r", stdin); std::freopen("stdout", "w", stdout); int n, empty = 0; std::scanf("%d", &n); int v[size_t(2e5)]{-1}; for (size_t i = 0, p; i < n; ++i) { std::scanf("%d", v + i); } } int main () { std::cerr << measure<>::execution(solve) << std::endl; } ```
EbTech commented 4 years ago

Thanks, that's quite fast!

I like to have a generic parse() that applies to a variety of types. On integer types, it redirects to from_str_radix(). Maybe all the checked adds and multiplies are what's slowing it down?

I think contest problems generally can handle the overhead, but traits and macros could be used to defer certain types if you'd like the speed of handcoded implementations. A standard library solution just feels a bit safer to me. Even UnsafeScanner's only assumption is that the input be valid UTF-8, which is true for all programming contests of which I'm aware; in fact, they only use the ASCII subset.

koutoftimer commented 4 years ago

I agree that it is better to stay generic, but I almost lost my hope in this approach because of Problem B on Codeforces.

Writing output is even slower than reading input data. Datastructures like Vec<i32>, even made via ::with_capacity constructor are too much slow..

It took me 8 tried to complete solution. Despite it was good playgroud, I'm still a bit disapointed that Rust with opt-level=0 is SO SLOW with I\O and all kinds of checks (like overflows for each numeric operation).

What I discovered:

  1. Using std is pretty much always faster than handmade calculations with for loops because of additional checks made for opt-level=0.
  2. Using Vec data structure is slower then regular arrays.

Do you know some tips how to stay generic and do not loose performance on opt-leve=0 ? IMHO, asking rust community sounds stupid, because they can asnwer something like "we have opt-level=3 for speed optimized builds" (which is really damn fast).

UPD: Note: I haven't mentioned, that when I talk about "regular array" I mean stack allocated. Heap allocation for large vectors takes a lot of time in comparison to arrays on stack.

EbTech commented 4 years ago

Wow 1000 ms to 78 ms! That's... surprising. I might try this problem in the morning. I thought Codeforces uses opt level 2 with Rust, same as with C++. Were your first tests with O0 as well? I've used Vecs without issue so far, though I haven't run any comparisons of my own...

EbTech commented 4 years ago

Your solution with UnsafeScanner seems to run fine, taking 62 ms on the judge. The safe version should work too. The Vec-only solution did take twice as long and substantially more memory, I'm not sure why.

Btw, I think only users who've solved the problem or enabled coach mode are able to access these Gym links!

EbTech commented 4 years ago

Apparently, Codeforces invokes rustc like this. -O should be the same as optimization level 2.

EbTech commented 4 years ago

The example I gave for your Problem B seemed fast enough; do you have a programming contest example where I/O speed was an issue?

koutoftimer commented 4 years ago

@EbTech looks like this thread makes no sense because 1 sec execution time was just a lag.

I'll experiment more later. I'm on chapter 15 of The Book right now. I want to try macros approach and modify scanner in iterator fashion (for the case if intput is one long line).

I think only users who've solved the problem or enabled coach mode are able to access these Gym links!

I'm not able to.

EbTech commented 4 years ago

Sure thing! The benchmarks you provided may be useful later; you might edit your original comment to say which optimization level you used there. Good luck!

Edit: here's the submission that ran in 62 ms, if you couldn't access the CF links.