exercism / rust

Exercism exercises in Rust.
https://exercism.org/tracks/rust
MIT License
1.48k stars 521 forks source link

palindrome-products: instructions don't match function signature #1517

Closed sbromberger closed 1 year ago

sbromberger commented 2 years ago

Specifically:

If the largest or smallest palindrome has more than one pair of factors within the range, then return all the pairs.

But the signature of the relevant function is

pub fn palindrome_products(min: u64, max: u64) -> Option<(Palindrome, Palindrome)>

Which appears to be a single pair of the smallest and largest products.

sbromberger commented 2 years ago

(also, the combinatorial approach seems to be necessary here since you have to order by product, but trying to do this on a (1000, 9999) range takes way too long.)

coriolinus commented 2 years ago

It looks like it's my fault: it was initially written such that the signature matched (in #501), but then I changed it in #847 and apparently neglected to update the instructions.

Should be a straightforward fix to update those.

sbromberger commented 2 years ago

I gotta say, this was a very frustrating exercise. It continues to be, because the O(n^2) nature of the algorithm means that the tests won't consistently pass. Could we possibly redo the tests so that we're not doing 81 million checks, or just scrap this exercise altogether? It's not really teaching much in the way of rust fundamentals.

coriolinus commented 2 years ago

Testing locally is relatively painless, as long as you think to test in release mode. Locally, using the example solution, I get

$ cargo clean && time cargo test --release -- --include-ignored
Sun 26 Jun 2022 01:53:02 AM CEST
   Compiling palindrome-products v1.2.0 (/home/coriolinus/projects/exercism/xrust/exercises/practice/palindrome-products)
    Finished release [optimized] target(s) in 0.65s
     Running unittests src/lib.rs (target/release/deps/palindrome_products-4b8a48f302615400)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/palindrome-products.rs (target/release/deps/palindrome_products-7fa68973ef893fa6)

running 14 tests
test test_empty_result_for_smallest_if_no_palindrome_in_the_range ... ok
test test_error_result_for_largest_if_min_is_more_than_max ... ok
test test_empty_result_for_largest_if_no_palindrome_in_the_range ... ok
test test_error_result_for_smallest_if_min_is_more_than_max ... ok
test test_find_the_largest_palindrome_from_double_digit_factors ... ok
test test_finds_the_largest_palindrome_from_single_digit_factors ... ok
test test_finds_the_smallest_palindrome_from_single_digit_factors ... ok
test test_palindrome_new_return_none ... ok
test test_palindrome_new_return_some ... ok
test test_find_the_smallest_palindrome_from_double_digit_factors ... ok
test test_find_smallest_palindrome_from_triple_digit_factors ... ok
test test_find_the_largest_palindrome_from_triple_digit_factors ... ok
test test_find_the_largest_palindrome_from_four_digit_factors ... ok
test test_find_smallest_palindrome_from_four_digit_factors ... ok

test result: ok. 14 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.00s

   Doc-tests palindrome-products

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

real    0m1.771s
user    0m2.836s
sys     0m0.226s

However, that's long enough that the test runner (which is quite likely allocated significantly fewer resources than are available on my local machine) is likely to take more than 2 seconds to compile and test this exercise.

I am opposed to just scrapping the exercise: it's a practice exercise, outside of the critical path on this track. if the way to get skillful at a language is to write a bunch of code in that language, then it's best to have a large number of exercises, even if those exercises don't necessarily directly teach any concepts. The point of practice is to get experience, not always to do something new.

I can understand the motivation to avoid the large N problems. However, we don't always get to avoid large N in real life. Frustration at slow execution is a prime motivator for the development of new and more efficient algorithms. Given these axioms, it seems wise to include some large N problems in the problem set, so that students get experience improving a solution which is fine for small N but does not scale.

sbromberger commented 2 years ago

Here's my output. FWIW, I was unaware of the requirement for the custom testing command (it's not in the docs anywhere; in fact, the HELP.md file just references cargo test, which is what I had been doing and what had been causing local testing timeouts.) Here's what following the instructions in HELP.md yielded on my local system:

running 13 tests
test test_empty_result_for_largest_if_no_palindrome_in_the_range ... ok
test test_error_result_for_largest_if_min_is_more_than_max ... ok
test test_empty_result_for_smallest_if_no_palindrome_in_the_range ... ok
test test_error_result_for_smallest_if_min_is_more_than_max ... ok
test test_finds_the_largest_palindrome_from_single_digit_factors ... ok
test test_finds_the_smallest_palindrome_from_single_digit_factors ... ok
test test_palindrome_new_return_none ... ok
test test_find_the_smallest_palindrome_from_double_digit_factors ... ok
test test_find_the_largest_palindrome_from_double_digit_factors ... ok
test test_find_smallest_palindrome_from_triple_digit_factors ... ok
test test_find_the_largest_palindrome_from_triple_digit_factors ... ok
test test_find_smallest_palindrome_from_four_digit_factors has been running for over 60 seconds
test test_find_the_largest_palindrome_from_four_digit_factors has been running for over 60 seconds
test test_find_smallest_palindrome_from_four_digit_factors ... ok
test test_find_the_largest_palindrome_from_four_digit_factors ... ok

test result: ok. 13 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out; finished in 140.56s

Using the custom test options you used, I get:

seth@hydrogen:~/dev/rust/exercism/palindrome-products $ cargo clean && time cargo test --release -- --include-ignored
...
cargo test --release -- --include-ignored  18.71s user 0.43s system 185% cpu 10.290 total

Both of these exceed the runner's time limits, so there is no way to get passing tests on the exercism website, and therefore no way to get a "green check".

I can understand the motivation to avoid the large N problems. However, we don't always get to avoid large N in real life. Frustration at slow execution is a prime motivator for the development of new and more efficient algorithms.

As someone who works on graphs, I'm certainly exposed to N^2 (and even N^3 on occasion!) in real life. However, there is no possible complexity improvement in the algorithm here; it necessarily has to be N^2 because of the need to sort the products of elements from two independent ranges. Therefore, there is no productive challenge here, only a futile one that will frustrate would-be attempters.

There is also nothing new to be learned from this exercise, regardless of whether it's "just a practice" or "on the track" - a distinction of which I was unaware in any case (and I've been using exercism for a while now). It was just one of the next exercises suggested to me in the list.

Finally, an argument ad populum - I note that the most-starred community example is one that 1) does not solve the problem, and 2) inveighs strongly against its inclusion in the track. After I submitted my (failing) iteration, I decided to see whether anyone had managed to implement anything that might pass the runner tests. I didn't see ANY examples.

Please reconsider the value of this particular exercise to the overall objective. It shouldn't be to frustrate users with correct code (that can't be further optimized) that nevertheless won't pass the tests on the website.

At the least, restrict the ranges in the test file so that it can pass the tests and won't take 2+ minutes locally without some undocumented testing syntax.

also ref #1497

coriolinus commented 2 years ago

For clarity, is your timing result using the example solution, or some other solution? It's not impossible, but my intuition suggests that it's unlikely that my dev machine is literally 10x as fast as yours. If, as seems more likely, the major performance difference is in the code under test, then it should be possible to make up substantial portions of that time difference by improving the code. If that does in fact happen, then I'd argue that you will have learned something valuable about programming in Rust, specifically via the large N test case in this exercise.

Keep in mind that there can be real, significant performance differences in two O(n**2) implementations; we often pretend otherwise, but the constant factors really can matter.

That's at the end of a chain of assumptions, though; I don't want to get too far ahead of myself.

sbromberger commented 2 years ago

For clarity, is your timing result using the example solution, or some other solution?

No, it was using my own first attempt. Using the example solution yielded a standard cargo test time of 32s, which is significantly faster. (My code doesn't use inline directives on the compare, and collects the products instead of mutating min and max variables since I'm practicing a functional approach.)

If the only way to succeed in this exercise is to use an imperative approach that keeps track of min and max, perhaps that advice should be offered in the instructions. It could also be an opportunity to learn about the #[inline] directive, which would at least confer some educational benefit to this otherwise pointless exercise.

(Sorry - yes, I'm frustrated. The juice here was definitely not worth the squeeze, and I am now having to question the benefit of continuing to do any more of these exercises that don't actually make me better at Rust.)

Edited to add: I note that even the example solution fails the tests on the website. Is there really no solution that will pass the current test suite?

petertseng commented 2 years ago

My below comment is only meant to answer the question "is there a solution that will pass the current test suite?" - my comment makes no judgment on the value of the exercise or the lack thereof.

If I understand correctly, solution https://exercism.org/tracks/rust/exercises/palindrome-products/solutions/petertseng passes the current test suite (I submitted it minutes before writing this, and the results screen shows the below ("14 tests passed"))

14passed

coriolinus commented 2 years ago

If the only way to succeed in this exercise is to use an imperative approach that keeps track of min and max, perhaps that advice should be offered in the instructions. It could also be an opportunity to learn about the #[inline] directive

I'm hesitant to say that it's the only way to succeed, because I bet there's a very fast .fold-based approach, but it's certainly a relatively straightforward way to approach this problem. Anyway, documentation-improvement PRs are always welcome.

The juice here was definitely not worth the squeeze, and I am now having to question the benefit of continuing to do any more of these exercises that don't actually make me better at Rust.

It's unfortunate that you feel that way. If you feel that you aren't learning, then by all means stop; you're (presumably) an adult who can choose to make the best use of your time.

iHiD commented 2 years ago

At the least, restrict the ranges in the test file so that it can pass the tests and won't take 2+ minutes locally without some undocumented testing syntax.

Could this syntax be added to the README? Is it for this exercise or all of them? Is this the syntax we use on the online test runner?

Presuming that syntax can be used on the test runner, how long would @sbromberger's code take to run there?

coriolinus commented 2 years ago

Could this syntax be added to the README?

Depends on the syntax in question:

The --release flag is already mentioned in the track README, but apparently not directly in the exercise documentation. That's something we could improve.

I'm not sure how long his solution takes to run in release mode; I haven't seen the code.

sbromberger commented 2 years ago

I'm not sure how long his solution takes to run in release mode; I haven't seen the code.

18.71 seconds

ErikSchierboom commented 2 years ago

I don't recall offhand if the test runner has been updated to respect it

There is an open PR that you've reviewed: https://github.com/exercism/rust-test-runner/pull/52/files Maybe we should try to get that merged?

targrub commented 2 years ago

My initial solution runs locally in ~27s in debug, ~2.7s in release (for which no instructions were given [spoiler: cargo test -r]). The test runner on the server must also be running in debug, since my solution goes over time there. The problem description says nothing about optimization nor running in release, and it does not hint at any tricks provided or necessary to successfully complete the exercise. If such additional work is necessary, perhaps it should instead be the focus for an advanced version of this exercise.

senekor commented 1 year ago

Closing, because the test runner was apparently improved to respect test-in-release-mode. There are plenty of passing community solutions, including mine, which doesn't do any crazy optimizations.