Exercise Idea: Kafka Printer

sshine commented 5 years ago

This exercise idea is based on a story from my current work place: My boss'es Ubuntu had a bug that made his printer, when asked to print N copies of something, instead print N² copies. So for our Thursday status meeting, he'd hope that the number of people who show up, for whom he would bring a copy of his status report, was targetable with as few print requests as possible.

For example, if 13 people would show up to the meeting, he would first ask the printer of 3 copies (yielding 9 copies), and then 2 copies (yielding 4 copies), with a final result of 3² + 2² = 13 copies. For some numbers of employees, this was simply not possible with two print jobs, and he'd need to consult the printer a third time.

The exercise is to write a function that computes the groan factor, being the minimal number of print jobs it takes to print N copies of a status report when the printer gives you back N² copies.

Examples:

It takes 1 print job to print 1 copy because 1² = 1.
It takes 2 print jobs to print 5 copies: 2² + 1² = 5.
It takes 3 print jobs to print 11 copies: 3² + 1² + 1² = 11.

The suggested exercise name underlines the kafkaesque hope that a specific number of employees N show up on Thursdays such that ∃ a, b ∈ ℕ₀: a² + b² = N for optimal working conditions.

sshine commented 5 years ago

The opposite problem of finding the first N for which at least P print jobs are necessary is also interesting. For example, in order to print 7 copies of a status report, you can't go with

P = 1, since 2² = 4 gives too few copies and 3² = 9 gives too many copies,
P = 2, since 2² + 1² = 5 gives too few copies and 2² + 2² = 8 gives too many copies,
P = 3, since 2² + 1² + 1² = 6 gives too few copies and 2² + 2² + 1² = 9 gives too many copies,

But you can go with P = 4 since 2² + 1² + 1² + 1² = 7.

But one problem at a time.

cmccandless commented 5 years ago

The exercise is to write a function that computes the groan factor, being the minimal number of print jobs it takes to print N copies of a status report when the printer gives you back N² copies.

Love the phrasing here.

kytrinyx commented 5 years ago

groan factor

Groan factor is amazing. Reminds me of the hipmunk travel site's "sort by agony" feature.

That said... we've put this repo in lock down. @sshine My recommendation is to implement this as a track-level exercise wherever you want it, and then once we've re-opened this repo we will have some more high-level thoughts about what would go in here.

ErikSchierboom commented 5 years ago

Love this idea, and the backstory is already done! As Katrina said, the repository is in lock-down, but I'd be happy to help you build this exercise as a track-specific exercise if you need any help.

petertseng commented 5 years ago

Jeez, I had my code go up to 100,000 looking for a number that would take >= 5 print jobs. Then I got suspicious, looked for an explanation, and found https://en.wikipedia.org/wiki/Lagrange%27s_four-square_theorem. Wasn't expecting that.

coriolinus commented 5 years ago

Huh, interesting! In that case, the forward version of the problem ("given the number of attendees, find the number of print jobs") is more interesting than the reverse version of the problem ("given the number of print jobs, find the number of attendees required to require that result").

On Tue, Aug 27, 2019 at 10:14 AM Peter Tseng notifications@github.com wrote:

Jeez, I had my code go up to 100,000 looking for a number that would take

= 5 print jobs. Then I got suspicious, looked for an explanation, and found https://en.wikipedia.org/wiki/Lagrange%27s_four-square_theorem. Wasn't expecting that.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/exercism/problem-specifications/issues/1570?email_source=notifications&email_token=AB3V4TSF5BUHM6NKTQELNHTQGTPADA5CNFSM4IL6LCM2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5G443Y#issuecomment-525192815, or mute the thread https://github.com/notifications/unsubscribe-auth/AB3V4TXY2E44K43JKK3SZZTQGTPADANCNFSM4IL6LCMQ .

sshine commented 5 years ago

I'd be happy to help you build this exercise as a track-specific exercise

@ErikSchierboom: For what track would you like to do it?

ErikSchierboom commented 5 years ago

@sshine I'm thinking F# and Haskell at first?

sshine commented 5 years ago

@ErikSchierboom: I've written a Haskell solution. Let me know if you'd like to compare them once yours is done or before that. I'm not completely satisfied with mine, so I'd like suggestions for improvement. @petertseng: If you're interested in sharing yours at some point in whatever language you used, feel free to.

ErikSchierboom commented 5 years ago

@sshine Great! I'm not sure how soon I have time to do this though, as I have a vacation coming up :)

sshine commented 5 years ago

Happy holidays then! There's no rush whatsoever.

petertseng commented 5 years ago

I was also unsure of my solution, so I used a few things to help:

Obviously, square numbers require only one job.
https://en.wikipedia.org/wiki/Sum_of_two_squares_theorem tells us which numbers require two print jobs.
https://en.wikipedia.org/wiki/Legendre%27s_three-square_theorem tells us which numbers require three print jobs.
https://oeis.org/A004215 is a handy way to verify the list of numbers that require four print jobs.
We've already discussed https://en.wikipedia.org/wiki/Lagrange%27s_four-square_theorem
https://oeis.org/A002828 can verify the first 106 elements of the list.

The resulting code is in https://gist.github.com/petertseng/c40e34da40c39bdd2eee55e32bc2187a. The original implementation was in Ruby. I ported it to Crystal to get it to be faster and Haskell as well, but the latter two are pretty much blind translations of the original instead of utilising any specific characteristics of the target language. As a result, the Crystal implementation is the fastest, followed by Haskell, and Ruby is the slowest.

sshine commented 5 years ago

My solution in Haskell looks like this.

I refrained from doing prime factorization and finding P = 1 solutions in sub-linear time.

I don't know if this is really a good place for inventing a type class.

The A002828 series was incredibly useful for debugging, thanks! I couldn't find this because I'd forgotten to prefix the series with a 0. We should include a link to this series in the problem description. OEIS does contain sample solutions in Maple and Mathematica. I don't think this should be considered a problem, except perhaps for potential Maple or Mathematica tracks.

Wrt. canonical tests: We should base them off a subset of A002828. Since the function handles P = 1, P = 2 and P = 3 solutions differently, it may be worth to list the tests in an order such that incremental failure first addresses all P = 0 examples, then all P = 1 examples, and so on. I think this will be more TDD.

Proposed test order:

First a gotcha:

N = 0 ⇒ P = 0

Then the first P = 1s:

N = 1 ⇒ P = 1
N = 4 ⇒ P = 1
N = 9 ⇒ P = 1
N = 16 ⇒ P = 1

Then the first P = 2s:

N = 2 ⇒ P = 2
N = 5 ⇒ P = 2
N = 8 ⇒ P = 2
N = 10 ⇒ P = 2

Then the first P = 3s:

N = 3 ⇒ P = 3
N = 6 ⇒ P = 3
N = 11 ⇒ P = 3
N = 12 ⇒ P = 3

Then the first P = 4s:

N = 7 ⇒ P = 4
N = 15 ⇒ P = 4
N = 23 ⇒ P = 4
N = 28 ⇒ P = 4

And lastly a few high ones:

N = 100 ⇒ P = 1
N = 101 ⇒ P = 2
N = 102 ⇒ P = 3
N = 103 ⇒ P = 4

sshine commented 5 years ago

This function is hard to property test. If instead of returning the number of nonzero coefficients, P, it returned a collection of nonzero coefficients, one could verify that there exists a solution with those coefficients, but not easily that there doesn't exist one with fewer (without providing a solution to the problem). As such, it is excellent to test with the current format of canonical tests.

exercism / problem-specifications

Exercise Idea: Kafka Printer #1570