jilljenn / tryalgo

Algorithms and data structures for preparing programming competitions: basic and advanced
https://tryalgo.org
MIT License
371 stars 106 forks source link

Misleading comment in the book for Rabin Karp algorithm #52

Closed theflofly closed 5 years ago

theflofly commented 5 years ago

Hello,

About the line 16 of the Rabin Karp algorithm:

val = (old_val - out_digit * last_pos + DOMAIN * PRIME) % PRIME

You say in the book that you are adding DOMAIN * PRIME to counter the fact that C++ will return a negative modulo.

Even by doing this you can obtain a negative modulo as (old_val - out_digit * last_pos + DOMAIN * PRIME) can be negative, this happen if the sum produces a number bigger than 2^64.

What do you think?

Also, I am not sure if it is a real issue when the number ends up negative?

xtof-durr commented 5 years ago

Hello Florian Courtial,

yes good point, DOMAINPRIME should be small enough such that the sum stays within 63 bits. So that's why we choose PRIME less than 2^56. This way DOMAINPRIME is less than 2^63. old_val is less than 2^53. In fact,

PRIME + DOMAIN * PRIME > 2^63.

Does not sound good, but since both terms in the sum are less than 2^63, the sum is less than 2^64, so the sign bit is not set in C++, and this number remains positive. We are good.

But if the number becomes negative we are in big trouble I think. Because https://stackoverflow.com/questions/11720656/modulo-operation-with-negative-numbers/42131603 in C++ the modulo of a negative number is negative, and that is not what we want.

Christoph

theflofly commented 5 years ago

I see, indeed it shouldn't be possible to have a number > 2^64.

You wrote old_val is less than 2^53, did you mean 2^56?

I don't see from where the 53 is coming?

xtof-durr commented 5 years ago

Yes it should be 2^{56}. But I cannot find where we wrote 53. Maybe in an old version of the code or of the book.

theflofly commented 5 years ago

I was referencing your answer to my initial comment.

Thanks for answering.