modula3 / cm3

Critical Mass Modula-3
http://modula3.github.io/cm3/
Other
133 stars 26 forks source link

Bug in RegEx.m3 ? #1044

Open mikanystrom-intel opened 2 years ago

mikanystrom-intel commented 2 years ago

I think I found a bug in RegEx.m3. Can someone please confirm?

If you use the "start" feature, the code can overflow on line 576:

574       ELSE
575         str_idx := start;
576         str_max := MIN(start + len, textlen)
577       END;

because the default value of len is LAST(CARDINAL):

PROCEDURE Execute (READONLY pat : Pattern; data : TEXT; start: CARDINAL := 0; len : CARDINAL := LAST(CARDINAL); mem : REF Memory := NIL ): INTEGER;

start + len wraps and goes negative.

str_max is a CARDINAL so we get a checked runtime error (range error) on line 576.

I find the behavior surprising, undocumented, and wrong. Am I right?

VictorMiasnikov commented 2 years ago

As "first impression":

} I find the behavior surprising, undocumented

IMHO, it is surprising, but ( unfortunately) documented. ;-(

} overflow on line 576

Possible solution:

VAR Diff_S_TL := textlen - start ;

And later compare Diff_S_TL with len

VictorMiasnikov commented 2 years ago

P.S. And we always can use "long arithmetic"

( L.A. has be already implemented in cm3 Modula-3 "ecosystem" )

ghost commented 2 years ago

This we called broken by design :man_facepalming: Only because it's documented and you could get away with it? Why don't think about fix it to be more sane? :-1:

RodneyBates commented 2 years ago

What is the surprise? From 2.2.3:

"CARDINAL Behaves just like the subrange [0..LAST(INTEGER)]"

So a negative value is out of range.

That "+" wraps is perhaps the one surprise.

I couldn't find it right away, but somewhere, the designers stated that they were leaving it to an implementation whether to trap integer overflow, probably to appease those who believe micro-efficiency should trump definedness. They pretty much followed the opposite principle most other places.

Note that Word.Plus is defined to wrap, if you know you want that.

mikanystrom commented 2 years ago

I'm not saying the compiler or runtime is buggy.

What I'm saying is that RegEx.m3 is buggy because it overflows on "reasonable" input.

Mika

On Sun, Sep 4, 2022, 8:06 PM Rodney M. Bates @.***> wrote:

What is the surprise? From 2.2.3:

"CARDINAL Behaves just like the subrange [0..LAST(INTEGER)]"

So a negative value is out of range.

That "+" wraps is perhaps the one surprise.

I couldn't find it right away, but somewhere, the designers stated that they were leaving it to an implementation whether to trap integer overflow, probably to appease those who believe micro-efficiency should trump definedness. They pretty much followed the opposite principle most other places.

Note that Word.Plus is defined to wrap, if you know you want that.

— Reply to this email directly, view it on GitHub https://github.com/modula3/cm3/issues/1044#issuecomment-1236389372, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKYNJMBYWXH7R74DEZEP73V4TQKJANCNFSM57VANWCQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

VictorMiasnikov commented 2 years ago

I'm not saying the compiler or runtime is buggy. What I'm saying is that RegEx.m3 is buggy because it overflows on "reasonable" input.

How about "small refactoring" -) ?

VVM} Possible solution:
VVM} VAR Diff_S_TL := textlen - start ;
VVM} And later compare Diff_S_TL with len
RodneyBates commented 2 years ago

Ah, missed that.  Sorry.

On 9/4/22 15:24, Mika Nyström wrote:

I'm not saying the compiler or runtime is buggy.

What I'm saying is that RegEx.m3 is buggy because it overflows on "reasonable" input.

Mika

On Sun, Sep 4, 2022, 8:06 PM Rodney M. Bates @.***> wrote:

What is the surprise? From 2.2.3:

"CARDINAL Behaves just like the subrange [0..LAST(INTEGER)]"

So a negative value is out of range.

That "+" wraps is perhaps the one surprise.

I couldn't find it right away, but somewhere, the designers stated that they were leaving it to an implementation whether to trap integer overflow, probably to appease those who believe micro-efficiency should trump definedness. They pretty much followed the opposite principle most other places.

Note that Word.Plus is defined to wrap, if you know you want that.

— Reply to this email directly, view it on GitHub https://github.com/modula3/cm3/issues/1044#issuecomment-1236389372, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKYNJMBYWXH7R74DEZEP73V4TQKJANCNFSM57VANWCQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/modula3/cm3/issues/1044#issuecomment-1236409218, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABSVZNEA2IBFU5EWYG2RYDLV4UAOHANCNFSM57VANWCQ. You are receiving this because you commented.Message ID: @.***>