c3d / db48x

RPL runtime for the DM42 calculator, in the spirit of HP48/49/50
http://48calc.org
GNU Lesser General Public License v3.0
85 stars 10 forks source link

Crash with factorial divided by factors #70

Closed c3d closed 1 year ago

c3d commented 1 year ago

Computing a factorial of 25, then dividing by its factors successively leads to

  1. A scenario where we have a fraction instead of a whole number
  2. A crash in the garbage collector

This was found testing the C++ interface for number processing, but was shown to exist on the stable branch, so not a regression.

c3d commented 1 year ago

This seems to have disappeared after various fixes on the processing of bignums, but keeping it open because there was no fix specifically for the generation of a malformed object or a crash in the garbage collector.

c3d commented 1 year ago

Got the crash again after implementing a stress test for big fractions that looks like this:

    step("Manual computation of 997/100!");
    test(CLEAR, 997, ENTER);
    for(uint i = 1; i <= 100; i++)
        test(i * 997 % 101, DIV);
    expect("997/"
           "9332621544394415268169923885626670049071596826438162146859296389521"
           "7599993229915608941463976156518286253697920827223758251185210916864"
           "0000000000000000000000");

The crash had the following recorder dump, which seems to indicate that a garbage collection ran into trouble during the creation of a bignum of size 486 bytes, with 363 bytes available.

[31086174 91.659522] gc_details: Scanning object 0x10deca4b6 (ends at 0x10deca4c8)
[31086175 91.659523] gc_details: Recycling 0x10deca4b6 size 18 total 1089
[31086176 91.659523] gc_details: Scanning object 0x10deca4c8 (ends at 0x10deca4cb)
[31086177 91.659523] gc_details: Recycling 0x10deca4c8 size 3 total 1092
[31086178 91.659523] gc_details: Scanning object 0x10deca4cb (ends at 0x10deca4ea)
[31086179 91.659524] gc_details: Recycling 0x10deca4cb size 31 total 1123
[31086180 91.659524] gc_details: Scanning object 0x10deca4ea (ends at 0x10deca4ed)
[31086181 91.659525] gc_details: Recycling 0x10deca4ea size 3 total 1126
[31086182 91.659525] gc_details: Scanning object 0x10deca4ed (ends at 0x10deca51e)
[31086183 91.659525] gc_details: Recycling 0x10deca4ed size 49 total 1175
[31086184 91.659526] gc_details: Scanning object 0x10deca51e (ends at 0x10deca521)
[31086185 91.659526] gc_details: Found 0x10deca51e in GC-safe pointer 0x10deca521 (0x16b8ca5b0)
[31086186 91.659526] gc_details: Moving 0x10deca51e-0x10deca521 to 0x10deca087
[31086187 91.659526] gc_details: Adjustment range is 0x10deca51e-0x10deca521, no scratch
[31086188 91.659526] gc_details: Scanning object 0x10deca521 (ends at 0x10deca576)
[31086189 91.659526] gc_details: Found 0x10deca521 in GC-safe pointer 0x10deca576 (0x16b8ca590)
[31086190 91.659526] gc_details: Moving 0x10deca521-0x10deca576 to 0x10deca08a
[31086191 91.659526] gc_details: Adjustment range is 0x10deca521-0x10deca576, no scratch
[31086192 91.659527] gc_details: Adjusting GC-safe 0x16b8ca5b0 from 0x10deca521 to 0x10deca08a
[31086193 91.659527] gc_details: Adjusting GC-safe 0x16b8ca618 from 0x10deca521 to 0x10deca08a
[31086194 91.659527] gc_details: Adjusting GC-safe 0x16b8ca7b8 from 0x10deca521 to 0x10deca08a
[31086195 91.659527] gc_details: Scanning object 0x10deca576 (ends at 0x10deca579)
[31086196 91.659527] gc_details: Found 0x10deca576 in GC-safe pointer 0x10deca579 (0x16b8ca498)
[31086197 91.659527] gc_details: Moving 0x10deca576-0x10deca579 to 0x10deca0df
[31086198 91.659527] gc_details: Adjustment range is 0x10deca576-0x10deca579, no scratch
[31086199 91.659527] gc_details: Adjusting GC-safe 0x16b8ca590 from 0x10deca576 to 0x10deca0df
[31086200 91.659527] gc_details: Adjusting GC-safe 0x16b8ca5f8 from 0x10deca576 to 0x10deca0df
[31086201 91.659527] gc_details: Adjusting GC-safe 0x16b8ca638 from 0x10deca576 to 0x10deca0df
[31086202 91.659528] gc_details: Adjusting GC-safe 0x16b8ca7a8 from 0x10deca576 to 0x10deca0df
[31086203 91.659528] gc_details: Adjustment range is 0x10deca579-0x10deca668, scratch
[31086204 91.659528] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca5cc to 0x10deca135
[31086205 91.659528] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca579 to 0x10deca0e2
[31086206 91.659529] gc: Garbage collection done, purged 1175, available 1314
[31086207 91.659529] gc_details: Adjustment range is 0x10deca0e2-0x10deca668, scratch
[31086208 91.659529] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca135 to 0x10deca1c8
[31086209 91.659529] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca0e2 to 0x10deca175
[31086210 91.659530] runtime: Initializing object 0x10deca175 type 19 bignum size 3
[31086211 91.659531] gc_details: Adjustment range is 0x10deca175-0x10deca668, scratch
[31086212 91.659531] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca176 to 0x10deca179
[31086213 91.659531] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca175 to 0x10deca178
[31086214 91.660294] keys: Push key 0 (wr 5289 rd 5289)
[31086215 91.660295] sim_keys: Key 0 coords (0, 0, 0, 0)
[31086216 91.660777] runtime: Initializing object 0x10deca178 type 19 bignum size 205
[31086217 91.660777] gc_details: Adjustment range is 0x10deca178-0x10deca668, scratch
[31086218 91.660778] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca208 to 0x10deca2d5
[31086219 91.660778] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca178 to 0x10deca245
[31086220 91.660779] runtime: Initializing object 0x10deca245 type 19 bignum size 3
[31086221 91.660779] gc_details: Adjustment range is 0x10deca245-0x10deca668, scratch
[31086222 91.660779] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca246 to 0x10deca249
[31086223 91.660779] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca245 to 0x10deca248
[31086224 91.663164] runtime: Initializing object 0x10deca248 type 19 bignum size 338
[31086225 91.663164] gc_details: Adjustment range is 0x10deca248-0x10deca668, scratch
[31086226 91.663164] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca312 to 0x10deca464
[31086227 91.663164] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca248 to 0x10deca39a
[31086228 91.663165] runtime: Initializing object 0x10deca39a type 19 bignum size 3
[31086229 91.663165] gc_details: Adjustment range is 0x10deca39a-0x10deca668, scratch
[31086230 91.663166] gc_details: Adjusting GC-safe 0x16b8ca488 from 0x10deca39b to 0x10deca39e
[31086231 91.663166] gc_details: Adjusting GC-safe 0x16b8ca498 from 0x10deca39a to 0x10deca39d
[31086232 91.669417] runtime: Initializing object 0x10deca39d type 19 bignum size 486
[31086233 91.669417] gc: Garbage collection, available 363, range 0x10dec9f3f-0x10deca39d
[31086234 91.669459] recorder_signals: Received signal 0x134ecee80, dumping recorder
[31086235 91.669459] recorder_signals: Received signal Segmentation fault: 11 (11) si_addr=0x40942d4da65ebc9, ucontext 0x11621bc18, dumping recorder
[31086236 91.669461] recorder: Recorder dump

Note that I may be hitting the limits of the kind of test I can do with 2K of memory. Still, I should get an Out of memory error, not a crash.

c3d commented 1 year ago

The stack trace looks like this:

==43800==The signal is caused by a READ memory access.
    #0 0x1048062f4 in unsigned short leb128<unsigned short, unsigned char>(unsigned char*&) leb128.h:64
    #1 0x104842bd8 in runtime::integrity_test(object const*, object const*, object const**, object const**) runtime.cc:185
    #2 0x104842110 in runtime::gc() runtime.cc:303
    #3 0x104841fe4 in runtime::available(unsigned long) runtime.cc:146
    #4 0x104838d20 in bignum* runtime::make<bignum, runtime::gcp<unsigned char const>, unsigned long>(bignum::id, runtime::gcp<unsigned char const> const&, unsigned long const&) runtime.h:773
    #5 0x1048395e4 in bignum::quorem(runtime::gcp<bignum const>, runtime::gcp<bignum const>, object::id, runtime::gcp<bignum const>*, runtime::gcp<bignum const>*) bignum.cc:620
    #6 0x104839888 in operator%(runtime::gcp<bignum const>, runtime::gcp<bignum const>) bignum.cc:650
    #7 0x10483bdb4 in gcd(runtime::gcp<bignum const>, runtime::gcp<bignum const>) fraction.cc:197
    #8 0x10483b888 in big_fraction::make(runtime::gcp<bignum const>, runtime::gcp<bignum const>) fraction.cc:209
    #9 0x10483cc04 in operator/(runtime::gcp<fraction const>, runtime::gcp<fraction const>) fraction.cc:282
    #10 0x10484d978 in div::fraction_ok(runtime::gcp<fraction const>&, runtime::gcp<fraction const>&) arithmetic.cc:408
    #11 0x10484b57c in arithmetic::evaluate(object::id, runtime::gcp<algebraic const>&, runtime::gcp<algebraic const>&, void (*)(BID_UINT128*, BID_UINT128*, BID_UINT128*), void (*)(unsigned long long*, unsigned long long*, unsigned long long*), void (*)(unsigned int*, unsigned int*, unsigned int*), bool (*)(object::id&, object::id&, unsigned long long&, unsigned long long&), bool (*)(runtime::gcp<bignum const>&, runtime::gcp<bignum const>&), bool (*)(runtime::gcp<fraction const>&, runtime::gcp<fraction const>&), bool (*)(runtime::gcp<algebraic const>&, runtime::gcp<algebraic const>&, object::id&, object::id&)) arithmetic.cc:691
    #12 0x10484c078 in arithmetic::evaluate(object::id, void (*)(BID_UINT128*, BID_UINT128*, BID_UINT128*), void (*)(unsigned long long*, unsigned long long*, unsigned long long*), void (*)(unsigned int*, unsigned int*, unsigned int*), bool (*)(object::id&, object::id&, unsigned long long&, unsigned long long&), bool (*)(runtime::gcp<bignum const>&, runtime::gcp<bignum const>&), bool (*)(runtime::gcp<fraction const>&, runtime::gcp<fraction const>&), bool (*)(runtime::gcp<algebraic const>&, runtime::gcp<algebraic const>&, object::id&, object::id&)) arithmetic.cc:794
    #13 0x10484d428 in object::result arithmetic::evaluate<div>() arithmetic.cc:834
    #14 0x104827a60 in div::do_evaluate(div const*) arithmetic.h:151
    #15 0x10482b134 in object::evaluate() const object.h:302
    #16 0x1048271c4 in object::do_execute(object const*) object.cc:258
    #17 0x1048141a4 in object::execute() const object.h:312
    #18 0x10481b860 in user_interface::handle_functions(int) user_interface.cc:2552
    #19 0x1048199ec in user_interface::key(int, bool) user_interface.cc:328
    #20 0x104815210 in handle_key(int, bool) main.cc:131
    #21 0x104814ff0 in program_main main.cc:313
    #22 0x10480cb98 in RPLThread::run() sim-rpl.cpp:60
    #23 0x107788b3c in QThreadPrivate::start(void*)+0x148 (QtCore:arm64+0x180b3c) (BuildId: f0f3e1d7cb2a38ccb5b0bc3691c722f332000000200000000100000000000d00)
    #24 0x1ac483fa4 in _pthread_start+0x90 (libsystem_pthread.dylib:arm64e+0x6fa4) (BuildId: 46d35233a0513f4fbba4ba56dddc4d1a32000000200000000100000000040d00)
    #25 0x1ac47ed9c in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x1d9c) (BuildId: 46d35233a0513f4fbba4ba56dddc4d1a32000000200000000100000000040d00)

The integrity test is failing and crashes trying to read what looks like a bad pointer 0x40942d4da65ebc9.

c3d commented 1 year ago

Running in the debugger causes a similar crash, where the first level of the stack is actually corrupt:

  5:  014: Manual computation of 997/100!                              Process 46332 stopped
* thread #6, name = 'QThread', stop reason = EXC_BAD_ACCESS (code=1, address=0xf7475546c088da7e)
    frame #0: 0x000000010001a2f4 simulator`unsigned short leb128<unsigned short, unsigned char>(unsigned char*&) + 16 at /Users/ddd/Work/calc/db48x/src/leb128.h:64
   61   //   Return the leb128 value at pointer
   62   // ----------------------------------------------------------------------------
   63   {
-> 64       if (bp[0] < 0x80)
                ^
   65           return *bp++;
   66       uint16_t b1 = *bp++ & 0x7F;
   67       uint16_t b2 = *bp++ << 7;
Target 0: (simulator) stopped.
(lldb) up
up
frame #1: 0x000000010001a2a4 simulator`object::type() const + 36 at /Users/ddd/Work/calc/db48x/src/object.h:227
   224      // ------------------------------------------------------------------------
   225      {
   226          byte *ptr = (byte *) this;
-> 227          id ty = (id) leb128<uint16_t>(ptr);
                             ^
   228          if (ty > NUM_IDS)
   229          {
   230              object_error(ty, this);
(lldb)

frame #2: 0x0000000100056bdc simulator`runtime::integrity_test(object const*, object const*, object const**, object const**) + 260 at /Users/ddd/Work/calc/db48x/src/runtime.cc:185
   182          return false;
   183
   184      for (object_p *s = stack; s < stackEnd; s++)
-> 185          if (!*s || (*s)->type() >= object::NUM_IDS)
                                 ^
   186              return false;
   187
   188      return true;
(lldb) p s
p s
(object_p *) $0 = 0x00000001091d1e68
(lldb) p *s
p *s
(object_p) $1 = 0xf7475546c088da7e
(lldb) p stack
p stack
(object_p *) $2 = 0x00000001091d1e68
(lldb) p stackEnd
p stackEnd
(object_p *) $3 = 0x00000001091d1e80

In the quorem function, rs is 502. The rg pointer is 0x1091d1d90:

(lldb) p rs
(size_t) $5 = 502
(lldb) x rg.safe
0x1091d1d90: 8a aa 32 53 bd b3 b9 52 2b e0 ad 52 e1 53 d2 59  ..2S...R+..R.S.Y
0x1091d1da0: 04 f8 f9 f2 71 d2 53 4e 13 87 f9 b1 f8 c0 91 86  ....q.SN........

and the distance to the stack is only 216.

(lldb) p (byte *)$0-(byte *)rg.safe
(long) $9 = 216

Clearly, that does not fit. So my rg pointer scribbled all over the stack.

It looks like the computation of needed was wrong in that case:

(lldb) p needed
p needed
(size_t) $10 = 317

The size of x and y are 2 and 315 respectively:

(lldb) p xs
(size_t) $11 = 2
(lldb) p ys
(size_t) $12 = 315

So why is the remainder so big?

c3d commented 1 year ago

Given the test that we are running, i.e. 997 divided by successive integers, the expectation would be that the denominator (xg) would be big, not the numerator. But here the denominator is 2 bytes long and the numerator is 315. That's unexpected.

c3d commented 1 year ago

The problem was in the loop computing the remainder of the long division

                 for (uint ri = 0; ri < rs; ri++)
                 {
                     c = remainder[ri] - x[ri] - c;
                     remainder[ri] = byte(c);
                     c >>= 8;
                 }

There were two problems with this loop:

  1. When we enter the loop, it's possible that the remainder is one byte larger than x, because of the carry. It's simpler to allocate one extra byte and consider x[ri] as zero beyond xs.

  2. As written, the carry is negative, so we need to take the opposite.