technoblogy / ulisp-arm

A version of the Lisp programming language for ARM-based boards.
http://www.ulisp.com/
MIT License
97 stars 30 forks source link

GC bug? #28

Closed rmu75 closed 3 years ago

rmu75 commented 3 years ago

I'm not 100% sure, but I think I found a potential problem around https://github.com/technoblogy/ulisp-arm/blob/master/ulisp-arm.ino#L4916.

If a GC is triggered in the val in that line, very strange things happen. The "form" is put into the freelist, and because some pointers to it are held, part of the freelist ends up in the args. I'm not 100% sure this applies to the stock version, but on my heavily hacked-on test I can trigger it with running this http://www.ulisp.com/list?21S2

technoblogy commented 3 years ago

Thanks for reporting this. If you can reproduce the same behaviour with the release version I will definitely investigate it. Alternatively, email me your modified version and I'll have a look at it.

rmu75 commented 3 years ago

I will try to reproduce it, but at the moment I don't have any arduino-supported hardware nearby.

In my version (a port to some self-cooked microcontroller framework which will eventually get it's own github repository), the fix is to push() the form onto the GCStack before the eval + appropriate pop()s.

technoblogy commented 3 years ago

I had those sorts of problems with the GC when developing uLisp, but have never seen any problems like that with the release version. If you can provide some more information I'd be interested.

rmu75 commented 3 years ago

Ok, I managed to reproduce it with the release version on a STM nucleo L476RG with the following defines:

#elif defined (ARDUINO_NUCLEO_L476RG)
  #define WORKSPACESIZE 2000              /* Objects (8*bytes) */
  #define SYMBOLTABLESIZE 512             /* Bytes */
  #define CODESIZE 64                     /* Bytes */
  #define STACKDIFF 320
  #define Serial1 Serial

and LispLibrary as

      "(defun sq (x) (* x x))\n"
      "(defun tak (x y z)\n"
      "  (if (not (< y x))\n"
      "    z\n"
      "    (tak\n"
      "     (tak (1- x) y z)\n"
      "     (tak (1- y) z x)\n"
      "     (tak (1- z) x y))))\n"
      "(defun fib (n)\n"
      "  (if (< n 3) 1\n"
      "    (+ (fib (- n 1)) (fib (- n 2)))))\n"
      "\n"
      "(defvar ers 0)\n"
// rest from http://www.ulisp.com/list?21S2

Result:

uLisp 3.3 
Error: 'mapcan' result is not a proper list: 4
1780> 

The type of error and where it happens depends on available free space.

rmu75 commented 3 years ago

I forgot to verify my proposed fix, will follow up in a couple of hours.

rmu75 commented 3 years ago

Just another FYI: it doesn't happen when feeding the testsuite on the REPL because the REPL garbage collects after each entered form, so the gc is never called form https://github.com/technoblogy/ulisp-arm/blob/master/ulisp-arm.ino#L4834

technoblogy commented 3 years ago

So in your first post, when you said you could trigger the problem with the test suite, had you commented out the GC in the REPL?

rmu75 commented 3 years ago

No, i put the testsuite as big string in code and ran it like the library.

rmu75 commented 3 years ago

I will make a fork with a complete self-contained test-case and a PR with the fix in the evening (local time, 8 hours or so).

technoblogy commented 3 years ago

Great! This is really helpful.

rmu75 commented 3 years ago

sorry, my comment https://github.com/technoblogy/ulisp-arm/issues/28#issuecomment-722034734 is wrong, the mapcan-thing is something else.

this finally demonstrates the bug: https://github.com/rmu75/ulisp-arm/tree/gc-bug-demonstration, it results in my test in this:

; ...... many lines starting with (aeq omitted ......
(aeq (quote cdadr) (quote nil) (cdadr nil))
(nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil)
Error: 'aeq' too many arguments

This fixes it i believe: https://github.com/rmu75/ulisp-arm/tree/gc-bug-fix. It results also in a strange result:

(aeq (quote read-from-string) 123 (read-from-string "123"))
un
Error: undefined: un

but I have no clue how that can happen, and I can't debug this in the arduino IDE.

PR has to wait.

technoblogy commented 3 years ago

Thanks - I see what you mean:

....
(aeq (quote cdadr) (quote nil) (cdadr nil))
(nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil)
Error: 'aeq' too many arguments

What I'm puzzled about is why I haven't noticed problems like this before, running other uLisp programs. Anyway, I'll investigate...

rmu75 commented 3 years ago

I think the circumstances to trigger this are very specific. I didn't think it through completely, but i suppose it can only happen in this kind of "load a library" context when a GC is triggered at the wrong moment.

technoblogy commented 3 years ago

OK, I think I've worked out the explanation for both problems:

The first problem is caused by the GC garbage-collecting the list created from the input line during loadfromlibrary(). In the REPL I do:

    push(line, GCStack);
...
    pop(GCStack);

around the call to eval(line, env), and I should do that in loadfromlibrary(). The fix is to change this definition to:

void loadfromlibrary (object *env) {
  GlobalStringIndex = 0;
  object *line = read(glibrary);
  while (line != NULL) {
    push(line, GCStack);
    eval(line, env);
    pop(GCStack);
    line = read(glibrary);
  }
}

The second problem is caused by the fact that read-from-string and loadfromlibrary() use the same string-reading mechanism, and in particular GlobalStringIndex, as I hadn't anticipated that they would be used at the same time, and I wanted to minimise the RAM usage on smaller platforms. I'm not sure what the best solution to this is.

Thank you very much for finding these. I'll fix the first one in the next release, and think about what to do about the second.

rmu75 commented 3 years ago

Thank you for the fix and µlisp. I think I have found some additional "problems", is it preferred to open issues here or to discuss this on the forum first?

technoblogy commented 3 years ago

If it's a simple bug, report it here. If it needs discussion I think the forum is better. Thanks!