clasp-developers / clasp

clasp Common Lisp environment
https://clasp-developers.github.io/
2.58k stars 145 forks source link

Weird behavior of compiled code with cl-ppcre #1183

Closed Uthar closed 2 years ago

Uthar commented 3 years ago

Using clasp https://github.com/clasp-developers/clasp/commit/23bf6aa3dcba5f8687cd22946f3a06e195552ce3 , I'm getting such behavior when using cl-ppcre:

COMMON-LISP-USER> (ql:quickload :cl-ppcre)
To load "cl-ppcre":
  Load 1 ASDF system:
    cl-ppcre
; Loading "cl-ppcre"

(:CL-PPCRE)
COMMON-LISP-USER> (ppcre:regex-replace-all "a" "xxxxxaaaaaxxxxx" "_")

"xxxxxaaaaaxxxxx"
NIL
COMMON-LISP-USER> (load "~/quicklisp/dists/quicklisp/software/cl-ppcre-20190521-git/api.lisp")

T
COMMON-LISP-USER> (ppcre:regex-replace-all "a" "xxxxxaaaaaxxxxx" "_")

"xxxxx_____xxxxx"
T
COMMON-LISP-USER> (load "~/.cache/common-lisp/clasp-cclasp-boehm--cst-linux-x64/home/kpg/quicklisp/dists/quicklisp/software/cl-ppcre-20190521-git/api.fasp")

T
COMMON-LISP-USER> (ppcre:regex-replace-all "a" "xxxxxaaaaaxxxxx" "_")

"xxxxxaaaaaxxxxx"
NIL
COMMON-LISP-USER> (load "~/quicklisp/dists/quicklisp/software/cl-ppcre-20190521-git/api.lisp")

T
COMMON-LISP-USER> (ppcre:regex-replace-all "a" "xxxxxaaaaaxxxxx" "_")

"xxxxx_____xxxxx"
T
COMMON-LISP-USER> (load "~/.cache/common-lisp/clasp-cclasp-boehm--cst-linux-x64/home/kpg/quicklisp/dists/quicklisp/software/cl-ppcre-20190521-git/api.faso")

T
COMMON-LISP-USER> (ppcre:regex-replace-all "a" "xxxxxaaaaaxxxxx" "_")

"xxxxxaaaaaxxxxx"
NIL
COMMON-LISP-USER> (load "~/quicklisp/dists/quicklisp/software/cl-ppcre-20190521-git/api.lisp")

T
COMMON-LISP-USER> (ppcre:regex-replace-all "a" "xxxxxaaaaaxxxxx" "_")

"xxxxx_____xxxxx"
T
COMMON-LISP-USER> 

I don't know, but on a first sight seems like the pre-compiled faso/fasp's are not consistent with creating the function with defun

This does not occur for me with clasp https://github.com/clasp-developers/clasp/commit/b14e329f49

Does anyone else get this too?

(Also: Not sure if this could be related, i made a small change to include/clasp/core/character.h by replacing "en_US.UTF-8" with "" - because otherwise the build was failing in a sandbox env. But the std::locale("") call happens on run-time anyway, right? For example, if I run LANG=foo clasp, then it crashes just like in the sandbox build.)

Uthar commented 3 years ago

tried commenting out all define-compiler-macro's from api.lisp , but it didn't help

kpoeck commented 3 years ago

have that as well

kpoeck commented 3 years ago

another example:

karstenpoeck@karsten-poecks-macbook-pro clasp-karsten % build/boehmprecise/iclasp-boehmprecise  
../../src/llvmo/llvmoExpose.cc:4927:operator() Adding ObjLinkingLayer plugin for orc::createJITLoaderGDBRegistrar
Starting cclasp-boehmprecise-0.4.4-767-g23bf6aa3d-cst ... loading image...
Writing jitted symbols to /tmp/perf-5625.map
Top level in: #<PROCESS TOP-LEVEL @0x1096e6631 (Running)>.
COMMON-LISP-USER> (time (progn (require :asdf) (load "~/quicklisp/setup.lisp")))
Time real(17.736 secs) run(17.735 secs) consed(309345736 bytes) interps(503) unwinds(112471)

T
COMMON-LISP-USER> (ql:quickload :cl-ppcre)
To load "cl-ppcre":
  Load 1 ASDF system:
    cl-ppcre
; Loading "cl-ppcre"

(:CL-PPCRE)
COMMON-LISP-USER> (ppcre:split ";" "a;b")

("a;b")
COMMON-LISP-USER> (load "~/quicklisp/dists/quicklisp/software/cl-ppcre-20190521-git/api.lisp")

T
COMMON-LISP-USER> (ppcre:split ";" "a;b")

("a" "b")
COMMON-LISP-USER> 
Bike commented 3 years ago

I can't reproduce with 8b749f632ee7dfd44b9d594997e60d696d4af72f. Maybe my #1181 fix also fixed this?

kpoeck commented 3 years ago

building now to test

Uthar commented 3 years ago

for me, both examples still reproduce with 8b749f6 (Linux x86_64)

kpoeck commented 3 years ago

still see the error even after a distclean

Bike commented 3 years ago

Still can't reproduce on Linux amd64. Did you wipe fasls? I did that and then (asdf:compile-system :cl-ppcre :force :all) to be sure, but i didn't have ppcre installed before 8b749f6 anyway.

kpoeck commented 3 years ago

I did delete .cache first thing, strange. Will try on linux as well

Uthar commented 3 years ago

Same, removed ~/quicklisp and ~/.cache. Ill try a fresh rebuild on a different machine

Uthar commented 3 years ago

Hmm, still same thing on another machine. Since i dont have this error with b14e329, ill start bisecting and report back

How exactly are you building? I do

git clone https://github.com/clasp-developers/clasp
cd clasp
git checkout 8b749f632ee7dfd44b9d594997e60d696d4af72f
./waf configure 
./waf build_cboehm

without creating wscript.config or anything.

Then, in lisp, with no ~/.cache/common-lisp and no ~/quicklisp

(load "quicklisp.lisp")
(quicklisp-quickstart:install)
(ql:quickload :cl-ppcre)
(ppcre:split ";" "a;b") => ("a;b")
yitzchak commented 3 years ago

Probably won't make any difference, but we are building with ./waf build_cboehmprecise now.

Bike commented 3 years ago

I am just building the main branch. with build_cboehm, even.

kpoeck commented 3 years ago

I do ./waf distclean configure build_dboehmprecise

When Bike updates cleavir, I believe it is safer with distclean , if I feel lucky i do ./waf update_dependencies before ./waf build_boehm

update_dependenciesfetches the new cleavir (and the other dependencies).

Will try with build_cboehm now

Uthar commented 3 years ago

Bisected it, commit dcaf594c1fe8158ac36e7fb634ad79b596911a75 is the last one that works for me, i.e. this merge 023e44b54df0363044ac127305d007e82ccef6ad introduces the bug. Sadly I don't have compiler knowledge to read the changes

What do you think if we built clasp in a container with all the libs and llvm? Only bind-mounting the source. That way any environment differences would be ruled out.

I've been playing around with making such a container but still getting some errors (bad compiler flags etc.)

kpoeck commented 3 years ago

Regarding docker:

Bike commented 3 years ago

Sadly I don't have compiler knowledge to read the changes

That commit moves up the Cleavir commit a few months. In that time I made diffuse changes including altering how the IR works. That can cause hard to trace bugs like this and #1181.

phantomics commented 2 years ago

Here are two more cl-ppcre examples that don't work correctly in Clasp:

(cl-ppcre:regex-replace-all "[¯]" "¯5" "-")
(cl-ppcre:split #\J "3r4J9r5")
Bike commented 2 years ago

Assuming the correct results are "-5" T and ("3r4" "9r5"), which are the SBCL results, I am still unable to see any problem in Clasp. This is mystifying. Could it be a locale thing? That's something I don't know much about. But on the other hand the last example is all ASCII.

phantomics commented 2 years ago

Yes, those are the correct results. Sounds like a strange bug, best of luck figuring it out. The ¯ character implements a ligature in some display modes, making a line over the next character, so I can see possible problems involving Unicode categories, but there should be no problem with J in any case.

Bike commented 2 years ago

Just to be clear, what results did you see from Clasp for your two examples?

phantomics commented 2 years ago

Copied from my REPL:

COMMON-LISP-USER> (cl-ppcre:regex-replace-all "[¯]" "¯5" "-")

"¯5"
NIL
COMMON-LISP-USER> (cl-ppcre:split #\J "3r4J9r5")

Condition of type: TYPE-ERROR
0 is not of type SEQUENCE.
Available restarts:
(use :r1 to invoke restart 1, etc.)

1. (RESTART-TOPLEVEL) Go back to Top-Level REPL.

(CORE::SIGNAL-TYPE-ERROR 0 SEQUENCE)
COMMON-LISP-USER>> :r1

COMMON-LISP-USER> 
Bike commented 2 years ago

Okay, thank you.

Based on some previous experience, I suppose what could be happening is that I have extra debugging options on in my build that mask weird behavior. I will take a look in that direction barring better ideas.

kpoeck commented 2 years ago

I did a lot of tests regarding this and get different results. Test case:

gives all kind of different results:

(defun regex-replace-all (regex target-string replacement &key
                                (start 0)
                                (end (length target-string))
                                preserve-case
                                simple-calls
                                (element-type #+:lispworks 'lw:simple-char #-:lispworks 'character))
  "Try to match TARGET-STRING between START and END against REGEX and
replace all matches with REPLACEMENT.  Two values are returned; the
modified string, and T if REGEX matched or NIL otherwise.

  REPLACEMENT can be a string which may contain the special substrings
\"\\&\" for the whole match, \"\\`\" for the part of TARGET-STRING
before the match, \"\\'\" for the part of TARGET-STRING after the
match, \"\\N\" or \"\\{N}\" for the Nth register where N is a positive
integer.

  REPLACEMENT can also be a function designator in which case the
match will be replaced with the result of calling the function
designated by REPLACEMENT with the arguments TARGET-STRING, START,
END, MATCH-START, MATCH-END, REG-STARTS, and REG-ENDS. (REG-STARTS and
REG-ENDS are arrays holding the start and end positions of matched
registers or NIL - the meaning of the other arguments should be
obvious.)

  Finally, REPLACEMENT can be a list where each element is a string,
one of the symbols :MATCH, :BEFORE-MATCH, or :AFTER-MATCH -
corresponding to \"\\&\", \"\\`\", and \"\\'\" above -, an integer N -
representing register (1+ N) -, or a function designator.

  If PRESERVE-CASE is true, the replacement will try to preserve the
case (all upper case, all lower case, or capitalized) of the
match. The result will always be a fresh string, even if REGEX doesn't
match.

  ELEMENT-TYPE is the element type of the resulting string."
  (declare #.*standard-optimize-settings*)
  (let ((pos-list '())
        (reg-list '()))
    (do-scans (match-start match-end reg-starts reg-ends regex target-string
                           nil
                           :start start :end end)
      (push match-start pos-list)
      (push match-end pos-list)
      (push reg-starts reg-list)
      (push reg-ends reg-list))
    (when (and end (not (numberp end)))
      (error "the earth exploded  ~a" end))
    (if pos-list
      (values (replace-aux target-string replacement
                           (nreverse pos-list)
                           (nreverse reg-list)
                           start end preserve-case
                           simple-calls element-type)
              t)
      (values (subseq target-string start end)
              nil))))
Bike commented 2 years ago

I have an idea for why I'm not seeing this. Could somebody who has seen this problem please (declaim (optimize debug)), then force recompile of ppcre, then see if the problem still occurs?

Uthar commented 2 years ago

For me, (declaim (optimize debug)) before a (asdf:load-system :cl-ppcre :force t) makes all the examples from above work as expected

Bike commented 2 years ago

I see, thank you. I just noticed while working on something else that I'm configured to have (debug 3), which turns off some optimizations that could be causing this problem. I will look into it.

kpoeck commented 2 years ago

also works here with (debug 3). Will try your fixes from yesterday

Bike commented 2 years ago

Unfortunately I am still not seeing the problem with (debug 1). kpoeck, what are you seeing?

context: a few days ago in 4b29954269c32debc5829538448332ea384504b3 i fixed a bug that would have resulted in invalid memory reads in some code using nonlocal exits. that could result in all kinds of weird incorrect values, and hypothetically could be causing ppcre failures of this kind.

kpoeck commented 2 years ago

i still see the problem with clasp default settings ((COMPILATION-SPEED 1) (DEBUG 1) (SPACE 1) (SPEED 1) (SAFETY 1)) as well

Bike commented 2 years ago

Do you have any DEBUG_OPTIONS in your wscript.config, and if you do, what are they?

phantomics commented 2 years ago

I cloned the latest Clasp to try your solution. (declaim (optimize debug)) and then recompiling cl-ppcre solves the problem. However, when I built cl-ppcre without the debug mode in today's Clasp version, the problem persisted.

kpoeck commented 2 years ago

@Bike here my debug_options from wscript.conf

DEBUG_OPTIONS = [  "DEBUG_RELEASE"
                  ,"DEBUG_BCLASP_LISP"
                  ,"DEBUG_CCLASP_LISP"
                  ,"DEBUG_COMPILER"
                  ,"DEBUG_GUARD"
#                  ,"DEBUG_GUARD_VALIDATE"
                  ,"DEBUG_VERIFY_MODULES"
# other useful options for debugging
                  ,"USE_HUMAN_READABLE_BITCODE"
#                 ,"SOURCE_DEBUG"
#                 ,"DEBUG_SLOW",
                  ,"DEBUG_JIT_LOG_SYMBOLS"
                  ,"DEBUG_ASSERT_TYPE_CAST"
#                  ,"DEBUG_MONITOR"
#                  ,"DEBUG_MONITOR_SUPPORT"
# Other settings
#                 ,"CST"
#                 ,"DISABLE_CST"
#                 ,"DEBUG_LLVM_OPTIMIZATION_LEVEL_0"
# MPS debug settings
#                 ,"DEBUG_RECURSIVE_ALLOCATIONS"
#                 ,"DEBUG_MPS_SIZE"
#                 ,"DEBUG_ALLOC_ALIGNMENT"
                  ]
Bike commented 2 years ago

Got it to reproduce. Don't know why or how it's only happening now.

Reduced test case:

(defun test2 (list)
           (let (col)
             (block nil
               (tagbody
                  loop
                  (multiple-value-call
                      (lambda (val)
                        (print val)
                        (print col)
                        (unless val (return nil))
                        (push val col))
                    (pop list))
                  (go loop)))
             (print col)
             col))

(test2 '(1 2 3 4 5)) should return (5 4 3 2 1) but instead returns nil. based on the print statements, it looks like col is accumulated properly, but somehow the return sets it back to nil.

Bike commented 2 years ago

github's auto-closer thing is more optimistic than I am. Could somebody please try to reproduce the original regex problem with 9c216a5?

phantomics commented 2 years ago

I just pulled and compiled the latest. All the cl-ppcre tests now work correctly, as does the test2 function above. If no one else has problems it's safe to say it's fixed.

Uthar commented 2 years ago

Works for me too

Bike commented 2 years ago

fantastic.