Clozure / ccl

Clozure Common Lisp
http://ccl.clozure.com
Apache License 2.0

Implement package local nicknames in CCL [$260] #111

Closed pfdietz closed 5 years ago

pfdietz commented 6 years ago

Three Common Lisp implementations (SBCL, ECL, and ABCL) have adopted an extension to defpackage, :local-nicknames. This extension helps solve the serious problem of package (and, in particular, nickname) collisions between different CL libraries. If CCL (and, perhaps, CLISP) adopts this extension as well, then the systems on Quicklisp can be encouraged to use it and avoid most of the collisions that currently plague them.

The syntax is:

(:local-nicknames (nickname package) ...)

where nickname is a designator for a nickname, and package a designator for a package name.

When *package* is bound to a package that has local nicknames, the reader recognizes the local nicknames in that package as mapping to the corresponding actual packages. The printer is allowed to use local nicknames when *package* allows it to, but is not required to (it may use real package names instead).
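As a minimal sketch of the reader behavior described above, the following defines two packages and reads the same token under different bindings of *package*. All package names, the SHOUT symbol, and the READ-IN-PACKAGE helper are hypothetical, and the code assumes an implementation that already supports :local-nicknames (for example SBCL).

```lisp
;; Hypothetical library package with a long name.
(defpackage :example.text-utilities
  (:use :cl)
  (:export #:shout))

;; Hypothetical client package. Only while *PACKAGE* is EXAMPLE.APP
;; does the prefix TU: name EXAMPLE.TEXT-UTILITIES.
(defpackage :example.app
  (:use :cl)
  (:local-nicknames (:tu :example.text-utilities)))

(defun read-in-package (string package)
  "Read one form from STRING with *PACKAGE* bound to PACKAGE."
  (let ((*package* (find-package package)))
    (read-from-string string)))

;; In a source file the same effect is normally obtained with
;; (in-package :example.app) followed by code that writes tu:shout.
```

Reading "tu:shout" with *package* bound to EXAMPLE.APP should yield the symbol exported by EXAMPLE.TEXT-UTILITIES, while TU is not visible as a global package name.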

xrme commented 6 years ago

see also #67

Shinmera commented 5 years ago

For what it's worth, Clasp also supports this extension now, so this is (finally) growing into a cross-implementation extension.

Shinmera commented 5 years ago

Also note that there is a small, additional protocol to query and manipulate the local nicknames. See the SBCL manual.
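For illustration, here is a short sketch of that protocol as SBCL exposes it in the SB-EXT package. It runs only on SBCL; the package names and the special variables holding the results are hypothetical.

```lisp
(defpackage :example.target (:use :cl))
(defpackage :example.host (:use :cl))

;; Make TGT a local nickname for EXAMPLE.TARGET inside EXAMPLE.HOST.
(sb-ext:add-package-local-nickname :tgt :example.target :example.host)

;; Alist of (nickname-string . package) for EXAMPLE.HOST.
(defparameter *nicknames-after-add*
  (sb-ext:package-local-nicknames :example.host))

;; Packages that have a local nickname pointing at EXAMPLE.TARGET.
(defparameter *users-after-add*
  (sb-ext:package-locally-nicknamed-by-list :example.target))

;; Remove the nickname again and record the now-empty alist.
(sb-ext:remove-package-local-nickname :tgt :example.host)

(defparameter *nicknames-after-remove*
  (sb-ext:package-local-nicknames :example.host))
```

Other implementations may export the same function names from their own extension packages; that should be checked against each implementation's documentation.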

phoe commented 5 years ago

22:00 < jackdaniel> for interested folks: files to tweak: lib/macros.lisp (defpackage), level-1/l1-symhash.lisp (%define-package), level-1/l1-reader.lisp (#\: reader macro)

phoe commented 5 years ago

I will first create a test suite that tests the package local nicknames functionality across a variety of implementations to ensure that there are no differences in interpreting the intent and that we have cross-implementation consistency.

Then, based on this established interface, I will attempt to implement this in CCL.

https://github.com/phoe/package-local-nicknames-tests

phoe commented 5 years ago

Some bad news, folks.

With the introduction of package-local nicknames, (intern "FOO" :bar) may now intern in completely different packages based on the state of the package system at runtime, since it depends on whose local nickname :BAR is at the moment.
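The following sketch shows the effect described above: the same INTERN call lands in different packages depending on the binding of *package*. The package names and the INTERN-FOO-VIA helper are hypothetical, and the code assumes an implementation with :local-nicknames support.

```lisp
(defpackage :example.bar-one (:use))
(defpackage :example.bar-two (:use))

;; Two contexts that both use the local nickname BAR, for different targets.
(defpackage :example.context-one
  (:use :cl)
  (:local-nicknames (:bar :example.bar-one)))

(defpackage :example.context-two
  (:use :cl)
  (:local-nicknames (:bar :example.bar-two)))

(defun intern-foo-via (designator current-package)
  "Intern FOO via DESIGNATOR while *PACKAGE* is CURRENT-PACKAGE.
Returns the home package of the resulting symbol."
  (let ((*package* (find-package current-package)))
    (symbol-package (intern "FOO" designator))))
```

Calling (intern-foo-via :bar :example.context-one) and (intern-foo-via :bar :example.context-two) should return two different packages, which is why a lookup resolved once at compile time can no longer be trusted.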

And this seems to completely break CCL's compile-time package lookup optimization.

I got a confirmation from someone that SBCL basically has to do package lookup at runtime, for example to make do-symbols and friends work.

Everyone: do we want to make package lookup slower in CCL in exchange for implementing PLNs?

I'm a very bad person to answer this question, since 1) I have little to no experience in CCL, 2) I do not know the full history behind package lookup optimization, 3) I'm incentivized by the bounty I'm headed for.

xrme commented 5 years ago

@gzacharias do you have any feedback on this?

phoe commented 5 years ago

I have worked around this.

The optimizer compiler macros affected by this now expand into the following form:

(if (package-local-nicknames *package*)
  (slow-call)
  (fast-call))

Where (fast-call) is what the macro would otherwise output and slow-call is a full function call.
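A simplified sketch of this guard pattern, applied to a user-defined function rather than to CCL's real optimizers. LOOKUP-PACKAGE and LOCAL-NICKNAMES-OF are hypothetical names, and LOAD-TIME-VALUE stands in for CCL's package-ref machinery, so this is an illustration of the shape of the expansion and not the actual patch.

```lisp
(defun local-nicknames-of (package)
  "Return PACKAGE's local nicknames, or NIL where the extension is unavailable."
  #+sbcl (sb-ext:package-local-nicknames package)
  #-sbcl (progn package nil))

(defun lookup-package (name)
  "Slow path: an ordinary runtime lookup."
  (find-package name))

(define-compiler-macro lookup-package (&whole form name)
  (if (stringp name)
      ;; Constant name: emit the guard. The NOTINLINE declaration keeps the
      ;; compiler macro from being applied again to the slow-path call.
      `(if (local-nicknames-of *package*)
           (locally (declare (notinline lookup-package)) ,form)
           (load-time-value (find-package ,name)))
      ;; Non-constant name: decline to optimize.
      form))
```

The guard costs one extra call and one test on every lookup, which is the overhead discussed later in the thread.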

I think this is the best we can do while making it possible to use PLNs.

pfdietz commented 5 years ago

That optimization was always suspect, because of DELETE-PACKAGE.

Code that really wants to be fast can portably write (intern string #.(find-package :bar))

EDIT: or, rather, (intern string (load-time-value (find-package :bar)))
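A small sketch of that idiom wrapped in a function. INTERN-KEYWORD is a hypothetical name; the KEYWORD package is used only because it always exists.

```lisp
(defun intern-keyword (string)
  "Intern STRING in the KEYWORD package.
The package object is resolved once, when the code is loaded,
instead of on every call."
  (intern string (load-time-value (find-package "KEYWORD"))))
```

The cached package object goes stale if the package is later deleted and redefined, so this form suits packages that live for the whole session.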

phoe commented 5 years ago

I am running into build issues at #185, but other than that, I have made some progress with my fork of CCL at https://github.com/phoe-trash/ccl

Hotpatching CCL via loading code from https://plaster.tymoon.eu/view/1140#1140 and then running the test suite from https://github.com/phoe/package-local-nicknames-tests gives me the following results:

;; TEST-PACKAGE-LOCAL-NICKNAMES-NICKNAME-REMOVAL-READD-ANOTHER-SYMBOL-PRINTING:
;;;; Failed assertion: (EQUAL "L:CONS" (PRIN1-TO-STRING CONS0))
;; TEST-PACKAGE-LOCAL-NICKNAMES-SYMBOL-PRINTING:
;;;; Failed assertion: (EQUAL "L:CONS" (PRIN1-TO-STRING CONS0))
;; TEST-PACKAGE-LOCAL-NICKNAMES-NICKNAME-REMOVAL-REMAINING:
;;;; Failed assertion: (EQUAL +SYM-FULLNICKNAME+ (PRIN1-TO-STRING EXIT0))
;; TEST-PACKAGE-LOCAL-NICKNAMES-NICKNAME-REMOVAL-READD-ANOTHER-SYMBOL-EQUALITY:
;;;; Failed assertion: (NOT (EQ 'CONS CONS0))
;;
;; 16 tests run, 4 failures.

The first three are related to the CCL printer (which I have not yet modified). The last one is something related to deleting/readding symbols that I will need to investigate, since it is a correctness issue in the code that I have already modified.

phoe commented 5 years ago

I am having CCL bootstrapping issues described at https://github.com/Clozure/ccl/issues/186.

phoe commented 5 years ago

I have fixed the bootstrapping issues at #186. I am now at three failures out of 16, all three of which are related to the not-yet-done symbol printer. I am working on this now.

phoe commented 5 years ago

My PR is ready for review.

https://github.com/Clozure/ccl/pull/188

pfdietz commented 5 years ago

I do not build CCL myself, so could someone who does confirm that this works?

Thank you, Michał!

phoe commented 5 years ago

@pfdietz Added "check if this builds for anyone else" to the TODO list of the PR.

svspire commented 5 years ago

I vaguely recall an issue with symbol interning and lookup that was giving the ACL2 folks trouble; they needed symbol lookup to be as fast as possible because they created a lot of symbols at runtime. I hope this change isn't a reversion for them.

phoe commented 5 years ago

@svspire The optimizers still work and package lookups are still cacheable as long as someone does not use package-local nicknames. Therefore, legacy code should still be able to make use of optimized package lookups.

However, these optimizers are now less efficient. The change I have made introduces the additional cost of one function call, one hash table lookup, and one NIL-comparison per package lookup.

Do you have any contact with those people? We/they should measure the impact of this change on their software. If the impact is nonetheless considerable, then package-local nicknames should become optional in CCL, perhaps toggleable via a compile-time flag.

phoe commented 5 years ago

@svspire dlowe on Freenode also suggested one optimization for package lookup that is orthogonal to package caching. It can likely be implemented in CCL. I have added it as a TODO comment in code in 50135c8c and I am reposting it here for visibility.

;;; TODO: Optimize package lookup.
;;; According to dlowe from freenode, we do not need to use STRING= in PKG-ARG or the explicit
;;; AREF loop in %FIND-PKG. We can consider all nicknames to be names of symbols interned in an
;;; internal package. This means that, after interning the symbol name, we will be able to avoid
;;; comparing symbol name lengths and contents and instead use the package system to immediately
;;; fetch the proper symbol (and therefore its name). This should considerably speed up package
;;; lookup and possibly mitigate the performance hit incurred by the introduction of package-local
;;; nicknames.
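
One possible reading of the TODO above, as a rough sketch: keep a private package whose symbols are the known package names and nicknames, and store the target package in each symbol's value cell. The registry package and both function names are hypothetical; this is not how CCL implements lookup.

```lisp
;; Private registry: one symbol per known package name or nickname.
(defvar *name-registry*
  (or (find-package "EXAMPLE.NAME-REGISTRY")
      (make-package "EXAMPLE.NAME-REGISTRY" :use '())))

(defun register-package-name (name package)
  "Record that NAME designates PACKAGE."
  (setf (symbol-value (intern (string name) *name-registry*)) package))

(defun lookup-package-name (name)
  "Return the package registered under NAME, or NIL.
The symbol table does the string matching, so no explicit STRING= loop is needed."
  (multiple-value-bind (symbol found)
      (find-symbol (string name) *name-registry*)
    (if (and found (boundp symbol))
        (symbol-value symbol)
        nil)))
```

Whether this is actually faster than the existing AREF loop would have to be measured.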

svspire commented 5 years ago

Warren Hunt is the ACL2 project leader I believe (hunt@cs.utexas.edu). If you contact him, he'll be able to connect you with the appropriate member of the team to do a benchmark test.

phoe commented 5 years ago

@svspire Thanks - sent him a mail. I'll keep this thread updated as soon as I know anything new.

phoe commented 5 years ago

@svspire I've done my preliminary testing using ACL2.

Good news: package-local nicknames have not affected ACL2 test execution - the results of 1.12.dev and 1.12.dev with PLNs are very similar.

Bad news: 1.12.dev has performance regressions compared to 1.11.5, even without my patches.

Worse news: 1.12.dev is buggy. In ACL2, make certify-books returns a nonzero exit status.

CCL 1.11.5:
real    193m55,838s
user    679m1,862s
sys     8m42,055s

CCL 1.12.dev without PLNs:
real    226m8,535s
user    792m5,427s
sys     9m28,350s

CCL 1.12.dev with PLNs:
real    225m41,707s
user    796m12,919s
sys     9m26,029s

I will re-run the ACL2 tests with logging to see where exactly the errors have happened.

I think this is worth a separate ticket - I'll create it as soon as I have the results.

xrme commented 5 years ago

When testing ACL2, be sure you are running a CCL built from the very tip of master.

If there was a problem with ACL2, I would expect @MattKaufmann to tell me so.

phoe commented 5 years ago

This is my error, then! I have been testing the bootstrapping binary.

I will test the tip of master tomorrow.

phoe commented 5 years ago

OK, the test has run overnight and completed successfully. Thanks for the pointer, @xrme.

CCL v1.12-dev.4-4-gd9740256
real    217m48,620s
user    766m15,539s
sys     8m44,711s

So we have 193m55,838s for CCL 1.11.5, 217m48,620s for current master, and 225m41,707s for current master with package-local nicknames. This means that adding PLNs has made the time about eight minutes worse than the master baseline, but also that current master is about 24 minutes worse than CCL 1.11.5.
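The differences can be checked directly from the three reported real times. The helper and variable names below are illustrative only; the numbers are the ones quoted in this thread.

```lisp
(defun duration-ms (minutes seconds milliseconds)
  "Convert a `time`-style reading such as 217m48,620s to milliseconds."
  (+ (* minutes 60 1000) (* seconds 1000) milliseconds))

(defparameter *release-ms* (duration-ms 193 55 838)) ; CCL 1.11.5
(defparameter *master-ms*  (duration-ms 217 48 620)) ; current master
(defparameter *pln-ms*     (duration-ms 225 41 707)) ; master with PLNs

(defun minutes-between (later earlier)
  "Difference between two millisecond readings, in minutes."
  (/ (- later earlier) 60000.0))

(defun percent-slower (later earlier)
  "How much slower LATER is than EARLIER, as a percentage of EARLIER."
  (* 100.0 (/ (- later earlier) earlier)))

;; (minutes-between *master-ms* *release-ms*) is about 23.9
;; (minutes-between *pln-ms* *master-ms*)     is about 7.9
;; (percent-slower *pln-ms* *master-ms*)      is about 3.6
```

These are single runs, so the figures indicate the size of the gaps, not a precise measurement.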

phoe commented 5 years ago

One more test with PLNs:

v1.12-dev.4-15-g082ef8e9
real    224m50,577s
user    786m4,429s
sys     8m56,998s

So, yes, my patch has several minutes' worth of impact on ACL2 certify-books.

MattKaufmann commented 5 years ago

Regarding:

If there was a problem with ACL2, I would expect @MattKaufmann to tell me so.

In fact, thorough testing is done before any changes to ACL2 or its libraries ("community books") go into github.

phoe commented 5 years ago

@MattKaufmann Thanks for confirming. I don't know whether you test ACL2 with the latest versions of CCL, though, as I'm a stranger to your testing processes.

MattKaufmann commented 5 years ago

@phoe Good point. No, we don't keep CCL up-to-date for our ACL2 testing (it's updated only on occasion).

phoe commented 5 years ago

@MattKaufmann It might be worthwhile - I see there is a performance regression between CCL 1.11.5 and the current master.

phoe commented 5 years ago

@MattKaufmann Another note: with implementations that have a stable and frequent release cycle, such as SBCL, you could perhaps perform a set of basic tests (correctness + performance) once per release. With implementations like CCL, which make official releases rarely and are instead built from bleeding-edge sources, I think you'd need to come up with your own ACL2 testing schedule.

MattKaufmann commented 5 years ago

@phoe @xrme The slowdowns are concerning, but I know nothing about CCL maintenance (hence I don't know how to act on the slowdowns). Perhaps the CCL maintainers would consider testing with a specific version of ACL2 but updated CCL versions, to catch CCL changes that slow down ACL2.

To see which specific tests ("book certifications") slowed down between two ACL2 regression tests, there are instructions near the end of section "Regression testing" in:

http://www.cs.utexas.edu/users/moore/acl2/manuals/current/manual/index.html?topic=ACL2____DEVELOPERS-GUIDE-MAINTENANCE

MattKaufmann commented 5 years ago

@phoe One more thing: I've noticed over the years that timings on my Mac vary wildly, but timings on Linux boxes (at least, the ones I use) seem much more reliable. When an ACL2 change slows it down by even 1% on Linux, I'm concerned; but I don't pay much attention to slowdowns on my Mac, which I think can run as high as 15% or maybe even 20%.

phoe commented 5 years ago

As a way to avoid runtime slowdown for ACL2 and other applications that do not use package-local nicknames, should PLNs be added to CCL as a compile-time build option?

pfdietz commented 5 years ago

I want to understand what CCL is doing here, in this optimization. Is it taking calls that look like:

(intern str "FOO")

and, in effect, compiling them like

(intern str (load-time-value (find-package "FOO")))

If so, an application like ACL2 could just do that itself, couldn't it? And there'd be no need for a standard-violating optimization. And this optimization does violate the standard, since it's wrong if someone later does (delete-package "FOO") and then defines the package again.

(Or is the optimization smarter, and keeps a reference to the package in a variable that gets updated when the package is redefined?)

EDIT: ok, I see. There's a map from names/nicknames to package-refs, in which a link to the corresponding package objects is dynamically maintained. Suggestion: augment this structure with a bit indicating that the string is involved in a PLN somewhere, and abort back to slow lookup in that case.
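A rough sketch of that suggestion, using a stand-in structure. NAME-REF and its slots are hypothetical and much simpler than CCL's real package-ref; the lazy caching is an assumption made to keep the example short.

```lisp
(defstruct (name-ref (:constructor make-name-ref (name)))
  name                ; package name or nickname, as a string
  (pkg nil)           ; cached global package, if already resolved
  (pln-involved nil)) ; the "bit": true when NAME is a local nickname anywhere

(defun name-ref-lookup (ref)
  "Resolve REF, using the cache only when no PLN can interfere."
  (if (name-ref-pln-involved ref)
      ;; Slow path: the answer depends on the current *PACKAGE*.
      (find-package (name-ref-name ref))
      ;; Fast path: reuse the cached package, filling the cache on first use.
      (or (name-ref-pkg ref)
          (setf (name-ref-pkg ref)
                (find-package (name-ref-name ref))))))
```

Setting the bit would be the job of whatever code adds a local nickname; that part, and cache invalidation on package deletion, are left out here.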

MattKaufmann commented 5 years ago

@pfdietz I agree that "ACL2 could just do that itself" to some extent: it has such a primitive, intern$, but also ACL2 is kind of stuck with supporting calls to intern on five specific package names. Here is further explanation:

ACL2 provides its own read-eval-print loop, which interprets its supported subset of Common Lisp; but also, ACL2 ultimately invokes the Common Lisp compiler, so it cannot give alternate interpretations to Common Lisp primitives. One such primitive supported by ACL2 is intern, but only when the second argument is "ACL2", "COMMON-LISP", or one of three other strings that name packages. So I think the slowdown could be avoided if there were a way for me to tell ACL2 to make intern fast for those five package names.

ACL2 also provides a primitive, intern$, which is essentially intern but indeed does fast lookup to get a package from a package name. So another solution may seem to be for ACL2 to prohibit the use of intern, insisting on intern$ instead. But there are probably too many ACL2 applications out there at this point for that to be feasible: I counted 335 calls in the user-maintained libraries but there are perhaps many more in proprietary libraries.

pfdietz commented 5 years ago

Matt: I see now that the optimization is standard-conforming, so asking ACL2 to jump through hoops to maintain performance could be impolite. I think the suggestion at the end of my previous post could be good enough, as it would cause only minor slowdown (one extra test and branch) in the case of no PLNs.

MattKaufmann commented 5 years ago

@pfdietz I see; thanks for your help with this!

pfdietz commented 5 years ago

Another possibility: extend package-ref structures to include a pointer to the value of *package* when the lookup was performed, and check that the current *package* is the same. If not, do the slow lookup and cache it (and *package*).
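This second idea could look roughly like the sketch below. CONTEXT-REF and its slots are hypothetical stand-ins for an extended package-ref, and the sketch ignores both package deletion and the multithreading concerns that any real version would need to handle.

```lisp
(defstruct (context-ref (:constructor make-context-ref (name)))
  name           ; package name or nickname, as a string
  (pkg nil)      ; package found by the last slow lookup
  (context nil)) ; value of *PACKAGE* at the time of that lookup

(defun context-ref-lookup (ref)
  "Return the package for REF, reusing the cache while *PACKAGE* is unchanged."
  (if (and (context-ref-pkg ref)
           (eq (context-ref-context ref) *package*))
      (context-ref-pkg ref)
      ;; Cache miss: do the slow lookup, then remember result and context.
      (let ((pkg (find-package (context-ref-name ref))))
        (setf (context-ref-pkg ref) pkg
              (context-ref-context ref) *package*)
        pkg)))
```

The cache would also go stale if local nicknames are added to or removed from the same *package* between calls, so a real version needs an invalidation hook there as well.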

phoe commented 5 years ago

I do not think I am competent enough in CCL to perform these optimizations straight away.

Also, as much as I consider these discussions to be important, should they not be moved to a separate issue? I consider the discussions of (a) implementing package-local nicknames in CCL and (b) optimizing package lookup with package-local nicknames to be interconnected but distinct.

pfdietz commented 5 years ago

The question is: will CCL maintainers consider this acceptable to merge to master if it causes a performance hit for ACL2? I'd be ok with it (and consider that merge the trigger for the bounty) but would they?

Alternately, it can be put up on a branch, and others (me?) can try to add one of those optimizations.

phoe commented 5 years ago

Sidenote: I'll create another ticket related to performance regressions between CCL 1.11.5 and current master.

MattKaufmann commented 5 years ago

As I understand it, the addition of package-local nicknames slowed down ACL2 from 217m48,620s to 225m41,707s, which is an additional 3.6%. I very much hope that this slowdown will be eliminated, or nearly so, before incorporating this into CCL, as the ACL2 community relies on CCL and performance is important.

phoe commented 5 years ago

That is correct, and I hope the same. I do not yet know how to eliminate that overhead, but @pfdietz seems to have some strategy in mind.

xrme commented 5 years ago

I'm very hesitant to accept a change that causes that much of a slowdown in ACL2.

MattKaufmann commented 5 years ago

@xrme Thank you! A compile-time build option (mentioned by @phoe as something maybe to consider) could be fine for ACL2 (we just wouldn't use it). Also, as I mentioned earlier, I wonder if repeated testing would show less slowdown. Anyhow, I really appreciate the concern for ACL2 performance using CCL.

phoe commented 5 years ago

@xrme Does CCL have any kind of build-time conditionalizing at the moment? If yes, I will be able to hook into that mechanism with package-local nicknames; if not, it will have to be created.

phoe commented 5 years ago

Just in case I have screwed up badly somewhere, I am now testing ACL2 with CCL with compiler-macros for package-related functions turned off. In case I have completely broken these macros, I should get a result consistent with the result for PLNs that I have posted up above.

pfdietz commented 5 years ago

I've created a ticket over on SBCL's bug tracker to implement CCL's package lookup optimization. Because SBCL supports PLNs, the ticket is to use the first implementation idea I mentioned above.

https://bugs.launchpad.net/sbcl/+bug/1814924

Dougk implemented a less impactful optimization.

All of these need to worry about what happens with multithreading.

phoe commented 5 years ago

Wait a second. Am I interpreting these times correctly at all?

This is the diff I have made:

diff --git a/compiler/optimizers.lisp b/compiler/optimizers.lisp
index f5197584..4557baef 100644
--- a/compiler/optimizers.lisp
+++ b/compiler/optimizers.lisp

-(define-compiler-macro intern (&whole w string &optional package &environment env)
-  (let* ((ref (package-ref-form package env)))
-    (if (or ref
-            (setq ref (and (consp package)
-                           (eq (car package) 'find-package)
-                           (consp (cdr package))
-                           (null (cddr package))
-                           (package-ref-form (cadr package) env))))
-      `(if (package-%local-nicknames *package*)
-         (locally (declare (notinline intern)) ,w)
-         (%pkg-ref-intern ,string ,ref))
-      w)))
-
-(define-compiler-macro find-symbol (&whole w string &optional package &environment env)
-  (let* ((ref (package-ref-form package env)))
-    (if (or ref
-            (setq ref (and (consp package)
-                           (eq (car package) 'find-package)
-                           (consp (cdr package))
-                           (null (cddr package))
-                           (package-ref-form (cadr package) env))))
-      `(if (package-%local-nicknames *package*)
-         (locally (declare (notinline find-symbol)) ,w)
-         (%pkg-ref-find-symbol ,string ,ref))
-      w)))
-
-(define-compiler-macro find-package (&whole w package &environment env)
-  (let* ((ref (package-ref-form package env)))
-    (if ref
-      `(if (package-%local-nicknames *package*)
-         (locally (declare (notinline find-package)) ,w)
-         (package-ref.pkg ,ref))
-      w)))
-
-(define-compiler-macro pkg-arg (&whole w package &optional allow-deleted errorp &environment env)
-  (declare (ignore errorp))
-  (let* ((ref (unless allow-deleted (package-ref-form package env))))
-    (if ref
-      (let* ((r (gensym)))
-        `(if (package-%local-nicknames *package*)
-           (locally (declare (notinline pkg-arg)) ,w)
-           (let* ((,r ,ref))
-             (or (package-ref.pkg ,ref)
-                 (%kernel-restart $xnopkg (package-ref.name ,r))))))
-      w)))
+;; (define-compiler-macro intern (&whole w string &optional package &environment env)
+;;   (let* ((ref (package-ref-form package env)))
+;;     (if (or ref
+;;             (setq ref (and (consp package)
+;;                            (eq (car package) 'find-package)
+;;                            (consp (cdr package))
+;;                            (null (cddr package))
+;;                            (package-ref-form (cadr package) env))))
+;;       `(if (package-%local-nicknames *package*)
+;;          (locally (declare (notinline intern)) ,w)
+;;          (%pkg-ref-intern ,string ,ref))
+;;       w)))
+
+;; (define-compiler-macro find-symbol (&whole w string &optional package &environment env)
+;;   (let* ((ref (package-ref-form package env)))
+;;     (if (or ref
+;;             (setq ref (and (consp package)
+;;                            (eq (car package) 'find-package)
+;;                            (consp (cdr package))
+;;                            (null (cddr package))
+;;                            (package-ref-form (cadr package) env))))
+;;       `(if (package-%local-nicknames *package*)
+;;          (locally (declare (notinline find-symbol)) ,w)
+;;          (%pkg-ref-find-symbol ,string ,ref))
+;;       w)))
+
+;; (define-compiler-macro find-package (&whole w package &environment env)
+;;   (let* ((ref (package-ref-form package env)))
+;;     (if ref
+;;       `(if (package-%local-nicknames *package*)
+;;          (locally (declare (notinline find-package)) ,w)
+;;          (package-ref.pkg ,ref))
+;;       w)))
+
+;; (define-compiler-macro pkg-arg (&whole w package &optional allow-deleted errorp &environment env)
+;;   (declare (ignore errorp))
+;;   (let* ((ref (unless allow-deleted (package-ref-form package env))))
+;;     (if ref
+;;       (let* ((r (gensym)))
+;;         `(if (package-%local-nicknames *package*)
+;;            (locally (declare (notinline pkg-arg)) ,w)
+;;            (let* ((,r ,ref))
+;;              (or (package-ref.pkg ,ref)
+;;                  (%kernel-restart $xnopkg (package-ref.name ,r))))))
+;;       w)))

And this is the time:

CCL 1.12 master with compiler macros turned off
real    215m23,514s
user    760m46,450s
sys     8m44,299s

So, I have commented out the compiler macros for intern, find-symbol, find-package, and ccl::pkg-arg, rebuilt CCL, and run ACL2's certify-books. And the time I got with these macros commented out is very similar to the time I got with these macros not commented out.

I do not know how to interpret these results, and the long time of book certification is making it hard for me to repeat those tests. I need a better way to measure the speed of package accesses in CCL.
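One cheaper way to get a first signal might be a micro-benchmark along these lines. COUNT-LOOKUPS is a hypothetical helper. Because the name is passed as a variable, it exercises the ordinary runtime lookup; the compiler macros in the diff above appear to apply only when the package argument is known at compile time, so measuring that path would need a variant with a literal name.

```lisp
(defun count-lookups (name iterations)
  "Call FIND-PACKAGE on NAME ITERATIONS times.
Returns the number of successful lookups and the elapsed real time in seconds."
  (let ((hits 0)
        (start (get-internal-real-time)))
    (dotimes (i iterations)
      (declare (ignorable i))
      ;; Counting hits keeps the call from being discarded as dead code.
      (when (find-package name)
        (incf hits)))
    (values hits
            (/ (- (get-internal-real-time) start)
               internal-time-units-per-second))))
```

Running it with a few million iterations on builds with and without the patch, several times each, would show whether the difference stands out from run-to-run noise.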

MattKaufmann commented 5 years ago

@phoe I'm assuming that you rebuilt ACL2 on top of the modified CCL. Did you also check for failures? The following should have no output:

fgrep -a '**' <make-certify-book-log-file>

You can do a much faster test in the top-level ACL2 directory, by the way, by running the make target certify-books-short instead of certify-books.