ruricolist / serapeum

Utilities beyond Alexandria
MIT License
420 stars 41 forks source link

Add STATIC-LET and STATIC-LET* #108

Closed phoe closed 2 years ago

phoe commented 2 years ago

Closes #107.

TODO:

phoe commented 2 years ago

OK, Travis tells me all is good.

What kind of multithreaded tests do we want for this?

phoe commented 2 years ago

Some basic tests are done. Is this the API that we are aiming for?

I'm asking particularly in context of the possibility of creating a binding that we will then not be able to flush via a flush-static-bindings called with no flushing group (meaning that it should flush everything). Do we want to allow this, or would we rather have a possibility to nuke all such values from the orbit even if they haven't been made by us (e.g. when dumping an image)?

phoe commented 2 years ago

I've added separate toggles for whether a binding should be flushable (not on by default; only meant to be used if the contents of the binding are meant to survive image dumping) and, if yes, in which group a binding should be flushed.

The tests should now reflect all of the API other than for multithreading (which I have no idea how to properly test).

ruricolist commented 2 years ago

This looks good overall. Some concerns, in descending order of importance.

I wouldn't worry about testing synchronization. This is simple enough that all we would really be doing is testing the Lisp implementation.

phoe commented 2 years ago
phoe commented 2 years ago

All done - everything except for three docstrings should be reviewable. (And I have no ability to write prose left in me at the moment, apologies; it'll need to wait a little bit before I make some decent documentation for those.)

phoe commented 2 years ago

One more question: if e.g. :in is specified, should we guard against evaluating it multiple times? It would probably complicate the code a little bit more. Yes - fixed.

ruricolist commented 2 years ago

I'm satisfied with this. I'll merge it as soon as there's documentation.

phoe commented 2 years ago

Docstrings now ready for review.

@temporal How's this looking for a "production-grade" implementation? :D

phoe commented 2 years ago

One minor refactor made to make the code a bit easier to understand when reading top-down.

TeMPOraL commented 2 years ago

I'm not experienced enough to speak about the synchronization code; beyond that, this looks solid and very much production-grade to me :). I appreciate the code clarity and the amount of documentation, both in docstrings and inline comments.

ruricolist commented 2 years ago

Two of the rewritten receive tests are failing on ECL. There seems to be an inconsistency in behavior between lambdas compiled inline and lambdas created through compile+eval. This may be a bug in ECL itself.

phoe commented 2 years ago

Oops. I will restore the receive rests and resubmit them in a separate PR. Done, #109 was opened.

phoe commented 2 years ago

I found one catch when using this style of static bindings. In particular, it is possible to execute rug pulls by uninitializing a binding while it is still in scope.

(static-let ((x 42 :in 'foo :type integer))
  (flush-static-binding-group 'foo :are-you-sure-p t)
  x ;; no longer an integer, but NIL! we are in UB zone
  )

Should this wart be in any way documented? Should we try to prevent that from happening via some dynavar-based mechanism?

I have an idea for implementing without consing on the heap, in linear time when it comes to flushing a single group and in constant time when it comes to flushing all groups.

I can add it if it's worth it (and I think it is, in order to avoid UB). We can additionally avoid inserting such checks in code compiled with zero safety.

ruricolist commented 2 years ago

Yes, that's nasty, let's avoid that.

My first thought is to redefine the the flush- functions as macros and macrolet them in static-let so they signal errors on expansion. But if you have a better plan go ahead.

phoe commented 2 years ago

My first thought is to redefine the the flush- functions as macros and macrolet them in static-let so they signal errors on expansion.

That'll fail the moment we cross the lexical boundary. See e.g. this:

(defun foo ()
  (flush-static-binding-group 'foo :are-you-sure-p t))

(static-let ((x 42 :in 'foo :type integer))
  (foo)
  x ;; no longer an integer, but NIL! we are in UB zone
  )

I want to do something similar that I do in phoe/amb - see *started-ambs*. Inside each static-let/static-let*, we will rebind a dynamic variable by consing the binding groups of all flushable bindings (sans duplicates) on top of that list.

Once that's done:

In both cases, the errors may be continuable, just so the programmer can choose to be smarter than our code.

As for optimizations, we can use list* instead of adjoin in order to not need to search for duplicates, that'll give us constant time complexity at binding time. We can allocate the conses via dynamic-extent to avoid consing on the heap.

Also, we probably don't need to do any of this if :type is undeclared, because in the worst case we should get a type error of a NIL being passed somewhere it shouldn't be.

How does it sound? If it's OK, I'll work on this as soon as have a while - so likely Wednesday.

phoe commented 2 years ago

Found a while today morning and implemented the basic solution, added two tests for testing that in lexical and dynamic scope.

phoe commented 2 years ago

Ah shit, here we go again.

STATIC-LET* has a shadowing issue because currently all bindings are made before they are initialized.

CL-USER> (static-let ((x 42))
           (static-let ((x x))
             x))
42

CL-USER> (static-let* ((x 42))
           (static-let* ((x x))
             x))
NIL

To properly fix this, I might actually need to implement it in terms of static-let and do some declaration munging to make it happen. Done. Thanks for the earlier pointer regarding partition-declarations!

phoe commented 2 years ago

OK, I think it's as final as it can be now. :D

(I also have a literate-programming-centric article related to static-let that I want to publish once this is merged. The link for now is here if anyone would like to make a pre-release review; after publishing, it'll be moved to the main branch.)

phoe commented 2 years ago

Is there anything left for me to do at the moment, or should I wait for a free while for reviewing&merging from your side?

ruricolist commented 2 years ago

There have been enough changes I want to go over it one more time before I hit merge.

phoe commented 2 years ago

OK - agreed, and thanks a lot.

ruricolist commented 2 years ago

I've merged this (although GitHub doesn't realize it's merged, alas) with a few small changes. The only one that changes the behavior is that the package used as the default group is the macroexpansion time package, not the run-time package. I've also added static checks that the bindings established by static-let can't be declared dynamic-extent or mutated with setf.

phoe commented 2 years ago

Thanks!

I have mixed feelings about the setf part, since it feels a bit like an artificial limitation. (And trivial to work around, just stuff the "real" object in a cons cell and mutate its car.)

Reasoning: in C, it's possible to mutate a static local variable and replace it altogether with a new object, so the same should - in theory - be possible with Lisp static bindings. Example use cases - replacing one array object with another, at which point the previous object can become garbage-collectible.

ruricolist commented 2 years ago

I'm concerned about the interaction between static-let and other macros. In particular the case where person X writes a macro that creates a binding with static-let and person Y, who's using the macro, doesn't understand how the binding is different from a regular binding.

phoe commented 2 years ago

I don't think there is much to worry about in this case from your position as a Serapeum maintainer:

I think it would be more lispy in spirit to empower and trust the programmer, rather than try to prevent them from doing things they think they have a valid use case for. And IMO if people find use cases for writing to static local variables in C, they're gonna find those for CL.

ruricolist commented 2 years ago

Those are good points. I'll give it more thought, but I'd prefer to leave them immutable for the moment since I can make them mutable anytime but going back and making them immutable later would be very difficult.

On Sat, Jan 29, 2022, 12:35 PM phoe @.***> wrote:

I don't think there is much to worry about in this case from your position as a Serapeum maintainer:

  • From the multiprocessing point of view, writing to a static binding is essentially writing to a structure slot, which should be atomic on all implementations;
  • If a programmer creates a macro that wraps static-let and fails to document that the binding it creates is going to be static/shared between invocations, I don't think it's the responsibility of Serapeum to try and fix that issue for them;
  • Semantics of writing to a non-constant load-time-value object are well-defined in standard CL so there's no conformance issue,
  • (static-bind ((#1=#:x (list (make-large-object)))) (symbol-macrolet ((x (car #1#))) (setf x (make-another-large-object)))) easily defeats the current setf safeguard.

I think it would be more lispy in spirit to empower and trust the programmer, rather than try to prevent them from doing things they think they have a valid use case for. And IMO if people find use cases for writing to static local variables in C, they're gonna find those for CL.

— Reply to this email directly, view it on GitHub https://github.com/ruricolist/serapeum/pull/108#issuecomment-1024964091, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAC7AKFPBHVP4AME54NMF6TUYQXORANCNFSM5MSQULQA . You are receiving this because you modified the open/close state.Message ID: @.***>

phoe commented 2 years ago

OK, that's reasonable. I've opened #112 to keep the topic open for now.

Thanks for reviewing and merging! (I completely forgot about dynamic-extent myself, that declaration makes no sense for a static binding of indefinite extent.)