Review usage of assert - Githubissues

lread commented 3 years ago

Raised me in rewrite-cljc-playground, see https://github.com/lread/rewrite-cljc-playground/issues/59

Much thanks to @sogaiu for bringing this up on Slack.

Rewrite-clj (and therefore rewrite-cljc) makes use of assert, mostly to validate node creation, and therefore also for general parsing. The most common assertion is that a node has the expected number of children.

@sogaiu highlighted the following interesting points (which I am paraphrasing, hopefully correctly):

the naive use of assert can lead to unintented results as described in "Use of Assertions" by John Regehr from which @sogaiu quoted:

if we are writing library code, we must be even more careful to use assertions correctly than if we are writing a standalone application. The only time to assert something in library code is when the code truly cannot continue executing without dire consequences.

@sogaiu also came across "Beware of Assertions" by Alexander Yakushev
@borkdude effectively disables these asserts for some of his projects, for example here's how he does so for clj-kondo, mostly for performance.

If we take point 1 to heart, we can argue that rewrite-cljc really shouldn't be making use of assert. It should not be deciding that the situation of encountering invalid Clojure is dire for the calling library/application.

Note that I did replace all explicit Exception throws with ex-info exceptions for rewrite-cljc, for cross-platform support between clj and cljs - so I do have a precedent for replacing other types of exceptions. But I am not sure what makes sense moving forward. Some initial ideas:

do nothing and document - this preserves backward compatibility
replace with explicit throws - this would allow for a catchable Exception but would break behavior compatibility for folks who want to disable checks for performance (or other) reasons.
replace with spec - but spec is not normally enabled in production, so backward compatibility behavior would be an opt-in, and depending on how far we go with spec, might more coarser grained that a caller would like?

lread commented 3 years ago

Issue comments from rewrite-cljc-playground:

@borkdude: Please don't replace with explicit throws.

Also please don't replace with spec, unless the specs are in a different namespace and optional to load.

@borkdude: Elaborating:

The checks in asserts are mostly useful for developers. They should be elidable at compile time for performance. For the same reason spec checking should be able to turned off in production environments.

It's nice to get some extra checking during development, but you should be able to turn that off when compiling a final artifact. E.g. linters and formatters should get the ultimate performance possible, since they run on every keystroke.

So I would really do nothing at this point, since people are relying on being able to turn asserts off.

@lread: Thanks @borkdude, much appreciated.

Question to self: if asserts are disabled is rewrite-clj able to parse invalid (or less valid) Clojure code? (Note that this is not necessarily a bad thing).

@borkdude: @lread Note that clj-kondo has been running without asserts for its entire lifetime. So far I haven't had any issues. I think the asserts are mostly there for users creating their own nodes. E.g. for debugging clj-kondo hooks code this could be useful feedback, but that's on clj-kondo to fix, since I've decided to turn the asserts off. In the current API there aren't any asserts that people would miss out on, though.

Asserts aren't bad, they are designed exactly for this purpose.

@lread: Very much related is the usage of :pre on some functions, for example, which are effectively assertions. Not suggesting we do anything, just taking notes.

@sogaiu: Ah, I didn't realize that about :pre -- thanks for sharing!

vemv commented 3 years ago

This one is biting me as I'm trying to cut a new refactor-nrepl release. See e.g. https://github.com/clojure-emacs/refactor-nrepl/issues/332

It'd be pretty bad to leave users with errors that can't quite be debugged.

*assert* (the dyn var) is OK-ish, assert (the macro) not so much because it doesn't tell us the value at fault.

It's fairly trivial to replace (assert) and :pre with a helper such that if the expectation fails, the value at fault is reported. There's https://github.com/ptaoussanis/truss for example.

Probably for refactor-nrepl I'd be OK with the performance hit of checking preconditions. I prefer correct code that can be debugged to opaque NPEs or such. Also for refactor-nrepl the bottleneck is tools.analyzer, typically it will trump any other cost.

You might want to define your own *assert* that consumers can customize at will, I guess that one size doesn't necessarily fit all.

Cheers - V

lread commented 3 years ago

Thanks for adding to the issue @vemv!

This is all a bit foggy so I shall recap and paraphrase:

A while ago @borkdude helped me to understand that he disables assertions for production use for performance reasons. Here's what he does for clj-kondo for example.
You are interested in leaving the existing assertions active but would appreciate better error reporting to help with diagnosis.

Since :pre is disabled by setting*assert* to false, and it will give you more context, I can carefully look into this. Although truss looks interesting (thanks for sharing!), I remain extremely conservative about bringing in any new dependencies into rewrite-clj.

vemv commented 3 years ago

Hey there!

disables assertions for production use for performance reasons.

The nuance is that *assert* is a global var and software like refactor-nrepl (and even clj-kondo, depending on how you use it) shares the JVM with the end-user environment.

So it's easy to couple things. End users may tweak *assert* at will for unrelated reasons. Other tools also consuming rewrite-clj can have a different expectation/preference as to whether check asserts.

I think that a minimalistic solution would look like this:

(defn foo [x]
  (when *assert*
    (or (string? x)
        (throw (ex-info "..." {:faulty-value x})))))

...It's not merely a hand-rolled :pre: it moves the *assert* querying from compile-time to runtime. That way, rewrite-clj consumers can meaningfully bind *assert* and get custom behavior from rewrite-clj.

(If going this direction I'd recommend defining your own *assert* var for further decoupling)

I can contribute a POC PR with this setup upgrading a few :pres to demonstrate this pattern. WDYT?

borkdude commented 3 years ago

Note that moving checks from compile time to runtime isn't exactly better for performance, especially not when deref-ing dynamic vars, but perhaps in the grand scheme of things it doesn't matter that much.

Perhaps just moving the checks to a rewrite-clj.specs namespace that you can optionally load / instrument if you want spec checking for these functions is the way to go in this day and age.

vemv commented 3 years ago

Adding spec to the mix seems to go against I remain extremely conservative about bringing in any new dependencies into rewrite-clj. With spec1/spec2 uncertainty, spec seems bit of delicate one especially as end users aren't fully in control.

Also spec tempts one to bring in Orchestra or such, most devs out there appreciate checking the return value etc. Which can further complicate things. tldr Spec is not exactly agnostic/minimalistic for this use case.

Note that moving checks from compile time to runtime isn't exactly better for performance

I was thinking of a 'double-checked' pattern i.e. first you check against clojure.core/*assert* at compile-time and then against your own *assert* at runtime. That way one elides any cost.

vemv commented 3 years ago

I was thinking of a 'double-checked' pattern i.e. first you check against clojure.core/assert at compile-time and then against your own assert at runtime. That way one elides any cost.

Using more precise wording, if clojure.core/*assert* is false at compile-time, you macroexpand to 'nothing'.

borkdude commented 3 years ago

Adding spec to the mix seems to go against I remain extremely conservative about bringing in any new dependencies into rewrite-clj.

I don't really agree with this one: spec is an already available dependency and loading the specs can (and imo should) be made optional for users by providing them in a separate namespace. Instrumentation can be chosen at a fine-grained level (per function even).

vemv commented 3 years ago

Well I'll leave this one to @lread however this thing you wrote here sums up well a common sentiment:

You wrote fdefs. But how are you going to test them?

i.e. instrumentation is bit of a frustrating experience, that can be tracked in various places (for example the Expound issue tracker comes to mind; integrating it with Orchestra is not trivial).

Used expertly, sure it should work. I'd simply be pessimistic about edge cases i.e. rewrite-clj can be used in unexpected ways (e.g. inlined via mranderson) and composability (namely: how different rewrite-clj consumers instrument rewrite-clj concurrently)

Whereas a custom check macro can be so simply reasoned about.

borkdude commented 3 years ago

The respeced library aids in (unit) testing your fdefs. It was specifically written to verify if the fdefs written in speculative were correct: will they throw on certain function calls, and will they allow other function calls? This is one level of testing higher than what we're talking about here.

I'm not necessarily a fan of everything spec does and looking forward to spec2, but meanwhile providing a foo.specs is a pretty standard / built-in way to solve this problem in Clojure libraries in 2021. E.g.: https://github.com/schmee/java-http-clj/blob/master/src/java_http_clj/specs.clj

Having said this, some custom code that doesn't incur any performance penalty when you won't want validation, could work.

vemv commented 3 years ago

I'm aware of the pattern however I'd be hesitant about instrumentation. Isn't it essentially a global side-effect? Whereas a binding would allow a per-consumer setting.

e.g. for the refactor-nrepl use case, we can bind e.g. [rewrite-clj/*assert* true] once at the middleware level (i.e. at the 'entry point' of execution, early and up) and rest assured that:

all refactor-nrepl internal defns are being invoked under said binding
we're not messing with libraries' or the user's preferences

If the exact same thing can be said of instrumentation, so much the better, LMK

borkdude commented 3 years ago

For me personally these extra checks are only useful during development and even then, mostly in development of rewrite-clj itself. Why is it important to turn this on in production code of refactor-nrepl? What action can the user take when something turns out to be wrong in the way that refactor-nrepl is using rewrite-clj? Moreover, I think these checks exist mostly as a development aid of rewrite-clj itself, since node construction isn't usually done "manually"?

vemv commented 3 years ago

:pres (or what have you) tremendously aide time-to-fix:

I don't have to ask too much to end users about the stacktrace they got and how they got it
- i.e. they got an informative one
I get fail-fast behavior
- i.e. I'm not inspecting rewrite-clj code deep in its stack, a :pre tells me exactly where things went wrong
I can create https://github.com/clj-commons/rewrite-clj/issues tickets more effectively, giving more info to the authors.

As mentioned, for refactor-nrepl the bottleneck is tools.analyzer so running these checks is affordable.

borkdude commented 3 years ago

@vemv

*assert* is enabled by default. People usually don't turn this off unless for performance reasons in prod.
the stacktrace you referred to happened because *assert* was enabled.
people usually use refactor-nrepl in development

Doesn't this all mean that everything already works quite the way it should for refactor-nrepl, as it is now? And if people did disable *assert* you could probably ask them to enable it to provide a better error message?

vemv commented 3 years ago

Things aren't working nicely right now because:

:pre is used without any extra mechanism informing of the value at fault
fixing that brings in the problem of how to provide good error reports optionally, without affecting other consumers.

the stacktrace you referred to happened because *assert* was enabled.

This is misleading, I would have gotten a stacktrace either way. :pre simply fails-fast vs a predictably-failing situation (which would have made me waste time squinting at it, and reporting a vague issue here)

borkdude commented 3 years ago

Yeah, I meant: the stack trace like you received it, with the assert failed message.

borkdude commented 3 years ago

Btw, I can do some perf tests with clj-kondo to see if lint time drastically reduces if I re-enable assert. Perhaps it doesn't make a huge difference. Perhaps it was overkill to disable it. Note: I've only disabled it in the native image build, not globally for library users, that would be a mistake.

vemv commented 3 years ago

tbh I don't feel comfortable either if making a 'client' happy will make others less happy i.e. I'm aware clj-kondo is editor-oriented so it makes sense for it to squeeze perf.

A POC would have to be 'non-breaking' in this regard

borkdude commented 3 years ago

@vemv I guess you can also demand/insist that users enable *assert* when they post error reports. I would be surprised if you would get a report in a refactor-nrepl error report where this wasn't already the case though.

vemv commented 3 years ago

Agreed. It's only part of the equation though

Summarizing my thinking, I'd propose a macro like this:

(def ^:dynamic *check* true)

(defmacro check [[f & args :as call]]
  {:pre [(symbol? f)
         (ns-resolve *ns* &env f)
         (not (-> (ns-resolve *ns* &env f) meta :macro)) ;; ensure `apply` will work
         (list? call)]}
  (when *assert*
    `(when *check*
       (let [as# ~(vec args)]
         (when-not (apply ~f as#)
           (throw (ex-info "Expectation failed"
                           {:call (apply list ~(list 'quote f) as#)})))))))

(defn foo [a]
  (check (pos? a))
  (+ a a))

(foo 1) ;; OK

(foo -1) ;; Fails and tells you that -1 was the value at fault

(binding [*check* false]
  (foo -1)) ;; -2

clojure.core/*assert* keeps working so clj-kondo has to change nothing in its uberjar
different, concurrent consumers can use their prefered setting (via binding) without stepping on each other or with the user's intent
- this is helpful when using clj-kondo via a vanilla JVM, if we want to speed up its execution
A (very) hypothetical user can set clojure.core/*assert* false and that would be honored
- I don't expect that to happen / I expect such users to be knowledgeable enough to also repro a given bug with *assert* true

borkdude commented 3 years ago

I think given the last:

A (very) hypothetical user can set clojure.core/assert false and that would be honored I don't expect that to happen / I expect such users to be knowledgeable enough to also repro a given bug with assert true

that doing nothing is perhaps still a valid option and leave the code as is. If people disable assert, it's on them to enable it for better error messages.

As I said, I might do some perf checks in clj-kondo to verify if disabling these checks was really worth it, I might not have done it with good reasons.

borkdude commented 3 years ago

Also I don't think introducing dynamic var derefs at runtime is an improvement (in terms of performance) over what we have right now. Then again, this may not be so critical as when you're developing a library like malli or so.

borkdude commented 3 years ago

Data to back up my opinion:

user=> (def ^:dynamic *check* false)
#'user/*check*
user=> (time (binding [*check* true] (dotimes [i 1000000] (when *check* (qualified-symbol? 'dude))))
)
"Elapsed time: 66.994102 msecs"
nil
user=> (time (binding [*check* true] (dotimes [i 1000000] (when true (qualified-symbol? 'dude))))
)
"Elapsed time: 9.295819 msecs"

The deref of the dynamic var itself is way more expensive than executing the actual predicate.

lread commented 3 years ago

Wow, I never expected this issue to generate such a full and interesting discussion!

vemv commented 3 years ago

Worth emphasizing the compile-time when *assert* before the runtime when *check*. For the most perf-sensitive use case (graalvm clj-kondo), the penalty would not be hit at all.

Still, it's a good observation and perhaps we can use an alternative. InheritableThreadLocals come to mind.

vemv commented 3 years ago

that doing nothing is perhaps still a valid option and leave the code as is. If people disable assert, it's on them to enable it for better error messages.

Remember that vanilla :pre doesn't inform of the value at fault, which can very plausibly improve time-to-fix.

borkdude commented 3 years ago

I don't want to incur any extra perf hits when I decide to re-enable *assert* preferably.

Some data with assert enabled (./clj-kondo) and disabled (clj-kondo).

$ time ./clj-kondo --lint $(clojure -Spath) > /tmp/output.txt
./clj-kondo --lint $(clojure -Spath) > /tmp/output.txt   4.89s  user 0.48s system 97% cpu 5.488 total
$ time clj-kondo --lint $(clojure -Spath) > /tmp/output.txt
clj-kondo --lint $(clojure -Spath) > /tmp/output.txt   4.89s  user 0.48s system 96% cpu 5.576 total

Practically no difference, so I'm probably just removing the assert false in my native-image! I probably just disabled it in the hope it would be faster, but it's not really noticeable and I did so without good reason.

So, I propose either one of these:

Keep using assert at compile time. People who disable it, know what they are doing and can turn it back on for better error reports.
Make checks non-optional by not wrapping them in assert at all

Since clj-kondo won't be disabling *assert* anymore, I also propose not adding any runtime penalties.

borkdude commented 3 years ago

https://github.com/clj-kondo/clj-kondo/commit/014aa3c5c2f4fb11af9313be3dd83d1f037b7d50

vemv commented 3 years ago

I disagree with your reasoning, which can be summarized as:

I don't perceive a performance hit with *assert* now
- i.e. you're not considering that third-party code is always subject to change
- specifically, it can add more, or heavier asserts.
because I enabled *assert*, now things have to be done in a way that doesn't impact me.

It's kind of a circular reasoning.

https://github.com/clj-kondo/clj-kondo/commit/014aa3c5c2f4fb11af9313be3dd83d1f037b7d50 introduces a design constraint almost on purpose. You are on time to undo that 🍻

borkdude commented 3 years ago

I think most people use this library with the default which is *assert* true. If this library would make drastic performance regressions (which have to be measured from release to release) then this would impact all default usage and would likely not go unnoticed in projects like clojure-lsp (@ericdallo) which I don't think have disabled asserts.

If that situation would arise (hypothetically) then would perhaps be a good time to review what should be done, but that's not the case right now as far as I know.

So I think we're discussing only a hypothetical problem and not a real pressing issue.

ericdallo commented 3 years ago

clojure-lsp certainly doesn't disable *assert*, I didn't even know about that var until now 😅 Can that cause performance issues?

lread commented 3 years ago

Maybe time to recap and summarize what actual problem we are trying to solve for users of the rewrite-clj library?

@vemv I think your original idea was to include more context so that when assertions were triggered they could be more easily diagnosed?

An important note: The rewrite-clj assertions we are talking about were carried forward from rewrite-clj v0. I have done no review of them and not added in any more (so coverage may be quite inconsistent).

vemv commented 3 years ago

@borkdude

Precisely in face of that type of considerations I've repeatedly hinted how it is important to decouple one's would-be*assert* from clojure.core/*assert*

Going back to this snippet https://github.com/clj-commons/rewrite-clj/issues/95#issuecomment-922345109 , we can change the default of *check* to false.

It would be set to true in rewrite-clj's test suite, and for any consumer that desired these checks. That allows to add more checks with full confidence that performance won't regress.

(the original :pres would be 'lost' for third-party consumers by default. It's a tradeoff to consider)

borkdude commented 3 years ago

I think I would rather have non-optional cheap asserts than adding a dynamic var deref around it which makes it 6x slower.

vemv commented 3 years ago

I think I would rather have non-optional cheap asserts than adding a dynamic var deref around it which makes it 6x slower.

I wrote:

Still, it's a good observation and perhaps we can use an alternative. InheritableThreadLocals come to mind.

Please let's not talk over each other.

vemv commented 3 years ago

Maybe time to recap and summarize what actual problem we are trying to solve for users of the rewrite-clj library?

:pres are a good thing, could be used more and in all cases include context. This way I can receive, and forward accurate error reports, which save debugging time for everyone involved.

Achieving so should not affect performance by default or introduce breakage, complexities etc.

clj-commons / rewrite-clj

Review usage of assert #95