emacs-elsa / Elsa

Emacs Lisp Static Analyzer and gradual type system.
GNU General Public License v3.0
640 stars 26 forks source link

Add check for assignment to defconst #152

Closed fmdkdd closed 5 years ago

fmdkdd commented 5 years ago

Hey there! Nice little project. I have been dabbling with a type checker for Elisp myself, but it looks like it may be more useful to contribute to this one instead. I'll be overjoyed when it typechecks flycheck.el, so I can sleep more easily at night ;)

To get a feel for it, I've tackled #150.

In foo.el:

(defconst foo 1)
(setq foo 1)

cask exec elsa foo.el:

analyzer: updating variable foo, old type nil
foo.el:2:6:error:Variable foo expects Const 1, got Const 1
foo.el:2:6:warning:Assignment to defconst foo.

Note that the stray type error. To quell it, I think we need to modify elsa--analyse:setq and move the check for defconst assignment there. Currently I've added a rule to the 'variables' ruleset. Which way do you think is best?

Fuco1 commented 5 years ago

Actually the definition of the type relation should make Const 1 accept exactly just Const 1:

(cl-defmethod elsa-type-accept ((this elsa-const-type) other)
  (and (elsa-const-type-p other)
       (elsa-type-equivalent-p (oref this type) (oref other type))
       (equal (oref this value) (oref other value))))

There's probably a bug someplace else.

As for the defconst, the Const type is more general than just defconst. (let ((a 1))) will also type a as Const 1 although this might be too agressive (i.e. a might be an iteration variable so we want to permit Int).

In fact (defvar foo 1) and (setq foo 1) (where foo is undeclared) and also (defconst foo 1) are all analyzed the same way: in other words any constant literal in the code is typed as Const.

There are two options I can see:

  1. store a flag on the name as we have with type ((get name 'elsa-type-var)) but something like elsa-is-const... We should also write some API above the get and put as that can get tricky to handle later.
  2. make only defconst use the Const type and in other places of assignment use the type of the constant expression instead, so it allows for assignment of the same type. This is probably what we want 99% of the time and in the remaining case users can always typehint the property as constant.

I think option 2 is better but requires more work... basically the assignment analysis everywhere needs to be changed (probably can be abstracted into some helper function).

In the future I'd also like to have something akin to javascripts const var = ... which would allow write-once-only initialization in let forms for example.

Fuco1 commented 5 years ago

Just to elaborate a little bit, Const type is not the same as readonly variable. We can have a type Const foo | Const bar which allows only the symbols 'foo and 'bar to be used as values. But we can freely assign those two to a varible with this type.

So better than the abovementioned const would be typescript's readonly on properties. A readonly (Const foo | Const bar) can be assigned 'foo or 'bar but only once.

fmdkdd commented 5 years ago

If I understand correctly, you want #150 to be resolved via the type checker rather than as a separate rule (as I did).

That's probably a better idea in the long run, as individual rules are harder to maintain than a generic type checking algorithm. The type checker can also take care of more situations than a single rule can.

the definition of the type relation should make Const 1 accept exactly just Const 1:

I was suprised by this as well, I'll try to investigate. But even if Const 1 accepts Const 1, we still want to warn on assignment to a defconst. Otherwise, we should use a defvar. Is this right?

In that case, the type of foo in (defconst foo) should indeed be annotated with a flag indicating that no assignment can take place. Then we can reuse that flag for other immutable constructs (like a const var).

As for the defconst, the Const type is more general than just defconst.

Yes, I saw that in the code. My understanding is that Const type is akin to value types, so it does not mean constant, but rather it's the most precise type you can assign to that value. Though as you note, it may not be helpful to always infer a Const type:

(let ((a 1))) will also type a as Const 1 although this might be too agressive (i.e. a might be an iteration variable so we want to permit Int).

I think value types are useful when used as type annotations. So the user can voluntarily restrict a type to discrete values. When inferring it may cause more trouble than worth. It seems we are in agreement here, based on:

This is probably what we want 99% of the time and in the remaining case users can always typehint the property as constant.

On that note, I haven't groked the full type algebra you are going for though. Is there a document for that somewhere? Or is it implicitly defined by the test suite?

A readonly (Const foo | Const bar) can be assigned 'foo or 'bar but only once.

Why not make it mean "never write to it" instead? When typechcking (defconst foo V), you assign the type readonly Const V and you are done. Even if defconst is not a special form, we can special case it (and we are already doing so, just delegating the analysis to defvar). Why would you want one assignment? Is it because the name definition and assignment happen at different times in the analysis?

readonly can also be made available as an annotation, though that's a separate concern.

Fuco1 commented 5 years ago

I was suprised by this as well, I'll try to investigate. But even if Const 1 accepts Const 1, we still want to warn on assignment to a defconst. Otherwise, we should use a defvar. Is this right?

Yep, which is why I think we actually can't solve this on the type level and the rule you wrote, or a similar one, will be necessary.

In that case, the type of foo in (defconst foo) should indeed be annotated with a flag indicating that no assignment can take place. Then we can reuse that flag for other immutable constructs (like a const var).

Yes exactly. I think a readonly flag for assignments will be useful in the future so I'd probably go with that.

My understanding is that Const type is akin to value types

Maybe we should rename it to value types. That makes equal (or more) sense and removes the const/readonly ambiguity. There's a #74 proposal so we can do that along with this change.

you assign the type readonly Const V and you are done

I haven't thought of it this way! We could probably indeed bake the flag into the type itself. So the readonly decorator would simply break typecheck when accepting anything, while when being the asigned value we would only look at the wrapped type. I like this.

I wanted to allow one assignment because yes, in theory you can have initialization later, but this probably makes no sense. ISTR there was some language where this made sense, but probably porting it to elisp is non-sensical. I guess what I ment was simply that (x marked readonly) (setq x 1) is one assignment and the only one we allow. So internally it maybe makes more sense to track the number of assignments... which I want to do anyway to detect unassigned/unused variables (#39, #82)

Fuco1 commented 5 years ago

As for the type algebra, the WIP document is here: https://github.com/emacs-elsa/Elsa/pull/96 I'm still slowly finalizing it. I'm mostly modelling this with typescript/flow in mind as JS is surprisingly similar to lisp when it comes to being dynamic language with "random objects" (i.e. lists in elisp and objects in javascript) so many problems are basically 1:1 parallels.

fmdkdd commented 5 years ago

ISTR there was some language where this made sense, but probably porting it to elisp is non-sensical.

In Rust (and I guess TypeScript) it makes sense to do:

let a;
if (...) { a = ... } else { a = ... }  // OK
a = 2; // FAIL

where a is in fact immutable, so the typechecker ensures there is only ever one assignment to it, even though it can happen after declaration.

I can't think of an example in Elisp where you would need that though.

As for the type algebra, the WIP document is here: #96. I'm mostly modelling this with typescript/flow

Great, I'll definitely take a look. I agree that TS/Flow are good systems to take from, since Elisp has the same "anything goes" vibe. TS also makes a good argument that unsound type systems can still be useful. I have taken a deep look at the Flow paper a few months back and implemented the type system in Rust, but I was not particularly enlightened by the approach.

So internally it maybe makes more sense to track the number of assignments... which I want to do anyway to detect unassigned/unused variables

Indeed, we may reuse that information if we want to have it anyway. Both ways are reasonable; what is more practical for the implementation might win.

fmdkdd commented 5 years ago

I've updated the PR:

  1. Add the elsa-readonly-type, which can wrap any type.
  2. Change elsa--analyse:defconst to return a Readonly type around the result of elsa--analyse:defvar.
  3. Change elsa--analyse:setq to error on assignment to Readonly type.
  4. Remove the check-defconst-assignment rule, as it was redundant.

I've updated the failing tests, and added a new one. Note that for one test I had to use :to-equal for the types instead of :to-be-type-equivalent, because elsa-type-accept right now will always return nil for Readonly types, since we can never assign to them.

I'm a bit surprised that an explicit type annotation will totally override the inference done by the analysis. I guess it's helpful while things are still in flux, but on the long run we may want to check that annotations are at least somewhat compatible with the inferred type. Otherwise, explicit annotations may be an easy way to introduce bugs. Here for instance, annotating as (foo :: Bool) will dispose of the Readonly type, and thus the defconst can be assigned to without warnings.

Fuco1 commented 5 years ago

I'm a bit surprised that an explicit type annotation will totally override the inference done by the analysis. I guess it's helpful while things are still in flux,

Hm, I partially thought of the annotations as backdoor to help fix "crap" while the type system is properly finished and most of all until we have all the internal/builtin functions annotated (which number in thousands) without which the type inference is quite bad.

As for the defconst I'd be perfectly fine with reading the annotation, whatever it is, and wrapping it in a readonly.

We will also need to add a rule that allows assiging readonly types, I don't see that (but I'm also really tired :D).

I.e.

(defconst foo "string")

;; (bar :: String)
(defvar bar)

(setq bar foo)  ;; should type check "assignment of Readonly Const String to String", is good.
fmdkdd commented 5 years ago

We will also need to add a rule that allows assiging readonly types, I don't see that (but I'm also really tired :D).

Indeed! I added it in the top definition of elsa-type-accept, like how const is handled. At first I tried to use multiple dispatch with (cl-defmethod elsa-type-accept ((this elsa-type) (other elsa-readonly-type)) but it didn't go through. Any idea why? (Not that I think it's necessarily more maintainable... multiple dispatch can be tricky to debug).

Fuco1 commented 5 years ago

@fmdkdd It is actually pretty tricky when you do dispatch on more than the first argument, especially if there are also hierarchies of classes. I still think there's a couple bugs related to this. As for when I do it with the signatures and when I "dispatch" manually, it is somewhat emergent :D

I think but I'm not 100% sure that EIEIO matches the arguments in order, so if it ever specifies on the first argument it does not backtrack. Which might explain why your code didn't ever execute: it probably specified on something more specific than elsa-type in the first argument.

Fuco1 commented 5 years ago

I think this is good to go, if you don't have any outstanding ideas we can go ahead and merge this.

fmdkdd commented 5 years ago

so if it ever specifies on the first argument it does not backtrack.

I should just have RTFM, because apparently multiple dispatch is not supported:

Method dispatch EIEO does not support method dispatch for built-in types and multiple arguments types. In other words, method dispatch only looks at the first argument, and this one must be an EIEIO type.

I think this is good to go, if you don't have any outstanding ideas we can go ahead and merge this.

Sure, it's a good start! Looking forward to bang on the type system further, as I find the time. Thanks for taking the quick and thorough feedback.

Fuco1 commented 5 years ago

Method dispatch EIEO does not support method dispatch for built-in types and multiple arguments types. In other words, method dispatch only looks at the first argument, and this one must be an EIEIO type.

This seems highly improbable because at some places I rely on multiple dispatch. I'm so confused now. (look into the type helper.el file for the sum/diff combinators)