emacs-elsa / Elsa

Emacs Lisp Static Analyzer and gradual type system.
GNU General Public License v3.0
645 stars 27 forks source link

Lispify and de-Haskellize annotations? #74

Closed alphapapa closed 2 years ago

alphapapa commented 6 years ago

This is an exciting project!

It would be even more exciting if it didn't use Haskellisms in its annotation syntax. :) For example, instead of:

;; (elsa--arglist-to-arity :: List Symbol -> Cons Int (Int | Symbol))

It could be:

;; (elsa--arglist-to-arity ((list symbol)) (cons int (or int symbol)))

Some other examples:

;; (elsa-pluralize :: String -> Int -> String)
;; becomes
;; (elsa-pluralize (string int) (string))

;; (elsa--forward-sexp :: Int? -> Int)
;; becomes
;; (elsa--forward-sexp (&optional int) (int))
;; or
;; (elsa--forward-sexp (or int nil) (int))

I realize this would be a relatively major change, but I think it would be worth it, because it feels very strange (and completely unintuitive for a non-Haskeller) to use Haskell type annotation syntax for Emacs Lisp.

Fuco1 commented 6 years ago

Sexp annotations are really really ugly. We can support both syntaxes but I don't want to use it. I've considered it and decided against because I found it unreadable. This is heavily a matter of preference. I'm not saying no but we need broader discussion.

If we decide to change stuff it is not too late and most of the changes can be done with a simple macro so it's not that terrible.

Some questions regarding your proposal:

;; (elsa-pluralize (string int) (string))

Why is the last string in parens? One issue here I have is that parens means lists and people might confuse (string) with a list of strings as opposed to just a string. Saying List String makes it clear, though I have a shorthand sytnax [String] also.

;; (elsa--forward-sexp (&optional int) (int))

What if this function had two arguments, would I write (int (&optional int)) or (int &optional int)? I guess the latter.

Rewriting some more arcane types like

(elsa-make-type String -> String -> (String -> Bool) | Nil -> String | Nil | T)

yields

(string string (or (function string bool) nil)) (or string nil t)
(string string (&optional (function string bool))) (or string nil t)

assuming we use function to mark a function type. I find that quite unreadable.

How would you deal with generic types, such as a -> a -> [a] where a can be substituted for anything. Haskell deals with this using Upper = concrete type and lower = generic.

There needs to be some marker that the comment is an annotation. Right now it's the ::. We need to have something like that or put something in front of the list, maybe (&type elsa-pluralize ...).

Fuco1 commented 6 years ago

If you look into elsa-typed-builtin.el we also need to support this syntax in the elsa-make-type macro. There it does not look like an argument list and body because there is no function name, it is a stand-alone type. So the split in two lists is also a bit questionable, i.e. (elsa-make-type (int) int) makes a unary function taking int returning int.

Fuco1 commented 6 years ago

We will also need record types such as plists or alists or tuples, in Haskell this is done using a record syntax

data Person = Person { name :: String, age :: Int }

In Elsa my plan was not to have them named so you would either use Plist or Alist constructor with the record key :: type. Here it gets quite unlispy so I haven't implemented that yet. For example

Plist { :beg :: Int, :end :: Int, :op :: String, :cl :: String, :prefix :: String, :suffix :: String }

would be the structure of smartparens' expression. That looks weird!

alphapapa commented 6 years ago

Sexp annotations are really really ugly.

Sexps are ugly? =) Seriously, though, I think it's not easy to make type annotations nice, but I find the Haskell annotations very ugly, myself. The use of the same -> symbol to separate both each argument and the return type implies visually that they are semantically the same, when they are completely different things. It would seem sensible to use it to separate the arguments from the return type, because the arrow implies that the arguments lead to the return type, but each argument does not lead to the next argument, so an arrow makes no sense between them.

We can support both syntaxes but I don't want to use it. I've considered it and decided against because I found it unreadable. This is heavily a matter of preference. I'm not saying no but we need broader discussion.

Thanks for considering this idea. I don't think sexp annotations must be unreadable, any more than lisp function signatures or -let patterns must be.

Why is the last string in parens?

That was my mistake. Let me correct it:

;; (elsa-pluralize (string int) string)

What if this function had two arguments, would I write (int (&optional int)) or (int &optional int)? I guess the latter.

I'd be okay with either, as long as it's consistent and works. On one hand, staying as close to CL function argument syntax seems like a good idea, but on the other hand, if it can't be the same, maybe it would be better to be distinctly different. Perhaps a judgment call to make after some experimenting.

Rewriting some more arcane types like

(elsa-make-type String -> String -> (String -> Bool) | Nil -> String | Nil | T)

yields

(string string (or (function string bool) nil)) (or string nil t)
(string string (&optional (function string bool))) (or string nil t)

assuming we use function to mark a function type. I find that quite unreadable.

Well, IMHO the parens help group alternatives together, while the -> obscures relationships. For example, looking at | Nil -> String | Nil, I know that the -> is the divider between arguments, but to me the | typically represents boolean OR, so the natural grouping that stands out to me is Nil -> String, which in fact crosses a boundary.

IMHO, I think this is no less readable than a pcase or -let pattern, or a CL function signature:

;; (string string (&optional (function string bool))) (or string nil t)

And, in fact, the return type is much more visually distinguished, because it's not simply the final item in a list of type -> type -> type -> type. And that distinction could be further enhanced by using a symbol between the argument list and return type: while not strictly necessary or "lispy," it might be good to help the visual distinction, like:

;; (string string (&optional (function string bool))) -> (or string nil t)

or:

;; (string string (&optional (function string bool))) :: (or string nil t)

or

;; (string string (&optional (function string bool))) &return (or string nil t)

etc.

How would you deal with generic types, such as a -> a -> [a] where a can be substituted for anything. Haskell deals with this using Upper = concrete type and lower = generic.

Personally, I think ELSA annotations should avoid being case-sensitive if possible. Although it could be useful, it's very "unlispy," and I feel like it mentally "clashes." I'd rather see a few unique symbols used instead, like ->, ::, or words prefixed with &. So maybe instead of a -> a -> [a], we could use something like:

;; (&generic &generic) (list &generic)

It's wordy, but lispy and unambiguous. Or if you wanted to preserve the ability to use arbitrary strings for generic arguments, maybe a different prefix character could be used, so you could write something like:

;; ($a $a) (list $a)

Or if that offends your Haskell sensibilities too much, maybe:

;; (%a %a) (list %a)

Or even:

;; (*a *a) (list *a)

Although that might look too much like C, haha.

There needs to be some marker that the comment is an annotation. Right now it's the ::. We need to have something like that or put something in front of the list, maybe (&type elsa-pluralize ...).

It might be good to use a to-do-style keyword, like:

;; ELSA: (elsa-pluralize (string int) string)

Even better would be if it could be inside the function definition rather than before it, so instead of:

;; (elsa-pluralize :: String -> Int -> String)
(defun elsa-pluralize (word n)
  "Return singular or plural of WORD based on N."
  (if (= n 1)
      word
    (concat word "s")))

It could be something like:

(defun elsa-pluralize (word n)
  ;; ELSA: (string int) string
  "Return singular or plural of WORD based on N."
  (if (= n 1)
      word
    (concat word "s")))

I think that's actually very important, because otherwise ELSA type annotations will not be easily moved with function definitions. For example, it's easy to use expand-region or forward-sexp to select entire functions, kill, and yank them--or to use e.g. lispy-move-down with point on a function to rapidly reorder code. With type annotations being outside the defun form, that won't work at all, and users will have to manually fix it.

If you look into elsa-typed-builtin.el we also need to support this syntax in the elsa-make-type macro. There it does not look like an argument list and body because there is no function name, it is a stand-alone type. So the split in two lists is also a bit questionable, i.e. (elsa-make-type (int) int) makes a unary function taking int returning int.

Sorry, I haven't read through very much of ELSA's code yet, so I don't know what you are talking about. Hopefully it could be made to work with syntax like this. :)

In Elsa my plan was not to have them named so you would either use Plist or Alist constructor with the record key :: type. Here it gets quite unlispy so I haven't implemented that yet. For example

Plist { :beg :: Int, :end :: Int, :op :: String, :cl :: String, :prefix :: String, :suffix :: String }

would be the structure of smartparens' expression. That looks weird!

Well, you've surely thought this through much more than I have, so maybe I'm just being noisy here, but is it necessary to specify the constructor and list type? Using that example, what about:

;; ELSA: (:beg int :end int :op string :cl string :prefix string :suffix string)

Doesn't that provide essentially the same information, without all the ::s and commas? The fact that it's a plist can be inferred, or perhaps be made irrelevant by using pattern-matching (at least, theoretically; I'm sure that you, having actually written the code, may know otherwise).

The problem with my giving all this feedback is that I haven't written the code, and you have, so I don't know much about the feasibility of implementing these ideas. So my feedback as an Emacs package developer is that I am not at all enthusiastic about having to learn Haskell-style type annotations and sprinkle them all over my code. :) And I think that most other Emacs users would feel the same way, so if ELSA required it, I think it could be a significant barrier to entry, which would be a shame for such an interesting and helpful package.

My two cents. Thanks for your work.

Fuco1 commented 6 years ago

The use of the same -> symbol to separate both each argument and the return type implies visually that they are semantically the same, when they are completely different things.

Actually, they are the same in Haskell, that's why the arrow makes sense. Functions always return functions, even 1 :: Int is a function (constant one). The arrow always only has two arguments, left and right, the reason you can write a -> b -> c is because of the implicit precedence. It is in fact a -> (b -> c). In other words, a comes in, (b -> c) comes out.

Or if you wanted to preserve the ability to use arbitrary strings for generic arguments

Not so much what I want but we actually must be able to do that. We can have a function (a -> b) -> (b -> c) -> (a -> c) (this would be function composition or "piping" combinator, so we need to be able to use different type variables.

;; (a a) (list *a)

I think I like this the best, but maybe I would move the star to the end, so

;; (a* a*) (list a*)

Something like that might work.

I think that's actually very important, because otherwise ELSA type annotations will not be easily moved with function definitions.

This is a very good point and in fact previously I wanted to use the declare form to add types, so you would write

(defun elsa-pluralize (word n)
  "Return singular or plural of WORD based on N."
  (declare (elsa-type (string int) string))
  (if (= n 1)
      word
    (concat word "s")))

I had some problem with defun-declarations-alist not having the elsa-type entry and either byte compiler or edebug or something else complained about it. I might look into it again. We must make sure that you can work on your code and compile it without Elsa as a dependency.

;; ELSA: (:beg int :end int :op string :cl string :prefix string :suffix string)

Doesn't that provide essentially the same information, without all the ::s and commas? The fact that it's a plist can be inferred, or perhaps be made irrelevant by using pattern-matching (at least, theoretically; I'm sure that you, having actually written the code, may know otherwise).

Yes that's pretty good. We still need to have the type constructor though because it might be a hash table or alist or some other map type.

So my feedback as an Emacs package developer is that I am not at all enthusiastic about having to learn Haskell-style type annotations and sprinkle them all over my code. :) And I think that most other Emacs users would feel the same way, so if ELSA required it, I think it could be a significant barrier to entry, which would be a shame for such an interesting and helpful package.

Good point, I like the feedback, please let it coming. I am very biased towards Haskell and sometimes blinded by my love, as men can often get ;) I think this is probably the most convincing argument and it would be quite stupid of me to block adoption because of my biases.


We still need to figure out how to do things like vectors, eieio objects, cl defstructs... I would like to have a shorthand notation for lists because they are really common, so maybe just (a)... but we also need a way to group things (constructors and arguments) which would again be with parens and so there it breaks down because in principle we can't know if a isn't a constructor or a user-defined type. So maybe a constructor for everything is necessary... this also goes back to your suggestion about plists.

One trouble the prefix notation saves us is the operator precedence and constructor nesting, so that (foo ....) is always foo constructor. I quite liked the possibility of omitting the parens (i.e. you can write Cons a b and the precedence takes care of that) but then sometimes you would need to wrap an argument if it is a complex type, such as Cons a (b | c)... which in lisp-like notation would always be unambiguous (cons a (or b c)).

I would like some other people to join in but don't know who to invite. @wilfred @wasamasa mabye you two have something to add, since you've already commented on the project.

Fuco1 commented 6 years ago

Code-wise, if you look in elsa-type-helpers.el the entire magic is handled by elsa--make-type which is quite a simple function just dispatching the constructors... so rewriting it for another syntax is maybe hour or two of work.

Bigger problem would be to go over the code and change all the calls, but a macro or two should do. It's still early so a change like this is possible no problem.

Fuco1 commented 6 years ago

@alphapapa How do you feel about the ? suffix. That's an extension to the Haskell syntax since haskell does not have nullable types, you have to express that using a Maybe constructor, so Maybe Int I write here as Int? because, well... Emacs is full of nil and most everything is a "Maybe"

Also a "nullable" argument is not the same as optional one. We can have a defun

(defun foo (a &optional b) ...)

where a can be a string or nil but must be supplied whereas b might be omited, so a type like

(&optional string int)

would in fact be incorrect.

We can therefore write

((or string nil) &optional int)

or

(string? &optional int)

I'm not sure which one is better.

wasamasa commented 6 years ago

I've got to admit that I find Haskell type signatures close to unreadable. When I wrote some SML for fun, I relied a lot on a REPL to figure out the correct types, this wouldn't be an option with Elsa I'm afraid. Currying is something you shouldn't need to know to read a type signature, I'd rather be explicit in delimiting each logical unit to avoid mistakes. S-Expressions do this already, but we could do with more syntactical sugar...

To throw in another candidate regarding type syntax, how about the ALGOL family? Modula-3 requires specifying type signatures in modules which I find quite readable (once you get over the shock of them using allcaps for keywords): https://github.com/wasamasa/mal-candidates/blob/master/modula-3/src/Util.i3. A more modern source of inspiration are the optional type systems, as seen in Clojure's spec (not so much my thing because you need to look up every user-defined spec to make heads of it) or the JS stuff for React (like PropTypes and Flow).

Fuco1 commented 6 years ago

I like Flow but it also is quite Haskelly in some respects, like the use of | for sum and & for intersection (Haskell does not have Intersection types because those are impossible there) and use of parens for precedence.

I also don't think we should mix the argument names with types if it's not in-line... and I'm not sure I want to write a "babel" for Elisp to discard the annotations before use (and that would also break so much things like eval-defun that I'm not sure it's worth it).

Later I would like to support algebraic data types... defcustom comes closest to mind but its type annotations are also quite arcane. But maybe could be considered as an inspiration.

I also want to have constant types so we can say something like "this is a number or a symbol 'many", so in my old syntax that would be Number | 'many which we can rewrite as (or number 'many) which isn't so bad. Defcustom would have it a (choice number (const many)) which is also not that bad... because we have to consider the constant types might be other things than just symbols... but a 'quoted-symbol could be syntactic sugar for (const quoted-symbol). A string or number could stand for itself also.

wasamasa commented 6 years ago

I like Flow but it also is quite Haskelly in some respects

Yes, that's fine. Arriving somewhere between JS and Haskell is more than acceptable.

I also don't think we should mix the argument names with types if it's not in-line... and I'm not sure I want to write a "babel" for Elisp to discard the annotations before use

I agree, argument names in declarations should be optional, preprocessing is something that we don't need. A comment or form behaving like one is what I strive for.

Later I would like to support algebraic data types...

Maybe put those into a separate file where all common types are declared? Or into the same file if it's a one-file package?

Can't comment on the last one, the only association I have here is that some ML-like language didn't support figuring out a a composite type being equivalent to an alternative representation.

Fuco1 commented 6 years ago

@wasamasa Yea I'm shooting myself in the foot a bit now with types like bool which I take to be strictly t or nil and it's proving a bit annoying when doing arithmetics, i.e. bool - nil I need a special rule to resolve that to t, similarly number - int is float (not really making sense mathematically but the Emacs type system handles it as such).

At least for nullable types I just internally do a sum of type + nil... maybe I will switch all the others to this as well. That will require custom constructors for everything but I think it might be more acceptable in the long run. And then some clever print rules to convert t + nil into bool.

Wilfred commented 6 years ago

Sure, I'm happy to make a few suggestions if it helps :)

"Racket programs are not secretly Haskell programs." -- Sam Tobin-Hochstadt (regarding Typed Racket) -- https://twitter.com/_wilfredh/status/921781493655068672

Elisp functions have very different semantics from Haskell functions. The following three functions must have a different signature in Elsa:

(defun foo1 (x y)
  (+ x y))

(defun foo2 (x)
  (lambda (y) (+ x y)))

(defun foo3 (x-y)
  (+ (car x-y) (cdr x-y)))

In Haskell, foo1 and foo2 are indistinguishable. You can write foo3, but currying is more encouraged.

You also need to be able to distinguish the following:

(defun bar1 (x y)
  (when (null y)
    (setq y 0))

  (+ x y))

(defun bar2 (x &optional y)
  (when (null y)
    (setq y 0))

  (+ x y))

(bar1 0) is an error, but (bar2 0) is not. You therefore need to distinguish a required argument that can take the value of nil from an optional argument.

You're also making Elsa a ton easier for new users if you use existing elisp syntax as much as you can. Users already know what &optional means, for example.

It's true that -> is used in many languages for types, but you could still reuse function declaration syntax whilst preserving ->. For example, file-name-completion:

(elsa-make-type (String String &optional (String -> Bool)) -> String | Nil | T)

I'd also be tempted to use upper case for generics and lower case for known types. It's unusual to use upper case in elisp, and (type-of 1) returns integer in lower case. OCaml uses lower case for types too (and 'a for generics) so there's some precedent.

FWIW, Typed Racket (docs) goes for a full sexp syntax:

(: distance (-> pt pt Real))
(define (distance p1 p2)
  (sqrt (+ (sqr (- (pt-x p2) (pt-x p1)))
           (sqr (- (pt-y p2) (pt-y p1))))))

Infix syntax is fiddly. Have you worked out and documented the precedence of your operators? It's a ton easier to work with sexps in lisp -- you can build subexpressions much more easily.

My two cents, I hope that helps!

Fuco1 commented 6 years ago

As far as the precedence I'm using -> as the most loosely binding, followed by | followed by the type constructors.

Good points about currying, we've sort of bumped into it previously already with comments by @alphapapa and myself. The current setup actually can not distinguish the cases bar1 and bar2 you've mentioned on the type level, I compute arity separately so it's not encoded in the type but in the function signature itself. These should probably correspond for greater security (right now the correspondence is only loose with optional ~ nullable).

Also good points about generics. I would refrain from 'a as that looks like a quoted symbol which I would like to use for constant types as many things take special symbols such as 'append etc.

Typed racket looks interesting, that might be something to steal from.

alphapapa commented 6 years ago

Actually, they are the same in Haskell, that's why the arrow makes sense. Functions always return functions, even 1 :: Int is a function (constant one).

I'm not sure if one of us is misunderstanding. :) What I mean is, there is a basic difference in purpose and meaning between a) arguments in a function signature, and b) the function's return value. Logically, the arguments lead to the return value, so an arrow pointing from the arguments to the return value makes sense. But one argument to a function does not logically result in the next argument to the function, so an arrow between them makes no sense--there is no consequential relationship between them. I'm sure there's some kind of abstract mathematical theory in which it makes sense, but from a simple input->output perspective, it doesn't. (You could argue that it's actually "lispy", similar to how the first element of a sexp is the function, but then why all the ->? :) Anyway...)

Not so much what I want but we actually must be able to do that. We can have a function (a -> b) -> (b -> c) -> (a -> c) (this would be function composition or "piping" combinator, so we need to be able to use different type variables.

Hmm. Well, this is a bit of a digression, so feel free to redirect this query, but getting that complex makes me wonder: how useful will this be when the annotations are not guaranteed to match the code? Maybe I'm asking you to sell me on this a bit more. :)

I think I like this the best, but maybe I would move the star to the end, so

;; (a* a*) (list a*)

Something like that might work.

Yeah, that's pretty good.

previously I wanted to use the declare form to add types

Yeah, that would be ideal. I guessed that it wouldn't be so easy to do cleanly. We certainly don't want to add byte-compile warnings. I wonder if the declare form or defun-declarations-alist needs support for optional features which could be ignored without warnings if the supporting code isn't found. Maybe the Emacs devs would accept a patch for that?

We still need to have the type constructor though because it might be a hash table or alist or some other map type.

Well, if it's a type that has a readable representation, couldn't the constructor be omitted? e.g.:

;; plist
'(:keyword1 arg1 :keyword2 arg2)

;; alist
'((keyword1 . arg1) (keyword2 . arg2))

;; vector
[arg1 arg2]

For hash tables, do you intend to verify that they contain certain keys, values, or hash-table attributes?

I am very biased towards Haskell and sometimes blinded by my love, as men can often get ;)

Haha, and here I thought lisp was your first love.

We still need to figure out how to do things like vectors, eieio objects, cl defstructs... I would like to have a shorthand notation for lists because they are really common, so maybe just (a)... but we also need a way to group things (constructors and arguments) which would again be with parens and so there it breaks down because in principle we can't know if a isn't a constructor or a user-defined type. So maybe a constructor for everything is necessary... this also goes back to your suggestion about plists.

Personally, I don't think a shorthand notation for lists is necessary, and in fact it might be more confusing for users, just as it would be ambiguous for the code.

One trouble the prefix notation saves us is the operator precedence and constructor nesting, so that (foo ....) is always foo constructor.

I agree.

I quite liked the possibility of omitting the parens (i.e. you can write Cons a b and the precedence takes care of that) but then sometimes you would need to wrap an argument if it is a complex type, such as Cons a (b | c)... which in lisp-like notation would always be unambiguous (cons a (or b c)).

I feel like implied precedence like that has no place in a lisp. :) Or, at least, it's very jarring.

It's still early so a change like this is possible no problem.

:D

How do you feel about the ? suffix.

It feels unlispy to me. I think it would be good to avoid mixing language metaphors as much as possible. I think I'd prefer ((or string nil) &optional int).

I'm not sure I want to write a "babel" for Elisp to discard the annotations before use (and that would also break so much things like eval-defun that I'm not sure it's worth it).

I think it's definitely not worth it. I think a preprocessor is not the answer. :)

but a 'quoted-symbol could be syntactic sugar for (const quoted-symbol). A string or number could stand for itself also.

While I think shorthands should be very carefully considered, and maybe even generally avoided in these annotations (because I think the annotations should be unambiguous), I think that's very reasonable and consistent with other elisp.

Fuco1 commented 6 years ago

Hmm. Well, this is a bit of a digression, so feel free to redirect this query, but getting that complex makes me wonder: how useful will this be when the annotations are not guaranteed to match the code? Maybe I'm asking you to sell me on this a bit more. :)

Then you will get an error if you use it differently, which will trigger the update to either code or the annotation. It's like when you have unit tests and you change some code: either the tests must be updated or your code is wrong.

Every form has a type and we track it through the whole program so somewhere something will have to give (once all the rules of propagation are implemented, we're not there quite yet).

Logically, the arguments lead to the return value, so an arrow pointing from the arguments to the return value makes sense.

In Haskell every function only ever takes one argument, so the arrows can be put into parens to make this plain: a -> (b -> (c -> d))) is a funciton of "three" arguments but really it's a function of one argument a returning a function of one argument b returning function of one argument c returning d. In Math there's the concept of Cartesian Closed Category in which you can exchange pairs for exponentiation, in other words functions from a pair (a,b) -> c can be turned into functions from a -> (b -> c). Haskell can rely on this because the type system makes a CCC.

We omit the parens because they are annoying (in both Haskell and Math).

Well, if it's a type that has a readable representation, couldn't the constructor be omitted? e.g.:

We can not assume that plists have :keywords always as arguments. I also want to support tuples, that is lists of finite known size with prescribed positional arguments, so a (keyword int) might be a type of "list of two things where first is keyword and second is an int", then the syntax would be ambiguous. Also if I wrote (list int) would that be a list of ints or a plist with a key 'list and value of type int?

I think if we're going to have constructors then let's have them for everything. Plists and alists are only conventions on top of lists so also anything that accepts lists should also accept those.

alphapapa commented 6 years ago

Then you will get an error if you use it differently, which will trigger the update to either code or the annotation. It's like when you have unit tests and you change some code: either the tests must be updated or your code is wrong.

Every form has a type and we track it through the whole program so somewhere something will have to give (once all the rules of propagation are implemented, we're not there quite yet).

I think one of us is misunderstanding the other. What I mean is, as far as I understand, the point of ELSA (or at least, one of the main ones) is to catch type-related errors before runtime. It uses annotations to do so, since the code itself is not explicitly typed. But the annotations are written and updated manually, so just like comments and docstrings, they are not guaranteed to reflect the current state of the code. Therefore, even if ELSA detects no type errors, type-related runtime errors may still occur. Do I understand correctly?

In Haskell every function only ever takes one argument, so the arrows can be put into parens to make this plain...In Math there's the concept of Cartesian Closed Category...

Thanks, I figured there was math behind it. But my point is that, from a logical perspective, regardless of underlying implementation details which may entail a series of functions returning functions, a function is a matter of inputs (which are orthogonal to each other) and outputs. And especially in elisp...well, I'm sorry, but I just don't care about how Haskell does it, no matter how much you love Haskell. ;D Not to knock Haskell, it's very interesting--but then I found lisp... :)

We omit the parens because they are annoying (in both Haskell and Math).

I guess your next project will be to implement SFRI 119 in elisp. :)

We can not assume that plists have :keywords always as arguments.

Yes, of course, I should have thought of that.

I think if we're going to have constructors then let's have them for everything.

Agreed.

Fuco1 commented 6 years ago

Therefore, even if ELSA detects no type errors, type-related runtime errors may still occur. Do I understand correctly?

Once the system is fully implemented that will be impossible. Of course that will never happen in the presence of all the side-effects and macros and so on (you would have to write the code in very specific way, i.e. immutable everything). I'm actually tempted to introduce something like the IO monad to guard for side-effects but that would get so madly inconvenient in Elisp that I think it's not worth the trouble (and the adoption would suffer also). I might do it as an optional addon where you can enable side-effect tracking with some setting.

It is also very unlikely that you will write wrong annotation (although possible).

If you have code like

;; (add1 :: Int -> Int)
(defun add1 (x)
  (1+ x))

it works. Then you decide to make the function better by also accepting strings and automatically casting them to integers:

;; (add1 :: Int -> Int)
(defun add1 (x)
  (if (stringp x)
      (1+ (string-to-number x))
    (1+ x)))

You forgot to update the signature, but now ELSA would complain that (stringp x) never returns true because you are passing in Int. So you figure "A ha... I need to fix the signature" and rewrite it as

;; (add1 :: String | Int -> Int)
(defun add1 (x)
  (if (stringp x)
      (1+ (string-to-number x))
    (1+ x)))

You can still call it from somewhere else like this (add1 :foo) but there it would also complain about giving it a wrong argument. So in practice, unless the annotations are very generic like Mixed -> Mixed (i.e. anything goes) the code and types will "guard" each other.

Also if you write something like

;; (add1 :: Int -> String)
(defun add1 (x)
  (1+ x))

Elsa knows that 1+ returns numbers and will complain

Imgur

Fuco1 commented 6 years ago

I was thinking about the "declarations within function" problem and it might easily be solved like this:

(defun add1 (x)
  "Add 1 to X."
  [add1 :: (int) int]
  (1+ x))

So essentially we would put a static vector somewhere behind the docstring/interactive/declare specs... this should be byte-compiled into a constant so the runtime performance should not be very much hindered, but there is extra data in the memory... It's a very simple approach though, so that helps. Can easily be converted to some declare magic if I can figure it out.

alphapapa commented 6 years ago

ou forgot to update the signature, but now ELSA would complain that (stringp x) never returns true because you are passing in Int.

Oh, wow, okay. I didn't realize that ELSA worked that way. It's much more sophisticated than I realized.

You can still call it from somewhere else like this (add1 :foo) but there it would also complain about giving it a wrong argument.

Sorry, I don't understand. What does the italicized "it" refer to in that sentence? :)

So essentially we would put a static vector somewhere behind the docstring/interactive/declare specs... this should be byte-compiled into a constant so the runtime performance should not be very much hindered, but there is extra data in the memory... It's a very simple approach though, so that helps. Can easily be converted to some declare magic if I can figure it out.

That's an interesting idea. What's the advantage over using a comment? What do you mean by converting it to "declare magic"?

Also, what about doing it like this:

(defun add1 (x)
  "Add 1 to X."
  [elsa (int) int]
  (1+ x))

The repetition of the function name and superfluous :: irk me a bit. :) Plus, having elsa as the first element seems to make sense, as it explicitly marks it as for ELSA.

I don't mind a few extra bytes in memory, in general. However, if it will have any impact on runtime performance, I'd be against it, myself. Elisp is slow enough already, and I wouldn't want using ELSA to have any effect on runtime performance.

What if it were a macro which always evaluated to nil, so the byte-compiler could optimize it out? (Assuming it does that.) That would make parsing the code easier for ELSA (assuming that's the reason for not using a comment), while avoiding impact on the code at runtime (when byte-compiled, anyway).

Fuco1 commented 6 years ago

What if it were a macro which always evaluated to nil, so the byte-compiler could optimize it out? (Assuming it does that.) That would make parsing the code easier for ELSA (assuming that's the reason for not using a comment), while avoiding impact on the code at runtime (when byte-compiled, anyway).

This will make Elsa a dependency becasue the macro would have to be defined somewhere. I don't know about any macro that would just discard everything, ignore sadly is a function so there would even be an extra function call.

Actually, I've now tried this

(defun foo (x)
  (declare (asdasdasd 1))
  x)

(defmacro foobar (x)
  (declare (asdasdasd 1))
  x)

and both of those can be evaled and I can also byte-compile with no warning. I don't know why I had problems before, but it seems declare, at least on E25+ does no longer check if the declaration exists in defun-declarations-alist or macro-declarations-alist.

Sorry, I don't understand. What does the italicized "it" refer to in that sentence? :)

It refers to the analysis. The advertised type is taking an Int or String but you are passing in a Keyword. So this will also prompt you to either correct the code or the signature if this is really what you want to send in.

The repetition of the function name and superfluous :: irk me a bit. :) Plus, having elsa as the first element seems to make sense, as it explicitly marks it as for ELSA.

Yes, this is fine with me. I had it in the comment to prevent the case where you would copy away the function but not the comment, so that the annotation name and defun name had to match. This would prevent forgotten annotations attaching to wrong functions. This would be made obsolete if the annotation is inside the body. Which makes perfect sense.

alphapapa commented 6 years ago

This will make Elsa a dependency becasue the macro would have to be defined somewhere.

Yes, it would, but whether that's a problem... :)

I don't know about any macro that would just discard everything, ignore sadly is a function so there would even be an extra function call.

I meant to write a macro that evaluates to nil. Like:

(defmacro elsa-annotation (&rest _)
  nil)

But since you found that the declare form may work, which is great, this doesn't matter. :)

It refers to the analysis. The advertised type is taking an Int or String but you are passing in a Keyword. So this will also prompt you to either correct the code or the signature if this is really what you want to send in.

I see. So are annotations always necessary, then, if types can be inferred? (If I'm basically asking you to explain how everything in ELSA works, and you'd rather defer these questions, that's fine. :)

Fuco1 commented 6 years ago

@alphapapa In Flow (javascript) it works like this

function add1(x) {
    return x + 1;
}

add1(1) // flow now figures add1 takes an int

add1('foo') // this is not compatible with previous use, flow will
            // mark it as error: takes string but expects int

So it kind of infers the type from usage if there is no annotation. This of course might get wrong idea about what's going on because it would error out even if we handled the int/string difference inside the function (but maybe it got smart enough to know that you are testing for string and int).

So far I don't plan this so you will have to annotate your own top level functions. The Emacs built-ins will be annotated in Elsa (see elsa-typed-builtin.el for example). There's over 1500 functions to annotate so that will take some work. I'll try to recruit people on reddit or somewhere once we decide how to do the annotations.

alphapapa commented 6 years ago

Sounds good. I'll plan to help with some of that.

Fuco1 commented 6 years ago

I'm going to put this into the 0.1.0 milestone also. I think I'm good with changing my mind to support sexp annotations exclusively... we've come up with pretty good setup so it's not so horrible.

I will need to write down a summary of this discussion and write a proposal we can "vote" on. Then I'll convert all the internals (which I actually changed once before from yet another annotation syntax :D---not so much fun).

Fuco1 commented 6 years ago

I've created a pull request with a spec of lispified types: https://github.com/emacs-elsa/Elsa/pull/96 Feedback welcome, I would like to get it right this time.

a13 commented 5 years ago

Hi! Sadly, I can't comment in the PR, but want to add my $0.02

how about smth like

(defun add-one (x)
  (declare (elsa-> Number Number))
  (1+ x))

(defun add-one-and-debug! (x msg)
  (declare (elsa-> String Number Number))
  (message "%s" msg)
  (1+ x))

(defun add-one* (x)
  (declare (elsa-> (U Number String) Null))
  (1+ (if (stringp x)
          (string-to-number x)
        x))
  (message "added 1 to %s" x))

And, yes, I really like Typed Racket annotations, they are still sexpy, but don't conflict much with Haskell-like ones.

Fuco1 commented 5 years ago

@a13 That looks pretty good. I've also seen some clojure dialect (or maybe it was proposal for clojure itself) which had something similar.

I think I'm more and more changing my mind and leaning towards something closer to what defcustom is using, i.e. what we have in the PR. But if someone has some more ideas how to combine these ideas we're still open to change.

Fuco1 commented 2 years ago

The haskelly syntax is gone and the lispy annotations are now in the trunk.

Not all docs have been updated, I'm working on that. For now, if you want to use it, read the source :imp: