alex-gutev / cl-form-types

Library for determining the types of Common Lisp forms based on information stored in the environment.
MIT License
19 stars 1 forks source link

Going beyong CLTL2: constant-form-value and compiler-macroexpansion #1

Closed digikar99 closed 3 years ago

digikar99 commented 3 years ago

Hi

Is there an interest in going beyond CLTL2, and/or providing a form-subtypep? For instance,

CL-USER> (trivial-form-type:form-type '(+ 2 3) nil) ; uses introspect-environment:constant-form-value
(EQL 5)
T
CL-USER> (subtypep (trivial-form-type:form-type '(+ 2 3) nil) '(unsigned-byte 8))
T
T
CL-USER> (cl-form-types:form-type '(+ 2 3) nil)
(VALUES NUMBER &OPTIONAL)
CL-USER> (subtypep (cl-form-types:nth-form-type '(+ 2 3) nil 0) '(unsigned-byte 8))
NIL
T

Such functionality may not be provided in the core system for reasons of cltl2-based portability, but a subsystem could provide it when users decide to use it.

alex-gutev commented 3 years ago

I have plans for going beyond the CLTL2 and the base type system. I believe the behaviour you're demonstrating above is implemented using SB-INT::CONSTANT-FORM-VALUE, on SBCL, which can be mimicked with CONSTANTP on other implementations, though obviously not with the same results.

alex-gutev commented 3 years ago

Commit a9b4f94 adds checks for constant forms, before processing them. On SBCL the type of the value returned by SB-INT::CONSTANT-FORM-VALUE is returned, on other implementations EVAL is used.

CL-USER> (cl-form-types:form-type '(+ 2 3) nil)
(EQL 5)
digikar99 commented 3 years ago

Erm, for non-CLTL2 functionality, you could directly depend on https://github.com/Bike/introspect-environment since it already has made a reasonable effort at supporting CMUCL and CCL alongside SBCL - and possibly more if people contribute.

digikar99 commented 3 years ago

Another two potentially-smallish issues -

  1. Using COMPILER-MACRO-FUNCTION to introspect-environment:compiler-macroexpand,
  2. cl-form-types:form-type could return two values indicating whether a type T (could it be some other type?) was the result of user declarations or if it was the result of failure to infer / lack of declarations. On https://github.com/digikar99/adhoc-polymorphic-functions/ I find the second return value useful to be able to signal a useful compiler note for (optimize safety).
CL-USER> (defun baz (a b)
           (declare (type string a)
                    (type integer b)
                    (optimize safety))
           (my= a b))
; While compiling (MY= A B):
;
;   No applicable POLYMORPH discovered for TYPE-LIST (STRING INTEGER).
;   Available TYPE-LISTs include:
;      ((SIMPLE-ARRAY SINGLE-FLOAT) (SIMPLE-ARRAY SINGLE-FLOAT))
;      (CHARACTER CHARACTER)
;      (STRING STRING)
BAZ
CL-USER> (defun baz (a b)
           (declare (optimize safety))
           (my= a b))
BAZ

No such note is signalled if types are not declared / derivable.

alex-gutev commented 3 years ago

OK I see it implements constant-form-value form more than just SBCL? I will use it then. I'm not sure expanding compiler-macros is a good idea, if that's what you meant, there is no guarantee they will be expanded by the implementation. I will add a second return value to FORM-TYPE which is true if it was able to determine the type or not.

digikar99 commented 3 years ago

About the second one, on a rethought, perhaps that is not required; things seem to work correctly even without the second value -

(defun our-typep (arg type)
  (assert *compiler-macro-expanding-p*)
  (when (type= t type) (return-from our-typep t))
  (let ((form-type (form-type arg *environment*)))
    (if (eq t form-type)
        (signal 'form-type-failure :form arg)
        (subtypep form-type type *environment*)))) ; (edited after the initial writeup)

The above seems to work. I hope to get back with the reasoning in some time; you can avoid adding it until then.

digikar99 commented 3 years ago

there is no guarantee they will be expanded by the implementation.

Yes, but it looks like a bad behavior on the part of user themselves if their compiler-macros produce a result that wouldn't be produced by the corresponding function. The user themselves shouldn't depend on the compiler macros being expanded for the correct result (and should only depend on them for optimization purposes?).

Back over at adhoc-polymorphic-functions and trivial-form-type, on (optimize speed) I surround the calling form with its return type obtained by compiler macroexpansion, so that the next entity on the path can know the type at macroexpansion/compiler-macroexpansion time itself.

For instance, consider a supposedly useless example

(define-polymorphic-function my-cat (a b) :overwrite t)

(defpolymorph my-cat ((a string) (b string)) string
  (concatenate 'string a b))

(defpolymorph my-cat ((a list) (b list)) list
  (append a b))

And we wish to optimize

(compile nil `(lambda (a b c)
                (declare (optimize speed)
                         (type list a b c))
                (my-cat a (my-cat b c))))

The return type of (my-cat b c) is dependent on the types of its arguments. In the case the arguments are list, the return type is list - accordingly, the compiler-macroexpansion of (my-cat b c) results in a form (the list ...) which (my-cat a ...) can then utilize to optimize its own call to finally statically dispatch on the second polymorph specialized on list.

Without cl-form-types or trivial-form-type handling compiler macroexpansion, the compiler macro can of my-cat can never (or at least I don't see how) know what (my-cat a ...) should optimally compile into.

EDIT: On more thought: this way of doing things means that the compiler-macro is called O(n^2) times for a compilation n levels deep, which looks bad and especially avoidable. An alternative is to first compiler-macroexpand - or even macroexpand - the arguments, and then call form-type on these expanded arguments, and actually compile the original form into this expanded version of the arguments; this looks O(n) much better than O(n^2). So, compiler macroexpansion support may not be required on cl-form-types and trivial-form-type. It'd be "nice to have" just as macroexpansion is taken into account, but not terribly necessary.

EDIT2: The actual time due to the O(n^2) vs O(n) is actually insignificant; this could be a PMO.

alex-gutev commented 3 years ago

I see the value of it. I'll add a keyword argument :EXPAND-COMPILER-MACROS which if true, compiler-macros will be expanded and the type of the resulting form determined.

digikar99 commented 3 years ago

Okay, that will be useful, because simply (compiler)macroexpanding will do nothing to all the special forms; for the special forms, one has to rely on cl-form-types, and it could internally encounter compiler-macros.

digikar99 commented 3 years ago

And about the second return value: for my use case, a form-type of t without a (type= type t) (referring to the above comment) necessarily means that the type is not declared. It also happens that (declare (type t a)) does not yield any value with cltl2:variable-information. There doesn't seem to be anything more to this. form-type with value t is only relevant if (type= type t) is true; otherwise it is necessarily due to a lack of declarations.

alex-gutev commented 3 years ago

Declaring the type of a variable to be t is pretty useless since you're simply stating it can be of any type, therefore the declaration does not add any useful information. Now if the form type is the literal value t FORM-TYPE returns (EQL T).

alex-gutev commented 3 years ago

Added optional compiler-macroexpansion in commit 8e8fbaf