Shen-Language / shen-cl

Shen for Common Lisp (Unmaintained)
BSD 3-Clause "New" or "Revised" License

Using packages for namespacing instead of read-table casing #40

Closed tizoc closed 5 years ago

tizoc commented 5 years ago

To follow the discussion started here: https://github.com/Shen-Language/shen-cl/pull/39#issuecomment-537100507

rkoeninger commented 5 years ago

Picking up a comment from that PR:

I see, but if the read-table is left as-is, both IF and if will work, right? ... functions in another namespace, no read-table shenanigans involved

That's what I had in mind too, but Shen is supposed to be case-sensitive, so a snippet of code like (f F f) would call the global function f with the local variable F as the first argument and the idle symbol f as the second argument. CL's reader upcases symbol names by default, which makes CL effectively case-insensitive, so it would confuse f and F. When functions are declared in that default mode, their under-the-hood name is upper-cased, as are the names of all the standard library functions and special-form-identifying symbols like IF.

So to have a valid Shen implementation, you have to set the read-table to case-sensitive. And when you do that, you have to start referring to standard functions and symbols in uppercase.
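The case folding described above can be checked with a minimal sketch. It reads from strings under a private copy of the standard readtable, so the global readtable stays untouched. The helper name read-with-case is made up for this illustration and is not part of shen-cl.

```lisp
;; Illustration only: READ-WITH-CASE is a hypothetical helper, not shen-cl code.
(defun read-with-case (text mode)
  "Read one form from TEXT using a copy of the standard readtable set to MODE."
  (let ((*readtable* (copy-readtable nil))) ; NIL means: copy the standard readtable
    (setf (readtable-case *readtable*) mode)
    (read-from-string text)))

;; Default mode (:upcase): f and F are read as the same symbol.
(eq (read-with-case "f" :upcase) (read-with-case "F" :upcase))

;; :preserve: f and F are read as two distinct symbols.
(eq (read-with-case "f" :preserve) (read-with-case "F" :preserve))

;; Under :preserve the text "if" names a lowercase symbol, not CL's IF.
(symbol-name (read-with-case "if" :preserve))
```

The first comparison is true and the second is false; the last form returns the string "if". This is why, under :preserve, standard operators have to be written in uppercase.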

I think having them in separate CL packages is still a good thing to avoid any potential leakage - local variables in Shen can be all upper-case and potentially conflict with something in CL.

The ideal approach would be to have per-package case-sensitivity, which I thought could not be done, but a simple search is turning up a few things:

case-sensitive packages in CLisp

In fact you can have case-sensitive Common Lisp packages that are completely backwards compatible with case-insensitive CL code

No time to experiment with this at the moment.

tizoc commented 5 years ago

Ok, I get it now, thank you. Will get back to this once the other changes are done, not a priority right now.

Drainful commented 5 years ago

It seems like Shen almost fits into CL as a set of macros that can exist in a package, but not quite. I think the best solution would be to abandon trying to embed Shen into CL via the readtable or package system. This project might be of interest as a codified way of embedding languages within Common Lisp. It provides an interface for using different languages separate from the package system which may call arbitrary code as a reader (resembling Racket's "#lang" system). The Common Lisp reader, customized to be case sensitive, could still be used in this case.

tizoc commented 5 years ago

Interesting, will keep that one in mind.

Note that what we need is not a reader for Shen, just for Common Lisp code generated from a Klambda input (the Shen kernel gives ports Klambda code, and the port gives $native code to the underlying platform).

What if we have the compiler output Klambda symbols (which have to have their case preserved) as |lowercase-symbol|, would that work? I don't know if that syntax is portable, just tried it with SBCL, but I guess it is.

rkoeninger commented 5 years ago

What if we have the compiler output Klambda symbols (which have to have their case preserved) as |lowercase-symbol|, would that work?

Stack Overflow about |...|

Apparently, this is a standard thing. Case-preserved symbols can be generated with INTERN, as it's the reader that up-cases them. A symbol returned from INTERN is printed surrounded by |'s if its name has lower-case or non-symbol characters. (format nil "~S" (intern "aBc")) returns "|aBc|". But if you use ~A, as in (format nil "~A" (intern "aBc")), it returns "aBc".
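A small sketch of the INTERN and FORMAT behaviour just described. The helper show is hypothetical; it pins the printer settings with with-standard-io-syntax (which also binds *package* to CL-USER) so the result does not depend on the current session.

```lisp
;; Illustration only: SHOW is a hypothetical helper, not shen-cl code.
(defun show (directive name)
  "Intern NAME with its case preserved and print it using DIRECTIVE."
  (with-standard-io-syntax
    (let ((*print-readably* nil))
      (format nil directive (intern name)))))

(show "~S" "aBc")  ; => "|aBc|"  bars are needed to read it back unchanged
(show "~A" "aBc")  ; => "aBc"    ~A prints the bare name, no escaping
(show "~S" "ABC")  ; => "ABC"    all-uppercase names need no bars
```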

It would still be necessary to qualify KL and Shen symbols or isolate them in another package from the rest of the CL namespace. CL's IF and Shen/KL's if can only coexist because of casing differences. Could write |if| in the .lsp code instead of using the readtable. Uses of if I hope would then get |'ed.

tizoc commented 5 years ago

Could write |if| in the .lsp code instead of using the readtable.

Yes, that's the idea. Things will work the same as now; what changes is that symbols are forced to be read with their case preserved by using this syntax instead of messing with the read-table.
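To make the idea concrete, here is a hedged sketch of a bar-quoted |if| living next to CL's IF under the default readtable. The macro body is an assumption made for illustration (it treats only the symbol |true| as true); it is not the definition shen-cl uses.

```lisp
;; Illustration only: a made-up KL-style conditional, not the shen-cl definition.
;; |if| keeps its lowercase name, so it is a different symbol from CL's IF.
(defmacro |if| (test then else)
  `(if (eq ,test '|true|) ,then ,else))

(|if| '|true| "then" "else")  ; => "then"
(|if| t "then" "else")        ; => "else", because T is not the symbol |true|

(eq '|if| 'if)                ; => NIL, distinct symbols
(eq '|IF| 'if)                ; => T, the reader upcases the unquoted if
```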

rkoeninger commented 5 years ago

One more note: when the readtable is set to preserve case, (FORMAT ... "~S" ...) no longer puts the |'s around symbols when printed.
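The printing difference noted above can be reproduced with the following sketch, which prints the same symbol under two readtable-case settings. The helper print-under is hypothetical and interns its argument in CL-USER.

```lisp
;; Illustration only: PRINT-UNDER is a hypothetical helper, not shen-cl code.
(defun print-under (mode name)
  "Print the symbol named NAME with ~S while the readtable case is MODE."
  (with-standard-io-syntax
    (let ((*print-readably* nil)
          (*readtable* (copy-readtable nil)))
      (setf (readtable-case *readtable*) mode)
      (format nil "~S" (intern name)))))

(print-under :upcase "aBc")    ; => "|aBc|"
(print-under :preserve "aBc")  ; => "aBc", no bars needed to read it back
```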

Drainful commented 5 years ago

I think having them in separate CL packages is still a good thing to avoid any potential leakage - local variables in Shen can be all upper-case and potentially conflict with something CL.

The safest way would be to have multiple packages for Shen:

  1. A backend package which USEs the Common Lisp package and exports Shen symbols (along with CL symbols which become part of Shen, like 'CL::+)
  2. A Shen package which only USEs the backend package, and not the Common Lisp package

That way we can control what leaks through from Common Lisp to Shen by exporting just what is needed from the backend package.
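A minimal sketch of that two-package layout, assuming made-up package names (shen-backend-demo, shen-demo) and an arbitrary export list. Only what the backend exports is visible from the Shen-side package; the rest of CL is not.

```lisp
;; Illustration only: package names and the export list are assumptions.
(defpackage :shen-backend-demo
  (:use :common-lisp)
  ;; Re-export one CL symbol and one backend-defined, case-preserved symbol.
  (:export #:+ #:|hd|))

(defpackage :shen-demo
  ;; Deliberately does not use :common-lisp.
  (:use :shen-backend-demo))

(find-symbol "+" :shen-demo)    ; => +, :INHERITED  (this is CL's +)
(find-symbol "hd" :shen-demo)   ; => the backend's |hd|, :INHERITED
(find-symbol "CAR" :shen-demo)  ; => NIL, NIL       (CAR did not leak through)
```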

rkoeninger commented 5 years ago

control what leaks through from Common Lisp to Shen

In other ports I've worked on, partly because they're so different from CL, and to maintain namespace hygiene, interop functions have to be explicitly added by the host environment like shenEnv.define('increment', x => x + 1). Are you saying it would work like that? Would there be a shen-cl.import function to make CL functions available? If it made them available without a prefix, what happens if you import adjoin - does it override the adjoin in the kernel - or does the imported one have a lisp. prefix? The lisp. prefix could also be compiled to a CL:: prefix, maintaining namespace hygiene.

tizoc commented 5 years ago

I'm trying to port the Shen/Scheme compiler to Shen/CL; if this works, experimenting with these things is going to be quite easy.

tizoc commented 5 years ago

I have a version of the new compiler working already, I just need to clean it up a bit before releasing it. One thing it does is that it maintains the casing of lowercase symbols using |, for example:

(DEFUN |shen.packaged?| (V3068)
  (BLOCK NIL
    (TAGBODY
      (IF (CONSP V3068)
          (LET ((|V3068/tl| (CDR V3068)))
            (IF (AND (EQ (CAR V3068) '|package|)
                     (AND (CONSP |V3068/tl|) (CONSP (CDR |V3068/tl|))))
                (RETURN 'TRUE)
                (GO |%%label959|)))
          (GO |%%label959|))
     |%%label959|
      (RETURN 'FALSE))))

I haven't tried disabling the readtable modifications yet, but with this change it shouldn't be required anymore.

rkoeninger commented 5 years ago

If the readtable no longer gets set to :PRESERVE, will the lisp. prefix still cause an upcase of the identifier like it does now? (lisp.qwerty -> QWERTY) Will it need to? Or will lisp. somehow prevent the addition of the |'s?

tizoc commented 5 years ago

The lisp. prefix uppercases the symbol.

Edit: that's what it does now; will it be required later? I think yes, because the Shen reader interns symbols preserving the case, so the Common Lisp reader never sees the original text. The |s are for the generated files, which do pass through Common Lisp's reader.
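As a rough sketch of the mapping being discussed (lisp.qwerty -> QWERTY), the hypothetical function below strips the lisp. prefix and upcases the rest, and interns every other name with its case preserved. The function name and the choice to intern into the current package are assumptions; the actual shen-cl compiler may do this differently.

```lisp
;; Illustration only: TRANSLATE-NAME is hypothetical, not the shen-cl compiler.
(defun translate-name (name)
  "Map a Shen identifier string to a symbol in the current package."
  (let ((prefix "lisp."))
    (if (and (> (length name) (length prefix))
             (string= prefix name :end2 (length prefix)))
        ;; lisp.qwerty -> QWERTY, matching what the CL reader would produce.
        (intern (string-upcase (subseq name (length prefix))))
        ;; Everything else keeps its case, e.g. qwerty -> |qwerty|.
        (intern name))))

(symbol-name (translate-name "lisp.qwerty"))  ; => "QWERTY"
(symbol-name (translate-name "qwerty"))       ; => "qwerty"
```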

tizoc commented 5 years ago

With #44 merged I guess this can be considered solved.

The read-table is no longer modified, and all hand-written Lisp code is now in lower case. Shen's reader is case-sensitive, so all the symbols it feeds to Common Lisp have their case preserved. When compiling the kernel and compiler into Lisp code, the build script takes care of bar-quoting cased symbols with | so that the Common Lisp reader preserves their case.
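How the build script does the bar-quoting is not shown in this thread. One way it could be done, sketched here under that caveat, is to let the standard printer decide: printing a form with escaping on adds bars to any symbol that the default reader would otherwise upcase. The helpers kl-symbol and emit are hypothetical.

```lisp
;; Illustration only: KL-SYMBOL and EMIT are hypothetical, not the build script.
(defun kl-symbol (name)
  "Intern NAME in CL-USER with its case preserved."
  (intern name :common-lisp-user))

(defun emit (form)
  "Print FORM the way it could be written to a generated .lsp file."
  (with-standard-io-syntax
    (let ((*print-readably* nil))
      (prin1-to-string form))))

(defparameter *demo-form*
  (list 'defun (kl-symbol "shen.packaged?") (list (kl-symbol "V3068"))
        (list 'cdr (kl-symbol "V3068"))))

(emit *demo-form*)
;; => "(DEFUN |shen.packaged?| (V3068) (CDR V3068))"

;; Reading the text back with the default readtable yields the same symbols.
(equal (with-standard-io-syntax (read-from-string (emit *demo-form*)))
       *demo-form*)
```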

Like before, Shen code is compiled into the :shen package.