lcompilers / lpython

Python compiler
https://lpython.org/

Apparent unmatched double quote in Clojure-style output #1505

Open rebcabin opened 1 year ago

rebcabin commented 1 year ago

This command (compiling tests/expr7.py)

("/Users/brian/Documents/GitHub/lpython/src/bin/lpython"
 "-I/Users/brian/Documents/GitHub/lpython/src/runtime/ltypes/ltypes.py"
 "--show-asr"
 "--no-color"
 "--with-intrinsic-mods"
 "/Users/brian/Documents/GitHub/lpython/tests/expr7.py")

produces the following sub-form, which has (apparently) an unmatched double-quote in it, right after the ! character:

(StringConstant " !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~" (Character 1 97 () []))

it probably should be

(StringConstant " !\"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~" (Character 1 97 () []))

with the double-quote escaped.
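
As a minimal sketch of the escaping described above (in Python for brevity; this is not the lpython printer, and `escape_for_clojure` is a hypothetical name), a printer only needs to escape backslashes and double quotes to keep a printable-ASCII string readable as one Clojure string literal. Note that this sketch leaves the single quote alone, unlike the output shown above.

```python
def escape_for_clojure(s: str) -> str:
    """Return s as a double-quoted string literal (hypothetical helper)."""
    # Escape backslashes first so the backslashes added for the quotes
    # are not escaped a second time.
    body = s.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + body + '"'


# The double quote after the ! is now escaped.
print(escape_for_clojure(' !"#$%&'))
```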

HOWEVER

This issue, #1505, corrected or not, raises Issue #1420 to BLOCKER status, because the colon embedded in the string constant gets caught by my ad-hoc, regex-based keyword patcher. I don't know any easy way to match foobar: but not " !\"#$%&...789:;<=...". My ad-hoc, regex-based keyword patcher correctly changes foobar: to :foobar, but incorrectly reorders the string constant from

" !\"#$%&...789:;<=..."

to

" :!\"#$%&...789;<=...".

With this issue, #1505, UN-corrected, my ad-hoc, regex-based keyword patch changes

" !"#$%&...789:;<=..."

to

" :!"#$%&...789:;<=..."

which has an illegal Clojure reader macro, #$%&... in it.

Suffice it to say that Issue #1420 and this issue, #1505, are now both BLOCKERs.
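
One possible way to match foobar: without touching colons inside string constants is to let the regex consume whole string literals first and copy them through unchanged. The sketch below is in Python and is not the patcher discussed here; `patch_keywords` and the identifier pattern are assumptions for illustration. It only works when string literals are correctly escaped, which is why it depends on this issue being fixed.

```python
import re

# Alternation order matters: a complete string literal is tried first,
# so a colon inside it is never seen by the keyword branch.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|([A-Za-z_][A-Za-z0-9_-]*):')


def patch_keywords(text: str) -> str:
    """Rewrite name: as :name outside of string literals (hypothetical helper)."""
    def repl(match):
        name = match.group(1)
        if name is None:
            return match.group(0)  # string literal: copy unchanged
        return ":" + name
    return _TOKEN.sub(repl, text)


print(patch_keywords('foobar: (StringConstant " !\\"#789:;<")'))
```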

SUPPLEMENTARY SCREEN SHOT

image
Ahmedkhalifa1999 commented 1 year ago

Is this issue, along with #1420 and #1498 still open? I would like to work on fixing the output (colons and unmatched quote), but will need some guidance on what needs to be done to make it canonical Clojure.

rebcabin commented 1 year ago

I'm working on various Clojure tools for ASR. #1420 and this issue are the big ones, for now. It would be nice if .true. and .false. were true and false, respectively, but that's not necessary.

An optional verbose mode, in which ASR terms became Clojure maps, would be grand, too. Example:

        '{:head Variable
          :parent-symtab-id 2
          :name x
          :intent Local
          :symbolic-value '{:head IntegerConstant
           :value 5,  :ttype {:base-type 'Integer :kind-bytes 4 :dims []}}
...

instead of

        '(Variable        ; head
          2               ; parent-symtab
          x               ; nym
          Local           ; intent
          (IntegerConstant
           5 (Integer 4 [])) ; symbolic-value
...
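
A rough sketch of how such a verbose mode could map positional terms to keyed maps, written in Python for illustration only. The field-name table and `to_clojure_map` are assumptions, not part of ASR or lpython, and the key names differ slightly from the example above (every nested term gets a :head key).

```python
# Hypothetical field names per ASR head; a real table would come from the ASDL grammar.
FIELD_NAMES = {
    "Variable": ["parent-symtab-id", "name", "intent", "symbolic-value"],
    "IntegerConstant": ["value", "ttype"],
    "Integer": ["kind-bytes", "dims"],
}


def to_clojure_map(term):
    """Render a positional term (tuple) as Clojure map text."""
    if isinstance(term, tuple):
        head, *args = term
        pairs = [f":head {head}"]
        pairs += [f":{name} {to_clojure_map(arg)}"
                  for name, arg in zip(FIELD_NAMES[head], args)]
        return "{" + ", ".join(pairs) + "}"
    if isinstance(term, list):
        # Lists become Clojure vectors.
        return "[" + " ".join(to_clojure_map(t) for t in term) + "]"
    return str(term)


variable = ("Variable", 2, "x", "Local",
            ("IntegerConstant", 5, ("Integer", 4, [])))
print(to_clojure_map(variable))
```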
rebcabin commented 1 year ago

The big picture is to validate and generate ASR terms via clojure.spec. This amounts to a type system for ASR. It's a big, long-term job. See https://github.com/rebcabin/asr-tester/blob/main/src/asr/specs.clj and surrounding code for examples, especially https://github.com/rebcabin/asr-tester/blob/main/test/asr/core_test.clj.

I am seeking contributors to those projects :)

My code is currently in a broken state because ASR changed more quickly than I could keep up. I am contemplating an alternative, https://github.com/rebcabin/masr, to replace ASDL with Clojure specs.

Ahmedkhalifa1999 commented 1 year ago

I have an idea, though my information will need verification from someone who knows the codebase better than me. What if we added an optional argument that formatted the output tree in Clojure-style (at least temporarily)? The rationale from my side is that this will allow more freedom with the output (no need to worry about other uses of the ASR output) and also keep the output working if anyone else is relying on it for a different purpose. If this is a valid idea, I can start working on it with a little guidance with the code base and some significant guidance about what is and is not allowed in Clojure.

I would be very interested to contribute to those projects in my free time, but I will need to learn Clojure first, since it will be my first encounter with it or Lisp :)

rebcabin commented 1 year ago

I think that's an excellent idea! We need not derange the current code base, simply implement something like --show-canonical-clojure and party in our own back yard. If you can begin by cracking into how --show-asr works, I can provide some guidance on the C++ side and a lot of guidance on the Clojure side.

rebcabin commented 1 year ago

Clojure has a bit of a learning curve, but it's very pleasant once you get over it. Here is a tiny, recent project from me on a tiny type-system (one day, far future, for ASR). You might find it amusing: https://github.com/rebcabin/concurrent-kittens

I have dozens more clojure projects. I put clojure into production at Amazon. It's grand and glorious.

Ahmedkhalifa1999 commented 1 year ago

That's great. I will start getting to know how --show-asr works, look into the basics of Clojure and come back with questions.

rebcabin commented 1 year ago

https://clojuredocs.org/ is invaluable

https://www.braveclojure.com/ is priceless, right from the start

Shaikh-Ubaid commented 1 year ago

We need not derange the current code base, simply implement something like --show-canonical-clojure and party in our own back yard

We currently have

I think similarly we can have --show-asr --clojure for ASR Clojure format.

rebcabin commented 1 year ago

If you write out Clojure maps (hashmaps) like {:head Variable, :parent-symtab-id 2, ,,,} then it will be very easy for me to convert them into Clojure defrecords, to subject them to Clojure defprotocol and clojure.spec. So I guess I will call that a requirement for a verbose Clojure mode.

rebcabin commented 1 year ago

Once I have a defrecord for each term in the ASDL grammar, it's reasonably easy to convert them to Clojure vectors and hence to walk them with clojure.zip. I can also walk them with clojure.walk, but that's less familiar to me. I should look into clojure.walk. I am very friendly with clojure.zip.

rebcabin commented 1 year ago

It's important to distinguish symtabs (inlined lists) from symtab-ids. The ASDL grammar does not distinguish them, but they are of different types (list versus integer). That failure to distinguish has caused me heartaches in the past.
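
To make the distinction concrete, here is a small Python sketch that tags the two cases with separate types. The class names, `tag_symtab_field`, and the `("SymbolTable", id, entries)` tuple shape are all assumptions for illustration, not the ASR representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SymtabId:
    """A reference to a symbol table by integer id."""
    value: int


@dataclass
class Symtab:
    """An inlined symbol table: its own id plus its entries."""
    id: SymtabId
    entries: dict = field(default_factory=dict)


def tag_symtab_field(raw):
    """Wrap a raw field value in the type that says what it is."""
    if isinstance(raw, int):
        return SymtabId(raw)
    if isinstance(raw, tuple) and len(raw) == 3 and raw[0] == "SymbolTable":
        _, ident, entries = raw
        return Symtab(SymtabId(ident), dict(entries))
    raise TypeError(f"not a symtab or symtab-id: {raw!r}")
```

With tagged values, downstream code can dispatch on the type instead of guessing from context whether a field holds an id or a whole table.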

certik commented 1 year ago

Let's do --show-asr --clojure. Later, if we decide to make it the default, we can introduce --show-asr --old for the old format.

rebcabin commented 1 year ago

Here is an embeddable clojure-inspired language https://janet-lang.org/

It's implemented in C, so acquires some benefits of Clojure without the JVM. I'm looking into it for potential use in downstream processing of ASR in Clojure-syntax.

EDIT: it's also potentially useful as a way to print Clojure syntax, as opposed to writing a boat-load of statically linked C++. Its binary is around 1 MB. If someone uses an lcompiler with --show-asr --clojure, one might be willing to take the small hit at run time of loading an extra .so (.dylib) and executing some bytecode in a VM like janet's.

rebcabin commented 1 year ago

I did something similar to this in Prime Air, though I used femtolisp from the Julia creators rather than janet. The controllers for the Prime-Air robotic drones needed testing, big, heavy, long-running fuzz testing over the output strings of various sensor packages. The tests were written in femtolisp, which was realized as a .so. At test time, the embedded controller code would load the femtolisp .so and tell it to run the test scripts. The test scripts usually ran for 24 hours or more. They found bugs that would have crashed drone aircraft -- usually improper handling of garbled messages from sensors.

Ahmedkhalifa1999 commented 1 year ago

This looks interesting. I will definitely look into the idea of using Janet for the task. Right now, I am getting up to speed with Clojure, and it is fun. I fell in love with functional programming when I learned Scala 3 years ago, but never had the chance to use it since.

rebcabin commented 1 year ago

You get programmable types in Clojure via clojure.spec, so you will lose nothing and gain much by switching from Scala IMO :)

Ahmedkhalifa1999 commented 1 year ago

Sorry I was off for a few days; it was exams week and I had some assignments to finish. I finished the first 3 chapters of Clojure for the Brave and True, and feel like I have some grasp of Clojure now, since I already knew functional programming. I will start looking into the code now to understand how --show-asr works, while looking into Janet for embedding, and hopefully start working on implementing the --clojure option soon.

I looked into clojure.spec and I am wowed, to say the least. I am looking forward to continuing with Clojure, and hopefully finding some way to use it at work or for personal projects. Code generation, maybe? :)