copecog / org.unaen.cl.lexer

A somewhat literal lexer to deeper learn me CL

When comes the scanner/lexer? #1

Open gwangjinkim opened 2 months ago

gwangjinkim commented 2 months ago

After I complete this, I want to write a parser for context free grammars that couples with this scanner so that I can generate scanner/lexer functions for regular expressions and perl-style regular expressions.

When will this be? Looking forward!

I find your code super organized, and I'm learning from it how to name things more effectively. Thanks!

copecog commented 2 months ago

I need to get back to it. I haven't been doing anything in CL for a while, and I never finished it. I think I would start again at this point as I don't really remember it and it looks like I made it overly complex. I'm a lifelong coding hobbyist and mostly do dev/ops, so I wouldn't take anything I did as good form, but I appreciate it.

Regards, Chris

gwangjinkim commented 2 months ago

For comparison, edicl/cl-ppcre, which is the gold standard for regex in CL, has an internal representation of regexes using Lisp expressions. Was your idea something similar? I'm curious what you had in mind as the final result before or while you started writing this.

copecog commented 2 months ago

Yeah, internal representation as a regular list expression or "reglex" (e.g. https://github.com/copecog/org.unaen.cl.lexer/blob/c4e276b2ecf9b629907c4144ee5ebb4f3696dab4/testing.lisp#L5 )

I wanted to write something that returned data structures representing the automata or state machines. Those could be used directly to tokenize an input; however, I also wanted to write something that could walk those data structures and generate pure functions. Then I could generate functions for standard regex and grammars, and languages could be specified in those, which could in turn generate functions to parse those languages, etc.
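
As a rough illustration of that plan, here is a minimal sketch in Python (not the code in this repository, and not CL). It assumes a hypothetical nested-tuple regex form with the operators "cat", "alt" and "star" and single-character literals. All names (`build_nfa`, `longest_match`, `compile_scanner`) are made up for this example. It builds an NFA as plain data, interprets that data directly, and then walks it once to produce a standalone scanning function.

```python
from itertools import count

def build_nfa(expr):
    """Thompson-style construction: nested-tuple regex -> NFA as plain data."""
    ids, edges = count(), {}  # edges: state -> [(symbol or None for epsilon, target)]

    def new_state():
        state = next(ids)
        edges[state] = []
        return state

    def build(e):
        start, end = new_state(), new_state()
        if isinstance(e, str):  # assumed to be a single literal character
            edges[start].append((e, end))
            return start, end
        op, *args = e
        parts = [build(a) for a in args]
        if op == "cat":  # chain the sub-automata with epsilon edges
            prev = start
            for s, t in parts:
                edges[prev].append((None, s))
                prev = t
            edges[prev].append((None, end))
        elif op == "alt":  # branch into every alternative
            for s, t in parts:
                edges[start].append((None, s))
                edges[t].append((None, end))
        elif op == "star":  # zero or more repetitions of one sub-automaton
            (s, t), = parts
            edges[start] += [(None, s), (None, end)]
            edges[t] += [(None, s), (None, end)]
        else:
            raise ValueError(f"unknown operator: {op!r}")
        return start, end

    start, accept = build(expr)
    return {"start": start, "accept": accept, "edges": edges}

def closure(nfa, states):
    """All states reachable from `states` through epsilon edges only."""
    stack, seen = list(states), set(states)
    while stack:
        for sym, target in nfa["edges"][stack.pop()]:
            if sym is None and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def longest_match(nfa, text, pos=0):
    """Interpret the NFA data directly; length of the longest match at pos, or -1."""
    current = closure(nfa, {nfa["start"]})
    best = 0 if nfa["accept"] in current else -1
    for i in range(pos, len(text)):
        moved = {t for s in current for sym, t in nfa["edges"][s] if sym == text[i]}
        if not moved:
            break
        current = closure(nfa, moved)
        if nfa["accept"] in current:
            best = i - pos + 1
    return best

def compile_scanner(nfa):
    """Walk the NFA once (subset construction) and return a standalone function."""
    start = frozenset(closure(nfa, {nfa["start"]}))
    table, accepting, todo = {}, set(), [start]
    while todo:
        state = todo.pop()
        if state in table:
            continue
        table[state] = {}
        if nfa["accept"] in state:
            accepting.add(state)
        symbols = {sym for s in state for sym, _ in nfa["edges"][s] if sym is not None}
        for sym in symbols:
            moved = {t for s in state for x, t in nfa["edges"][s] if x == sym}
            target = frozenset(closure(nfa, moved))
            table[state][sym] = target
            todo.append(target)

    def scan(text, pos=0):
        # Uses only the precomputed table; the NFA is no longer consulted.
        state, best = start, (0 if start in accepting else -1)
        for i in range(pos, len(text)):
            state = table[state].get(text[i])
            if state is None:
                break
            if state in accepting:
                best = i - pos + 1
        return best

    return scan

# a(b|c)* written as a nested-tuple expression
nfa = build_nfa(("cat", "a", ("star", ("alt", "b", "c"))))
scan = compile_scanner(nfa)
```

Here `longest_match(nfa, "abcbd")` and `scan("abcbd")` should agree on the length of the longest prefix matched. The first walks the data structure on every call, while the second only consults a table built ahead of time, which is the closest Python analogue here to generating a function from the automaton.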