caleb531 / automata

A Python library for simulating finite automata, pushdown automata, and Turing machines
https://caleb531.github.io/automata/
MIT License

Allow non-alphanumeric literal characters in regular expressions #115

Open caleb531 opened 1 year ago

caleb531 commented 1 year ago

@eliotwrobson Per your brief comment from #112:

Currently, only alphanumeric characters are supported in the regex parsing here (which I'm now realizing is kind of a breaking change from v6, whoops). But I think some way of adding escape characters could be useful to people. I think it's just a matter of reconfiguring the lexer to treat characters coming after a backslash as a literal. But I think it should definitely be in a separate PR.

I think it would be helpful to allow non-alphanumeric characters in a regex, such that you could create a regex for an email address, @username, etc. This enhancement would imply that you can also escape symbols to be literal characters.
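For illustration, here is a minimal sketch of the backslash-escape idea from the quoted comment. It is not the library's actual lexer: `TOKEN_PATTERN`, `tokenize`, and the set of operator characters are all assumptions made up for this example. The escaped branch comes first in the alternation, so a backslash and the character after it are consumed together as one literal.

```python
import re

# Hypothetical token pattern, NOT the automata library's real lexer.
# Alternatives are tried left to right, so an escaped character is
# consumed as a literal before the operator branch can see it.
TOKEN_PATTERN = re.compile(
    r"\\(?P<escaped>.)"            # backslash + any character: literal
    r"|(?P<operator>[()|*+?&])"    # unescaped operators (assumed set)
    r"|(?P<literal>[^\\])",        # any other single character: literal
    re.DOTALL,
)


def tokenize(regex):
    """Split a regex string into (kind, character) pairs."""
    tokens = []
    pos = 0
    while pos < len(regex):
        match = TOKEN_PATTERN.match(regex, pos)
        if match is None:
            # Only reachable for a backslash with nothing after it.
            raise ValueError(f"dangling backslash at position {pos}")
        kind = match.lastgroup
        char = match.group(kind)
        # Escaped characters become ordinary literals.
        tokens.append(("operator" if kind == "operator" else "literal", char))
        pos = match.end()
    return tokens


print(tokenize(r"@user\*"))
```

With something like this, `@` and other non-alphanumeric characters pass through as literals, and `\*` yields a literal asterisk rather than the Kleene star.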

eliotwrobson commented 1 year ago

I think this is a reasonable change to make. All it should require is a change to the regular expressions given to the lexer for each token type. You'll have to prevent them from matching characters that have a backslash in front.

Also, the regular expression used for the LiteralToken needs to be modified to account for this.
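As a rough sketch of what those two pattern changes might look like with Python's `re` module: the names `STAR_PATTERN` and `LITERAL_PATTERN` and the operator set are hypothetical, not the library's real token definitions. The operator pattern uses a negative lookbehind to skip characters preceded by a backslash, and the literal pattern accepts either an escaped character or any non-operator character.

```python
import re

# Hypothetical per-token patterns; names and operator set are assumptions.
# '*' counts as an operator only when no backslash precedes it.
STAR_PATTERN = re.compile(r"(?<!\\)\*")
# A literal is an escaped character, or any character that is
# neither a backslash nor one of the (assumed) operators.
LITERAL_PATTERN = re.compile(r"\\.|[^\\()|*+?&]")

print(STAR_PATTERN.findall(r"a*"))          # unescaped star is found
print(STAR_PATTERN.findall(r"a\*"))         # escaped star is skipped
print(LITERAL_PATTERN.findall(r"a\*@b"))    # '@' and '\*' are literals
```

One caveat with the lookbehind approach: it only inspects the single preceding character, so for an escaped backslash followed by a star (`\\*`) it would wrongly skip the star. Consuming a backslash together with the character after it as one token avoids that case.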

EDIT: @caleb531 if you want to attack this, this might go well along with the initial refactor I referenced in #109. I have exams coming up and I'm not a huge regex expert 😅

eliotwrobson commented 1 year ago

@caleb531 looking over this again (since it is on the v8 milestones), I'm not knowledgeable enough about regex syntax to make this work as expected. This requires messing with the regular expressions used for lexing. I certainly think it would be nice to include this with v8, and I can help if anyone wants to take a crack at this, but I can't craft the expressions needed myself 😢

caleb531 commented 1 year ago

@eliotwrobson That's fine! I'm not that committed to having this be part of v8, so I've removed the milestone designation. I wouldn't want this to hold up the Jupyter integration's debut in the coming v8.

leonbett commented 10 months ago

Hi! I'd be interested in this feature. [works with 6.0.0 though, which is nice!]

eliotwrobson commented 10 months ago

@leonbett thanks for indicating interest! I would definitely accept, and potentially even collaborate on, a PR that added this feature, but I'm not knowledgeable enough about regexes to write it myself.