Aunsiels / pyformlang

A python library to manipulate formal languages and various automata
https://pypi.org/project/pyformlang/
MIT License

Trouble with Regex #4

Closed SergeyKuz1001 closed 3 years ago

SergeyKuz1001 commented 3 years ago

Hello,

When I was using regular expressions in my program, I found some unusual behavior.

For example, I can write this:

Python 3.9.0 (default, Nov 18 2020, 13:28:38) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyformlang.regular_expression import Regex
>>> Regex('a a | a')
((a.a)|a)
>>> Regex('a . a | a')
((a.a)|a)
>>> Regex('a* a | a')
(((a)*.a)|a)

So I conclude that the . operator (concatenation) has higher precedence than the | operator (union). But when * appears directly before |, the precedence of these operators changes:

>>> Regex('a a* | a')
(a.((a)*|a))
>>> Regex('a . a* | a')
(a.((a)*|a))
>>> Regex('a . a* | (a)')
(a.((a)*|a))
>>> Regex('a* a* | a')
((a)*.((a)*|a))

I can only obtain the expression ((a.(a)*)|a) by using parentheses:

>>> Regex('(a a*) | a')
((a.(a)*)|a)
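
For reference, the grouping expected here follows the usual convention that * binds tighter than concatenation, which binds tighter than |. The block below is a minimal sketch of a recursive-descent parser that applies that convention and prints the result in the same parenthesised style as the outputs above. It is not pyformlang's implementation: the function name render_with_conventional_precedence is hypothetical, and it assumes single-character symbols only, which is narrower than what the library accepts.

```python
def render_with_conventional_precedence(expr: str) -> str:
    """Parse a small regex and return its fully parenthesised form.

    Hypothetical helper, not part of pyformlang. Assumes every symbol is a
    single character; whitespace is ignored, "." is optional concatenation.
    """
    tokens = [ch for ch in expr if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_union():
        # Lowest precedence: concat ('|' concat)*
        left = parse_concat()
        while peek() == "|":
            take()
            right = parse_concat()
            left = f"({left}|{right})"
        return left

    def parse_concat():
        # Middle precedence: star (('.')? star)*
        left = parse_star()
        while peek() is not None and peek() not in "|)":
            if peek() == ".":
                take()  # explicit concatenation operator
            right = parse_star()
            left = f"({left}.{right})"
        return left

    def parse_star():
        # Highest precedence: atom ('*')*
        node = parse_atom()
        while peek() == "*":
            take()
            node = f"({node})*"
        return node

    def parse_atom():
        tok = peek()
        if tok is None or tok in "|.*)":
            raise ValueError(f"unexpected token {tok!r} at position {pos}")
        take()
        if tok == "(":
            inner = parse_union()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return inner
        return tok

    result = parse_union()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return result


for example in ["a a | a", "a* a | a", "a a* | a", "a . a* | a", "(a a*) | a"]:
    print(example, "->", render_with_conventional_precedence(example))
```

Under this convention, 'a a* | a' and '(a a*) | a' produce the same grouping, which is the behaviour the report expects from Regex.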

And the error isn't just in how the expression is printed; matching is affected too:

>>> Regex('b a* | a').accepts('b')
True
>>> Regex('b a* | a').accepts('a')
False
>>> Regex('(b a*) | a').accepts('a')
True
>>> 
Aunsiels commented 3 years ago

Hi,

Thank you for noticing that! I fixed the problem. Now:

Regex('b a* | a').accepts('b')
# True
Regex('b a* | a').accepts('a')
# True
Regex('(b a*) | a').accepts('a')
# True
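
As an independent cross-check of the results above, the same expressions can be run through Python's standard re module, which uses the conventional precedence. This is only a sketch: the helpers to_re_pattern and accepts are hypothetical, and the translation (drop whitespace and standalone "." concatenation tokens) is an assumption that covers the simple expressions in this thread, not pyformlang's full syntax.

```python
import re


def to_re_pattern(expr: str) -> str:
    # Hypothetical translation: pyformlang-style input separates symbols with
    # spaces and uses "." for concatenation, whereas re concatenates by
    # juxtaposition and treats "." as a wildcard, so both are dropped here.
    tokens = [tok for tok in expr.split() if tok != "."]
    return "".join(tokens)


def accepts(expr: str, word: str) -> bool:
    # fullmatch requires the whole word to match, like Regex.accepts above.
    return re.fullmatch(to_re_pattern(expr), word) is not None


for expr, word in [("b a* | a", "b"), ("b a* | a", "a"), ("(b a*) | a", "a")]:
    print(repr(expr), repr(word), accepts(expr, word))
```

All three checks agree with the fixed output: with conventional precedence, 'b a* | a' accepts both 'b' and 'a'.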