trueagi-io / hyperon-experimental

MeTTa programming language implementation
https://metta-lang.dev
MIT License

String parsing with new lines is broken #780

Closed Necr0x0Der closed 1 month ago

Necr0x0Der commented 1 month ago

Describe the bug: After this PR: https://github.com/trueagi-io/hyperon-experimental/pull/777/ the unit tests in metta-motto broke.

To Reproduce: Run the test in metta-motto:

!(assertEqual
   ((echo-agent)
    (system "Ping") (user "Pong"))
"system Ping
user Pong")

While echo-agent returns

"system Ping
user Pong"

which is composed as '\n'.join(["system Ping", "user Pong"]), it cannot be matched with the string formed by parsing in MeTTa

"system Ping
user Pong"

This:

!(assertEqual
   ((echo-agent)
    (system "Ping") (user "Pong"))
"system Ping\nuser Pong")

also doesn't work.

Expected behavior: There should be a way to specify multi-line strings in MeTTa scripts that can be matched against the corresponding strings formed in Python.

Necr0x0Der commented 1 month ago

The issue seems to be that

"system Ping
user Pong"

is not converted into a Python string by

r"^\".*\"$": lambda token: ValueAtom(str(token[1:-1]), 'String'),

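and is instead parsed as a Rust string. An attempt to write:

from hyperon import MeTTa

m = MeTTa()
# Run a program whose string literal spans two lines
x = m.run('''
! "A
B"
''')
# Compare the grounded string's content with the expected Python string
print(x[0][0].get_object().content == '''A
B''')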

results in

Cannot get_object of unsupported non-C "A
B"

Basically, the corresponding regex in Python is not invoked in this case, whereas if we restore the old one

"\"[^\"]*\"": lambda token: ValueAtom(str(token[1:-1]), 'String')

it works here.
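The difference between the two patterns can be reproduced outside MeTTa. The following is only a sketch: it uses Python's standard `re` module as a stand-in for whatever engine the tokenizer actually uses (an assumption), and the variable names are made up for illustration. The two patterns themselves are copied from the comment above.

```python
import re

# Patterns copied from the thread; the names are hypothetical.
NEW_PATTERN = r"^\".*\"$"      # introduced by the PR
OLD_PATTERN = "\"[^\"]*\""     # previous pattern

# A quoted token containing a real newline, like "A<newline>B".
multi_line_token = '"A\nB"'
single_line_token = '"A B"'

# '.' does not match '\n' by default, so the new pattern rejects the token.
new_on_multi = re.fullmatch(NEW_PATTERN, multi_line_token)

# The negated class [^"] matches any character except a quote,
# newlines included, so the old pattern accepts the token.
old_on_multi = re.fullmatch(OLD_PATTERN, multi_line_token)

# Both patterns accept a string without a newline.
new_on_single = re.fullmatch(NEW_PATTERN, single_line_token)
old_on_single = re.fullmatch(OLD_PATTERN, single_line_token)

print("new pattern, multi-line:", new_on_multi is not None)
print("old pattern, multi-line:", old_on_multi is not None)
print("new pattern, single-line:", new_on_single is not None)
print("old pattern, single-line:", old_on_single is not None)
```

If the tokenizer behaves the same way, a multi-line literal never reaches the String lambda, which would be consistent with the `get_object` error shown above.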

vsbogd commented 1 month ago

Interesting, probably, because $ matches end of line.

vsbogd commented 1 month ago

First guess was correct: . inside the regex doesn't match \n, and because of this te\nst is parsed as a symbol atom. Fixed in #783.
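For reference, here is a small sketch of the `.` behaviour described above, again using Python's `re` module as a stand-in for the tokenizer's regex engine (an assumption). It also shows one generic way to make `.` match newlines, the "dot matches all" flag. This is not necessarily what #783 changed.

```python
import re

PATTERN = r"^\".*\"$"

# A quoted token with a real newline inside, as in the te\nst example.
token = '"te\nst"'

# Default behaviour: '.' stops at '\n', so the pattern does not match.
default_match = re.match(PATTERN, token)

# With DOTALL, '.' also matches '\n' and the whole token is accepted.
dotall_match = re.match(PATTERN, token, flags=re.DOTALL)

# The same flag written inline as (?s), which is handy when only the
# pattern string (and no flags argument) can be passed along.
inline_match = re.match(r"(?s)^\".*\"$", token)

print("default:", default_match is not None)
print("DOTALL:", dotall_match is not None)
print("inline (?s):", inline_match is not None)
```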