smlnj / legacy

This project is the old version of Standard ML of New Jersey that continues to support older systems (e.g., 32-bit machines).

Regex results do not match nesting of groups in `ThompsonEngine` #301

Open Skyb0rg007 opened 3 months ago

Skyb0rg007 commented 3 months ago

Version

110.99.4 (Latest)

Operating System

OS Version

No response

Processor

System Component

SML/NJ Library

Severity

Minor

Description

The documentation for the MatchTree structure states:

The tree structure corresponds to the nesting of groups in the regular expression.

However, this does not seem to be implemented in `ThompsonEngine`: the match it returns has no subtrees for the groups.

Transcript

- val re = Option.valOf (StringCvt.scanString AwkSyntax.scan "(a(b)a)c");
val re = Concat [Group (Alt [Concat [Char #"a", Group (Alt [#]), Char #"a"]]), Char #"c"] : syntax
- StringCvt.scanString (ThompsonEngine.find (ThompsonEngine.compile re)) "abac";
val it = SOME (Match ({len=4,pos=0}, [])) : StringCvt.cs match option

Expected Behavior

- val re = Option.valOf (StringCvt.scanString AwkSyntax.scan "(a(b)a)c");
val re = Concat [Group (Alt [Concat [Char #"a", Group (Alt [#]), Char #"a"]]), Char #"c"] : syntax
- StringCvt.scanString (ThompsonEngine.find (ThompsonEngine.compile re)) "abac";
val it = SOME (Match ({len=4,pos=0}, [Match ({len=3,pos=0}, [Match ({len=1,pos=1}, [])])])) : StringCvt.cs match option
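To make the difference between the two results concrete, here is a minimal, self-contained sketch. It assumes a local stand-in datatype shaped like the `Match` constructor shown in the transcript, and it uses plain `int` positions instead of the library's stream positions. The names `span`, `expected`, `actual`, `depth`, and `size` are hypothetical and not part of the SML/NJ Library.

```sml
(* Local stand-in for the library's match tree, so the sketch runs on its own.
   Assumption: each node carries a payload and a list of subtrees, as in the
   transcript output above. *)
datatype 'a match_tree = Match of 'a * 'a match_tree list

(* Simplification: positions are ints here, not stream positions. *)
type span = {pos : int, len : int}

(* The tree the report expects for "(a(b)a)c" against "abac":
   whole match, then group 1 "aba", then nested group 2 "b". *)
val expected : span match_tree =
  Match ({pos = 0, len = 4},
    [Match ({pos = 0, len = 3},
      [Match ({pos = 1, len = 1}, [])])])

(* The tree shown in the transcript: root only, no group subtrees. *)
val actual : span match_tree = Match ({pos = 0, len = 4}, [])

(* Nesting depth: 1 for a root with no children. *)
fun depth (Match (_, kids)) =
  1 + List.foldl Int.max 0 (List.map depth kids)

(* Node count: the whole match plus every captured group. *)
fun size (Match (_, kids)) =
  1 + List.foldl (op +) 0 (List.map size kids)
```

With these definitions, `depth expected` is 3 and `depth actual` is 1, which is one way to check whether an engine reports group nesting at all.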

Steps to Reproduce

See transcript

Additional Information

I'm not sure if the regexp module is used, since the BackTrackEngine doesn't seem to be maintained. If this is the intended behavior, I would just like to see the expected behavior documented properly.

Email address

skyler.soss@gmail.com

JohnReppy commented 3 months ago

Currently, only the BackTrackEngine structure provides support for groups. I'm converting this to an enhancement request.

dmacqueen commented 3 months ago

As someone who has not used the RegExp library, I don’t have much to contribute. But I am certainly open to volunteer contributors adding functionality and improving or augmenting the documentation of the libraries. In other words, I think smlnj-lib should be treated as “open source” in the broad sense (read/write rather than read only). I also think it is good to have an “owner” or “editor” who is responsible for quality control of contributions to particular libraries.

Dave

On Mar 5, 2024, at 3:40 PM, John Reppy @.***> wrote:

Currently, only the BackTrackEngine structure provides support for groups. I'm converting this to an enhancement request.
