Open EliasC opened 11 months ago
So I would write that as
In(X) * T(Y) << ((!T(Z))++ * T(Z) * End)
The current implementation does not backtrack the | or ++ matching based on the continuation. For some of the optimisations I applied, I actually carry the continuation around, so that could be possible to do.
There is another similar case:
((T(X) * T(Y)) | T(X)) * T(Z)
This matches XYZ or XZ. But
(T(X) | (T(X) * T(Y))) * T(Z)
only matches XZ.
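To make the difference concrete, here is a minimal sketch of a matcher that commits to the first alternative that succeeds and never revisits it. It is a toy model over token names, not Trieste's implementation: `tok`, `seq`, `alt`, `matches`, `long_first`, and `short_first` are hypothetical helpers standing in for `T`, `*`, `|`, and the two patterns above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Sentinel returned when a pattern fails to match.
constexpr std::size_t kFail = static_cast<std::size_t>(-1);

// A pattern consumes tokens starting at `pos` and returns the position
// after the match, or kFail.
using Pattern = std::function<std::size_t(const Tokens&, std::size_t)>;

// Stand-in for T(name): match exactly one token with that name.
Pattern tok(std::string name) {
  return [name](const Tokens& ts, std::size_t pos) {
    return (pos < ts.size() && ts[pos] == name) ? pos + 1 : kFail;
  };
}

// Stand-in for a * b: run `a`, then `b` from wherever `a` stopped.
Pattern seq(Pattern a, Pattern b) {
  return [a, b](const Tokens& ts, std::size_t pos) {
    std::size_t mid = a(ts, pos);
    return mid == kFail ? kFail : b(ts, mid);
  };
}

// Stand-in for a | b: commit to the first alternative that succeeds.
// If whatever follows the choice fails, `b` is never tried.
Pattern alt(Pattern a, Pattern b) {
  return [a, b](const Tokens& ts, std::size_t pos) {
    std::size_t r = a(ts, pos);
    return r != kFail ? r : b(ts, pos);
  };
}

// True if the pattern matches a prefix of the token list.
bool matches(const Pattern& p, const Tokens& ts) {
  return p(ts, 0) != kFail;
}

// ((T(X) * T(Y)) | T(X)) * T(Z)
Pattern long_first() {
  return seq(alt(seq(tok("X"), tok("Y")), tok("X")), tok("Z"));
}

// (T(X) | (T(X) * T(Y))) * T(Z)
Pattern short_first() {
  return seq(alt(tok("X"), seq(tok("X"), tok("Y"))), tok("Z"));
}

void check_examples() {
  assert(matches(long_first(), {"X", "Y", "Z"}));
  assert(matches(long_first(), {"X", "Z"}));
  assert(matches(short_first(), {"X", "Z"}));
  // T(X) wins the choice, then T(Z) sees Y and the whole match fails.
  assert(!matches(short_first(), {"X", "Y", "Z"}));
}
```

In this model the order of the alternatives matters, so putting the longer alternative first is the practical workaround.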
Are these optimisations of the Trieste code, or optimisations written in the client code?
Trieste code optimisations. Previously, a pattern like
T(A) * T(B) * T(C)
would result in four dynamic dispatch calls before it even checks whether there is a T(A). After CPS-converting the pattern, that becomes one call, and the continuation is a dynamic dispatch as well.
E.g. here we set_continuation
https://github.com/microsoft/Trieste/blob/0b2ec3c5ea7bf52b0f6b2ed9e4f9d571aba47a8b/include/trieste/rewrite.h#L957
This could form the basis for better backtracking, but it would need care given the other optimisations.
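As a rough illustration of why carrying the continuation around helps, here is a sketch of a continuation-passing matcher in which a choice hands the rest of the pattern to each alternative in turn. This is a toy model under my own assumptions, not the code in rewrite.h: `Cont`, `tok`, `seq`, `alt`, `matches`, and `short_first` are hypothetical names, and none of the dispatch optimisations are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Continuation: "given the position reached so far, does the rest match?"
using Cont = std::function<bool(std::size_t)>;

// CPS pattern: match at `pos`, then hand the new position to `k`.
using Pattern = std::function<bool(const Tokens&, std::size_t, const Cont&)>;

// Stand-in for T(name): match one token, then run the continuation.
Pattern tok(std::string name) {
  return [name](const Tokens& ts, std::size_t pos, const Cont& k) {
    return pos < ts.size() && ts[pos] == name && k(pos + 1);
  };
}

// Stand-in for a * b: `b` followed by `k` becomes the continuation of `a`.
Pattern seq(Pattern a, Pattern b) {
  return [a, b](const Tokens& ts, std::size_t pos, const Cont& k) {
    return a(ts, pos, [&](std::size_t mid) { return b(ts, mid, k); });
  };
}

// Stand-in for a | b: each alternative is tried together with the
// continuation, so a failure later in the pattern falls back to `b`.
Pattern alt(Pattern a, Pattern b) {
  return [a, b](const Tokens& ts, std::size_t pos, const Cont& k) {
    return a(ts, pos, k) || b(ts, pos, k);
  };
}

// True if the pattern matches a prefix of the token list.
bool matches(const Pattern& p, const Tokens& ts) {
  return p(ts, 0, [](std::size_t) { return true; });
}

// (T(X) | (T(X) * T(Y))) * T(Z)
Pattern short_first() {
  return seq(alt(tok("X"), seq(tok("X"), tok("Y"))), tok("Z"));
}

void check_examples() {
  // With the continuation available inside the choice, both inputs match.
  assert(matches(short_first(), {"X", "Z"}));
  assert(matches(short_first(), {"X", "Y", "Z"}));
  assert(!matches(short_first(), {"X", "Y"}));
}
```

The cost of this style is that a failing continuation may be re-run once per alternative, which is presumably part of the care needed around the existing optimisations.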
I was expecting the pattern `Any++ * T(Bar)` to match the sequence `Foo Foo Foo Bar`, just as the regular expression `.*b` matches `aaab`, but it seems like the `Any++` matches all of `Foo Foo Foo Bar`, meaning the whole pattern fails to match (since there is no `Bar` after that sequence). I don't know if this is by design, nor what the consequences would be of making it more like the Kleene star, but I thought I would bring it up for discussion.

My use case was for matching a `Y` inside an `X`, where the last child of `Y` is a `Z` (for reasonable values of `X`, `Y`, and `Z`):

If I didn't care about both the `X` and the `Y` I could just have `T(Z) * End`, but now it's important that the `Y` is in an `X` and has the children specified. In this particular case I can do `(!T(Z))++ * T(Z)`, but in general this might not be satisfactory.
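For comparison, here is a small sketch of the three behaviours discussed here, written as plain functions over token names rather than Trieste patterns. All function names are hypothetical, each function models the pattern followed by `End`, and I am assuming that `!T(Z)` consumes exactly one node that is not a `Z`. The last case shows one way the workaround can fall short: it stops at the first `Z`, so it fails when a `Z` also appears earlier among the children.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// The regular expression from the example: `.*` gives characters back
// until the trailing `b` can match.
bool regex_dot_star_b(const std::string& s) {
  static const std::regex re(".*b");
  return std::regex_match(s, re);
}

// Toy model of `Any++ * T(last) * End` where Any++ never gives tokens back.
bool greedy_any_then_last(const Tokens& ts, const std::string& last) {
  std::size_t pos = 0;
  while (pos < ts.size()) ++pos;  // Any accepts every token, so all are taken
  return pos < ts.size() && ts[pos] == last && pos + 1 == ts.size();
}

// Same pattern, but Any++ gives back one token at a time until the
// remainder `T(last) * End` matches.
bool backtracking_any_then_last(const Tokens& ts, const std::string& last) {
  for (std::size_t given_back = 0; given_back <= ts.size(); ++given_back) {
    std::size_t pos = ts.size() - given_back;
    if (pos < ts.size() && ts[pos] == last && pos + 1 == ts.size()) return true;
  }
  return false;
}

// Toy model of `(!T(last))++ * T(last) * End`: skip tokens that are not
// `last`, then require `last` as the final token.
bool not_last_rep_then_last(const Tokens& ts, const std::string& last) {
  std::size_t pos = 0;
  while (pos < ts.size() && ts[pos] != last) ++pos;
  return pos < ts.size() && ts[pos] == last && pos + 1 == ts.size();
}

void check_examples() {
  const Tokens foos{"Foo", "Foo", "Foo", "Bar"};
  assert(regex_dot_star_b("aaab"));
  assert(!greedy_any_then_last(foos, "Bar"));
  assert(backtracking_any_then_last(foos, "Bar"));
  assert(not_last_rep_then_last(foos, "Bar"));

  // An earlier Bar stops the workaround too soon.
  const Tokens repeated{"Bar", "Foo", "Bar"};
  assert(backtracking_any_then_last(repeated, "Bar"));
  assert(!not_last_rep_then_last(repeated, "Bar"));
}
```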