sheey11 closed this issue 6 months ago
Hi, I'm not sure I understand all of that but it might be caused by a common issue where you want to apply one of many parsers and just accept the result of the first one that works.
It might be caused by this code block:
```ts
pSymbol.init(pSymbolSubSuperScript.pipe(
    or(
        pSymbolSubscript,
        pSymbolSuperSubScript,
        pSymbolSuperscript,
    )
))
```
Here, it's typical that a parser in the middle would succeed, but a previous parser has already consumed input and uses e.g. the `then` combinator. This has two effects:

- The `then` combinator, by default, expects the current parser to be valid. Technically, if some parser fails, it upgrades the failure to a hard failure (https://github.com/GregRos/parjs?tab=readme-ov-file#-hard-failure).

You might understand what is happening by temporarily using `.debug()` on the different parsers and re-running your parsing test (https://github.com/GregRos/parjs?tab=readme-ov-file#debugging).
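To make the soft/hard distinction concrete, here is a minimal hand-rolled sketch. It does not use parjs at all: the names `str`, `seq`, `alt` and `softenHard` are hypothetical stand-ins for the idea behind a literal parser, `then`, `or` and `recover`, and the real library will differ in its details.

```typescript
// Hypothetical mini-combinators for illustration only; NOT the parjs implementation.
type Result =
  | { kind: "OK"; value: string[]; pos: number }
  | { kind: "Soft" | "Hard"; pos: number };

type Parser = (input: string, pos: number) => Result;

// Matches a literal; a mismatch is a soft failure that consumes nothing.
const str = (text: string): Parser => (input, pos) =>
  input.startsWith(text, pos)
    ? { kind: "OK", value: [text], pos: pos + text.length }
    : { kind: "Soft", pos };

// Sequence: once `first` has succeeded, any failure of `second` becomes hard.
const seq = (first: Parser, second: Parser): Parser => (input, pos) => {
  const a = first(input, pos);
  if (a.kind !== "OK") return a;
  const b = second(input, a.pos);
  if (b.kind === "OK") {
    return { kind: "OK", value: [...a.value, ...b.value], pos: b.pos };
  }
  return { kind: "Hard", pos: b.pos };
};

// Alternation: moves on to the next parser only after a soft failure.
const alt = (...parsers: Parser[]): Parser => (input, pos) => {
  let last: Result = { kind: "Soft", pos };
  for (const p of parsers) {
    last = p(input, pos);
    if (last.kind !== "Soft") return last; // success or hard failure ends the search
  }
  return last;
};

// Downgrades a hard failure to a soft one so that `alt` may try the next option.
const softenHard = (p: Parser): Parser => (input, pos) => {
  const r = p(input, pos);
  return r.kind === "Hard" ? { kind: "Soft", pos } : r;
};

const fruit = seq(str("buy "), alt(str("apples"), str("bananas")));
const veg = seq(str("buy "), alt(str("carrots"), str("potatoes")));

// `fruit` consumes "buy " and then fails, so the failure is hard and `veg` is never tried.
const withoutRecovery = alt(fruit, veg)("buy carrots", 0);

// With the hard failure softened, `alt` restarts at position 0 and `veg` succeeds.
const withRecovery = alt(softenHard(fruit), veg)("buy carrots", 0);
```

In this sketch `withoutRecovery` ends as a hard failure while `withRecovery` succeeds, which is the same shape of problem as an `or` whose earlier alternative has already consumed input.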
What I think you want to do is use `recover` in this way:
```ts
import { string } from "parjs";
import { or, recover, then } from "parjs/combinators";

// `toBeSuccessful` is assumed to be a custom Jest matcher registered in the test setup.
it.only("can be used with backtracking", () => {
    // Create a silly example for a shopping list with two possible sections.
    // Notice that both start by parsing the same string. This is to show that
    // the `recover` combinator can be used to backtrack to the beginning of the section
    const fruitSection = string("Remember to buy ").pipe(
        then(string("apples").pipe(or(string("bananas"))))
    );
    const vegetablesSection = string("Remember to buy ").pipe(
        then(string("carrots").pipe(or(string("potatoes"))))
    );

    const shoppingList = fruitSection
        .pipe(
            // Downgrade the hard failure to a soft one so that `or` can try the next section
            recover(() => ({ kind: "Soft" })),
            or(vegetablesSection)
        )
        .debug();

    expect(shoppingList.parse("Remember to buy carrots")).toBeSuccessful([
        "Remember to buy ",
        "carrots"
    ]);
    expect(shoppingList.parse("Remember to buy potatoes")).toBeSuccessful([
        "Remember to buy ",
        "potatoes"
    ]);
});
```
Sorry for the late reply. I now understand why the error occurs and have completed my entire parser, many thanks!
But I encountered another error while testing `pExpression`:
ParserDefinitionError: manySepBy: The combinator 'manySepBy' expected one of its arguments to change the parser state.
I think it's caused by:

```ts
pExpression.init(
    pAnyExpr.pipe(
        manySepBy(whitespace())
    )
)
```
If I add the `maxIterations` argument to `manySepBy`, it works very well, but there shouldn't be a limit on the number of iterations.
Even if I pipe `thenq(eof())` and then parse the entire formula, this error still occurs.
That usually occurs when a parser succeeds in parsing the empty string (`""`). Looping over that will never finish, so the combinator simply crashes instead of going on forever.
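As a rough illustration of why a looping combinator has to guard against this, here is a small hand-rolled sketch. It is not parjs code: `digits0`, `digits1` and `repeat` are hypothetical names that only mimic the idea of a "zero or more" parser, an "at least one" parser, and a loop that refuses to spin without progress.

```typescript
// Hypothetical mini-parser for illustration only; not how parjs is implemented.
type Step = { ok: true; pos: number } | { ok: false };
type StepParser = (input: string, pos: number) => Step;

// Matches zero or more ASCII digits, so it also succeeds on the empty string.
const digits0: StepParser = (input, pos) => {
  let end = pos;
  while (end < input.length && input.charAt(end) >= "0" && input.charAt(end) <= "9") {
    end++;
  }
  return { ok: true, pos: end };
};

// Same as above, but fails unless at least one digit was consumed.
const digits1: StepParser = (input, pos) => {
  const r = digits0(input, pos);
  return r.ok && r.pos > pos ? r : { ok: false };
};

// Repeats `p` until it fails and returns the number of successful iterations.
// Throws if an iteration succeeds without moving the position, because that
// iteration would otherwise repeat forever.
const repeat = (p: StepParser) => (input: string): number => {
  let pos = 0;
  let count = 0;
  for (;;) {
    const r = p(input, pos);
    if (!r.ok) return count;
    if (r.pos === pos) {
      throw new Error("parser succeeded without consuming input");
    }
    pos = r.pos;
    count++;
  }
};
```

Here `repeat(digits1)` stops cleanly once the digits run out, whereas `repeat(digits0)` throws on the second iteration, because `digits0` keeps succeeding at the same position.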
I think in your case this can be solved with this:
```ts
const pSimpleSymbol: Parjser<SymbolComponent> = pValidOneChar
    .pipe(
        or(pEscape),
        mustCapture(),
        or(digit()),
        many1(), // changed to many1() from many()
        stringify(),
        map(symbol => ({ type: FormulaType.Symbol, text: symbol }))
    )
    .debug(); // symbol has no subscript or superscript
```
I was able to track that down with the following steps:

- added `.debug()` to `pSimpleSymbol`
- saw in the debug output that it succeeded in parsing the empty string

Here are some tests that I generated when debugging this (I generated these with Copilot and only resolved the first issue I saw):
```ts
// tests
// console.dir(pNumber.parse("1213"), { depth: null })
// console.dir(pSimpleSymbol.parse("good123\\^"), { depth: null })
// console.dir(pSymbol.parse("weight_1^2"), { depth: null })
// console.dir(pSymbol.parse("weight^2_1"), { depth: null })
// console.dir(pSymbol.parse("weight"), { depth: null })
// console.dir(pSymbol.parse("weight_1_1"), { depth: null }) // expects error
// console.dir(pSymbol.parse("weight_1"), { depth: null })
// console.dir(pSymbol.parse("weight^1"), { depth: null })
// console.dir(pSymbol.parse("xyz_{1}^2"), { depth: null })
// console.dir(pList.parse("[y_1 w_1, y_2]"), { depth: null })
// console.dir(pFunctionCall.parse("matrix([y_1, y_2]))"), { depth: null })
// console.dir(pExpression.parse("y_1 y_2"), { depth: null })
// console.dir(pFormula.parse("y_1 y_2"), { depth: null });
describe("Formula Parser", () => {
    it("should parse a number", () => {
        expect(pNumber.parse("1213")).toBeSuccessful
```
I think `many1` is not exported; I can't import it from `parjs/combinators`.
Should I open a PR to add `many1`?
Version 1.2.3 is now up on npm, can you test with that? It should have exported `many1`.
With 1.2.3 my parser works fine now. I really appreciate your help, thanks!
I am new to parser combinators and I am trying to write a parser for a LaTeX-like language. Examples:
matrix( [y_1], [y_2] ) = matrix([w_{11}, w_{21}], [ w_{12}, w_{22} ]) matrix([x_1], [x_2])
frac(roman(d) x, roman(d) y)
My code so far:
Constants and type definitions
```typescript
import { anyChar, anyCharOf, anyStringOf, digit, eof, float, noCharOf, regexp, string, whitespace } from 'parjs'
import { between, later, many, manySepBy, map, maybe, mustCapture, or, qthen, stringify, then, thenq } from 'parjs/combinators'

enum FormulaType {
  Number = "number",
  Symbol = "symbol",
  Operator = "operator",
  Matrix = "matrix",
  Fraction = "fraction",
  List = "list",
}

interface FormulaComponentCommon {
  type: FormulaType
}

interface NumberComponent extends FormulaComponentCommon {
  type: FormulaType.Number,
  num: number,
  padding?: boolean,
}

interface SymbolComponent extends FormulaComponentCommon {
  type: FormulaType.Symbol,
  text?: string,
  superscript?: string,
  subscript?: string,
  roman?: boolean,
}

interface OperatorComponent extends FormulaComponentCommon {
  type: FormulaType.Operator,
  text: string,
}

interface MatrixComponent extends FormulaComponentCommon {
  type: FormulaType.Matrix,
  h: number,
  w: number,
  elements: FormulaExpression[][]
}

interface ListComponent extends FormulaComponentCommon {
  type: FormulaType.List,
  elements: FormulaExpression[]
}

interface FractionComponent extends FormulaComponentCommon {
  type: FormulaType.Fraction,
  upper: FormulaExpression,
  lower: FormulaExpression,
}

type FormulaComponent = NumberComponent | SymbolComponent | OperatorComponent | MatrixComponent | ListComponent | FractionComponent

type FormulaExpression = FormulaComponent[]

const functions = [
  "matrix", "frac", "roman",
]

const escapes = [
  "\\", "{", "}", "(", ")", "[", "]", "_", "^",
]

const mathSymbols = [
  "alpha", "beta", "gamma", "epsilon", "ne", "ge", "le", "Delta", "partial", "int",
]
```

As defined, `pSymbol = pSymbolSuperSubScript.pipe(or(...))`, so if `pSymbolSuperSubScript` parses a string successfully, then `pSymbol` should also.