Arcsecond is a zero-dependency, Fantasy Land compliant JavaScript Parser Combinator library largely inspired by Haskell's Parsec.
The arcsecond-binary peer library includes parsers specifically for working with binary data.
Since version 2.0.0, the release notes track changes to arcsecond.
npm i arcsecond
The tutorials provide a practical introduction to many of the concepts in arcsecond, starting from the most basic foundations and working up to more complex topics.
You can use ES6 imports or CommonJS requires.
const {char} = require('arcsecond');
const parsingResult = char('a').fork(
// The string to parse
'abc123',
// The error handler (you can also return from this function!)
(error, parsingState) => {
const e = new Error(error);
e.parsingState = parsingState;
throw e;
},
// The success handler
(result, parsingState) => {
console.log(`Result: ${result}`);
return result;
}
);
git clone git@github.com:francisrstokes/arcsecond.git
cd arcsecond
npm i
# json example
node -r esm examples/json/json.js
The examples are built as ES6 modules, so node must be launched with the -r esm require flag, which allows import and export statements to be used.
.run is a method on every parser. It takes input (which may be a string, TypedArray, ArrayBuffer, or DataView) and returns the result of parsing the input using the parser.
Example
str('hello').run('hello')
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
The .fork method is similar to .run. It takes input (which may be a string, TypedArray, ArrayBuffer, or DataView), an error transforming function and a success transforming function, and parses the input. If parsing was successful, the result is transformed using the success transforming function and returned. If parsing was not successful, the result is transformed using the error transforming function and returned.
Example
str('hello').fork(
'hello',
(errorMsg, parsingState) => {
console.log(errorMsg);
console.log(parsingState);
return "goodbye"
},
(result, parsingState) => {
console.log(parsingState);
return result;
}
);
// [console.log] Object {isError: false, error: null, target: "hello", data: null, index: 5, …}
// -> "hello"
str('hello').fork(
'farewell',
(errorMsg, parsingState) => {
console.log(errorMsg);
console.log(parsingState);
return "goodbye"
},
(result, parsingState) => {
console.log(parsingState);
return result;
}
);
// [console.log] ParseError (position 0): Expecting string 'hello', got 'farew...'
// [console.log] Object {isError: true, error: "ParseError (position 0): Expecting string 'hello',…", target: "farewell", data: null, index: 0, …}
// "goodbye"
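The relationship between .run and .fork can be sketched without the library. The helper below, forkResult, is hypothetical and not part of arcsecond; it assumes only the result shape shown in the examples above and dispatches to one of two handlers in the way the description of .fork suggests.

```javascript
// Hypothetical helper, not part of arcsecond: picks a handler based on isError.
const forkResult = (state, onError, onSuccess) =>
  state.isError
    ? onError(state.error, state)
    : onSuccess(state.result, state);

// Plain objects shaped like the results documented for .run
const succeeded = { isError: false, result: 'hello', index: 5, data: null };
const failed = { isError: true, error: 'Expecting string', index: 0, data: null };

const fromSuccess = forkResult(succeeded, () => 'goodbye', result => result);
const fromFailure = forkResult(failed, () => 'goodbye', result => result);

console.log(fromSuccess); // hello
console.log(fromFailure); // goodbye
```

Because both handlers return a value, the caller always receives something usable, whichever branch was taken.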
.map
takes a function and returns a parser that does not consume input, but instead runs the provided function on the last matched value and sets the return value as the new last matched value. This method can be used to apply structure or transform the values as they are being parsed.
Example
const newParser = letters.map(x => ({
matchType: 'string',
value: x
}));
newParser.run('hello world')
// -> {
// isError: false,
// result: {
// matchType: "string",
// value: "hello"
// },
// index: 5,
// data: null
// }
.chain
takes a function which receives the last matched value and should return a parser. That parser is then used to parse the following input, forming a chain of parsers based on previous input. .chain is the fundamental way of creating contextual parsers.
Example
const lettersThenSpace = sequenceOf([
letters,
char(' ')
]).map(x => x[0]);
const newParser = lettersThenSpace.chain(matchedValue => {
switch (matchedValue) {
case 'number': return digits;
case 'string': return letters;
case 'bracketed': return sequenceOf([
char('('),
letters,
char(')')
]).map(values => values[1]);
default: return fail('Unrecognised input type');
}
});
newParser.run('string Hello')
// -> {
// isError: false,
// result: "Hello",
// index: 12,
// data: null
// }
newParser.run('number 42')
// -> {
// isError: false,
// result: "42",
// index: 9,
// data: null
// }
newParser.run('bracketed (arcsecond)')
// -> {
// isError: false,
// result: "arcsecond",
// index: 21,
// data: null
// }
newParser.run('nope nothing')
// -> {
// isError: true,
// error: "Unrecognised input type",
// index: 5,
// data: null
// }
.mapFromData is almost the same as .map, except that the function it is passed also has access to the internal state data, and can thus transform values based on this data.
Example
const parserWithData = withData(letters.mapFromData(({result, data}) => ({
matchedValueWas: result,
internalDataWas: data
})));
parserWithData(42).run('hello');
// -> {
// isError: false,
// result: {
// matchedValueWas: "hello",
// internalDataWas: 42
// },
// index: 5,
// data: 42
// }
.chainFromData is almost the same as .chain, except that the function it is passed also has access to the internal state data, and can choose how parsing continues based on this data.
Example
const lettersThenSpace = sequenceOf([
letters,
char(' ')
]).map(x => x[0]);
const parser = withData(lettersThenSpace.chainFromData(({result, data}) => {
if (data.bypassNormalApproach) {
return digits;
}
return letters;
}));
parser({ bypassNormalApproach: false }).run('hello world');
// -> {
// isError: false,
// result: "world",
// index: 11,
// data: { bypassNormalApproach: false }
// }
parser({ bypassNormalApproach: true }).run('hello world');
// -> {
// isError: true,
// error: "ParseError (position 6): Expecting digits",
// index: 6,
// data: { bypassNormalApproach: true }
// }
.errorMap is like .map, but it transforms the error value. The function passed to .errorMap gets an object containing the current error message (error), the index (index) that parsing stopped at, and the data (data) from this parsing session.
Example
const newParser = letters.errorMap(({error, index}) => `Old message was: [${error}] @ index ${index}`);
newParser.run('1234')
// -> {
// isError: true,
// error: "Old message was: [ParseError (position 0): Expecting letters] @ index 0",
// index: 0,
// data: null
// }
.errorChain is almost the same as .chain, except that it only runs if there is an error in the parsing state. This is useful either when trying to recover from errors or when a more specific error message should be constructed.
Example
const parser = digits.errorChain(({error, index, data}) => {
console.log('Recovering...');
return letters;
});
parser.run('42');
// -> {
// isError: false,
// result: "42",
// index: 2,
// data: null
// }
parser.run('hello');
// [console.log] Recovering...
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
parser.run('');
// [console.log] Recovering...
// -> {
// isError: true,
// error: "ParseError (position 0): Expecting letters",
// index: 0,
// data: null
// }
setData
takes anything that should be set as the internal state data, and returns a parser that will perform that side effect when the parser is run. This does not consume any input. If parsing is currently in an errored state, then the data will not be set.
Example
const parser = coroutine(run=> {
const name = run(letters);
if (name === 'Jim') {
run(setData('The name is Jim'));
}
return name;
});
parser.run('Jim');
// -> {
// isError: false,
// result: "Jim",
// index: 3,
// data: "The name is Jim"
// }
When dealing with more complex state, such as an object whose individual keys are updated or read, it can be useful to create utility parsers that assist with updating the internal state data. One possible pattern is the reducer pattern, popularised by Redux:
Example
const createStateReducer = reducer => action => getData.chain(state => setData(reducer(state, action)));
const updateCounterState = createStateReducer((state = 0, action) => {
switch (action.type) {
case 'INC': {
return state + 1;
}
case 'DEC': {
return state - 1;
}
case 'ADD': {
return state + action.payload;
}
case 'RESET': {
return 0;
}
}
});
const parser = coroutine(run=>{
let count = run(updateCounterState({ type: 'RESET' }));
console.log(count);
run(updateCounterState({ type: 'INC' }));
run(updateCounterState({ type: 'INC' }));
run(updateCounterState({ type: 'DEC' }));
count = run(updateCounterState({ type: 'INC' }));
console.log(count);
return run(updateCounterState({ type: 'ADD', payload: 10 }));
});
parser.run('Parser is not looking at the text!');
// [console.log] 0
// [console.log] 2
// -> {
// isError: false,
// result: 12,
// index: 0,
// data: 12
// }
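Stripped of the parser machinery, the reducer is an ordinary function and can be exercised on its own. The sketch below reuses the counter reducer from the example and folds the same sequence of actions with Array.prototype.reduce. The names counterReducer and actions are illustrative, and the default case is an addition that treats unknown actions as no-ops.

```javascript
// The same transitions as the reducer above, plus a default case (an addition)
const counterReducer = (state = 0, action) => {
  switch (action.type) {
    case 'INC': return state + 1;
    case 'DEC': return state - 1;
    case 'ADD': return state + action.payload;
    case 'RESET': return 0;
    default: return state;
  }
};

// The actions dispatched by the coroutine example, in order
const actions = [
  { type: 'RESET' },
  { type: 'INC' },
  { type: 'INC' },
  { type: 'DEC' },
  { type: 'INC' },
  { type: 'ADD', payload: 10 }
];

const finalCount = actions.reduce(counterReducer, 0);
console.log(finalCount); // 12
```

Keeping the reducer pure like this makes the state transitions testable without running any parser.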
withData
takes a provided parser, and returns a function waiting for some state data to be set, and then returns a new parser. That parser, when run, ensures that the state data is set as the internal state data before the provided parser runs.
Example
const parserWithoutData = letters;
const parser = withData(parserWithoutData);
parser("hello world!").run('Jim');
// -> {
// isError: false,
// result: "Jim",
// index: 3,
// data: "hello world!"
// }
parserWithoutData.run('Jim');
// -> {
// isError: false,
// result: "Jim",
// index: 3,
// data: null
// }
mapData
takes a function that receives and returns some state data, and transforms the internal state data using the function, without consuming any input.
Example
const parser = withData(mapData(s => s.toUpperCase()));
parser("hello world!").run('Jim');
// -> {
// isError: false,
// result: null,
// index: 0,
// data: "HELLO WORLD!"
// }
getData
is a parser that will always return what is contained in the internal state data, without consuming any input.
Example
const parser = withData(sequenceOf([
letters,
digits,
getData
]));
parser("hello world!").run('Jim1234');
// -> {
// isError: false,
// result: ["Jim", "1234", "hello world!"],
// index: 7,
// data: "hello world!"
// }
When dealing with more complex state, such as an object whose individual keys are updated or read, it can be useful to create utility parsers to assist.
Example
const selectState = selectorFn => getData.map(selectorFn);
const parser = withData(coroutine(run=> {
// Here we can take or transform the state
const occupation = run(selectState(({job}) => job));
const initials = run(selectState(({firstName, lastName}) => `${firstName[0]}${lastName[0]}`));
console.log(`${initials}: ${occupation}`);
const first = run(letters);
const second = run(digits);
return `${second}${first}`;
}));
parser({
firstName: "Francis",
lastName: "Stokes",
job: "Developer"
}).run('Jim1234');
// [console.log] FS: Developer
// -> {
// isError: false,
// result: "1234Jim",
// index: 7,
// data: {
// firstName: "Francis",
// lastName: "Stokes",
// job: "Developer"
// }
// }
coroutine takes a user provided parser function, to which a run function is passed. Within the parser function, the user can run other parsers and get immediate access to their results.
coroutine allows you to write parsers in a more imperative and sequential way, in much the same way async/await allows you to write code with promises in a more sequential way.
Inside of the parser function, you can use all regular JavaScript language features, like loops, variable assignments, try/catch, and conditional statements. This makes it easy to write very powerful parsers using coroutine, but on the other hand it can lead to less readable, more complex code.
Debugging is also much easier, as breakpoints can be easily added, and values logged to the console after they have been parsed.
Example
const parser = coroutine(run => {
// Capture some letters and assign them to a variable
const name = run(letters);
// Capture a space
run(char(' '));
const age = run(digits.map(Number));
// Capture a space
run(char(' '));
if (age >= 18) {
run(str('is an adult'));
} else {
run(str('is a child'));
}
return { name, age };
});
parser.run('Jim 19 is an adult');
// -> {
// isError: false,
// result: { name: "Jim", age: 19 },
// index: 18,
// data: null
// }
parser.run('Jim 17 is an adult');
// -> {
// isError: true,
// error: "ParseError (position 7): Expecting string 'is a child', got 'is an adul...'",
// index: 7,
// data: null
// }
char
takes a character and returns a parser that matches that character exactly one time.
Example
char ('h').run('hello')
// -> {
// isError: false,
// result: "h",
// index: 1,
// data: null
// }
anyChar
matches exactly one UTF-8 character.
Example
anyChar.run('a')
// -> {
// isError: false,
// result: "a",
// index: 1,
// data: null
// }
anyChar.run('😉')
// -> {
// isError: false,
// result: "😉",
// index: 4,
// data: null
// }
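The index of 4 in the second result is consistent with the emoji occupying four bytes when encoded as UTF-8. This can be checked with the standard TextEncoder, independently of arcsecond:

```javascript
// TextEncoder always encodes to UTF-8
const encoder = new TextEncoder();

const asciiBytes = encoder.encode('a');
const emojiBytes = encoder.encode('😉');

console.log(asciiBytes.length); // 1
console.log(emojiBytes.length); // 4

// JavaScript string length counts UTF-16 code units, so it differs again
console.log('😉'.length); // 2
```

An index in a parsing result may therefore not match an offset into the JavaScript string when the input contains non-ASCII characters.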
str
takes a string and returns a parser that matches that string exactly one time.
Example
str('hello').run('hello world')
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
digit is a parser that matches exactly one numerical digit /[0-9]/.
Example
digit.run('99 bottles of beer on the wall')
// -> {
// isError: false,
// result: "9",
// index: 1,
// data: null
// }
digits is a parser that matches one or more numerical digits /[0-9]/.
Example
digits.run('99 bottles of beer on the wall')
// -> {
// isError: false,
// result: "99",
// index: 2,
// data: null
// }
letter is a parser that matches exactly one alphabetical letter /[a-zA-Z]/.
Example
letter.run('hello world')
// -> {
// isError: false,
// result: "h",
// index: 1,
// data: null
// }
letters is a parser that matches one or more alphabetical letters /[a-zA-Z]/.
Example
letters.run('hello world')
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
whitespace
is a parser that matches one or more whitespace characters.
Example
const newParser = sequenceOf ([
str ('hello'),
whitespace,
str ('world')
]);
newParser.run('hello           world')
// -> {
// isError: false,
// result: [ "hello", "           ", "world" ],
// index: 21,
// data: null
// }
newParser.run('helloworld')
// -> {
// isError: true,
// error: "ParseError 'many1' (position 5): Expecting to match at least one value",
// index: 5,
// data: null
// }
optionalWhitespace
is a parser that matches zero or more whitespace characters.
Example
const newParser = sequenceOf ([
str ('hello'),
optionalWhitespace,
str ('world')
]);
newParser.run('hello           world')
// -> {
// isError: false,
// result: [ "hello", "           ", "world" ],
// index: 21,
// data: null
// }
newParser.run('helloworld')
// -> {
// isError: false,
// result: [ "hello", "", "world" ],
// index: 10,
// data: null
// }
peek
matches exactly one numerical byte without consuming any input.
Example
peek.run('hello world')
// -> {
// isError: false,
// result: 104,
// index: 0,
// data: null
// }
sequenceOf([
str('hello'),
peek
]).run('hello world')
// -> {
// isError: false,
// result: [ "hello", 32 ],
// index: 5,
// data: null
// }
anyOfString
takes a string and returns a parser that matches exactly one character from that string.
Example
anyOfString('aeiou').run('unusual string')
// -> {
// isError: false,
// result: "u",
// index: 1,
// data: null
// }
regex
takes a RegExp and returns a parser that matches as many characters as the RegExp matches.
Example
regex(/^[hH][aeiou].{2}o/).run('hello world')
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
sequenceOf
takes an array of parsers, and returns a new parser that matches each of them sequentially, collecting up the results into an array.
Example
const newParser = sequenceOf ([
str ('he'),
letters,
char (' '),
str ('world'),
])
newParser.run('hello world')
// -> {
// isError: false,
// result: [ "he", "llo", " ", "world" ],
// index: 11,
// data: null
// }
namedSequenceOf
takes an array of string/parser pairs, and returns a new parser that matches each of them sequentially, collecting up the results into an object where the key is the string in the pair.
A pair is just an array in the form: [string, parser]
Example
const newParser = namedSequenceOf ([
['firstPart', str ('he')],
['secondPart', letters],
['thirdPart', char (' ')],
['fourthPart', str ('world')],
])
newParser.run('hello world')
// -> {
// isError: false,
// result: {
// firstPart: "he",
// secondPart: "llo",
// thirdPart: " ",
// fourthPart: "world"
// },
// index: 11,
// data: null
// }
choice
takes an array of parsers, and returns a new parser that tries to match each one of them sequentially, and returns the first match. If choice fails, then it returns the error message of the parser that matched the most from the string.
Example
const newParser = choice ([
digit,
char ('!'),
str ('hello'),
str ('pineapple')
])
newParser.run('hello world')
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
lookAhead
takes a look ahead parser, and returns a new parser that matches using the look ahead parser, but without consuming input.
Example
const newParser = sequenceOf ([
str ('hello '),
lookAhead (str ('world')),
str ('wor')
]);
newParser.run('hello world')
// -> {
// isError: false,
// result: [ "hello ", "world", "wor" ],
// index: 9,
// data: null
// }
sepBy
takes two parsers, a separator parser and a value parser, and returns a new parser that matches zero or more values from the value parser that are separated by values of the separator parser. This parser fails if a value is followed by a separator but NOT by another value. If there are no values at all, the result is an empty array rather than a failure.
Example
const newParser = sepBy (char (',')) (letters)
newParser.run('some,comma,separated,words')
// -> {
// isError: false,
// result: [ "some", "comma", "separated", "words" ],
// index: 26,
// data: null
// }
newParser.run('')
// -> {
// isError: false,
// result: [],
// index: 0,
// data: null
// }
newParser.run('12345')
// -> {
// isError: false,
// result: [],
// index: 0,
// data: null
// }
sepBy1 is the same as sepBy, except that it matches one or more occurrences.
Example
const newParser = sepBy1 (char (',')) (letters)
newParser.run('some,comma,separated,words')
// -> {
// isError: false,
// result: [ "some", "comma", "separated", "words" ],
// index: 26,
// data: null
// }
newParser.run('1,2,3')
// -> {
// isError: true,
// error: "ParseError 'sepBy1' (position 0): Expecting to match at least one separated value",
// index: 0,
// data: null
// }
exactly
takes a positive number and returns a function. That function takes a parser and returns a new parser which matches the given parser the specified number of times.
Example
const newParser = exactly (4)(letter)
newParser.run('abcdef')
// -> {
// isError: false,
// result: [ "a", "b", "c", "d" ],
// index: 4,
// data: null
// }
newParser.run('abc')
// -> {
// isError: true,
// error: 'ParseError (position 0): Expecting 4 letter, but got end of input.',
// index: 0,
// data: null
// }
newParser.run('12345')
// -> {
// isError: true,
// error: "ParseError (position 0): Expecting 4 letter, got '1'",
// index: 0,
// data: null
// }
many
takes a parser and returns a new parser which matches that parser zero or more times. Because it will match zero or more values, this parser will always match, resulting in an empty array in the zero case.
Example
const newParser = many (str ('abc'))
newParser.run('abcabcabcabc')
// -> {
// isError: false,
// result: [ "abc", "abc", "abc", "abc" ],
// index: 12,
// data: null
// }
newParser.run('')
// -> {
// isError: false,
// result: [],
// index: 0,
// data: null
// }
newParser.run('12345')
// -> {
// isError: false,
// result: [],
// index: 0,
// data: null
// }
many1 is the same as many, except that it matches one or more occurrences.
Example
const newParser = many1 (str ('abc'))
newParser.run('abcabcabcabc')
// -> {
// isError: false,
// result: [ "abc", "abc", "abc", "abc" ],
// index: 12,
// data: null
// }
newParser.run('')
// -> {
// isError: true,
// error: "ParseError 'many1' (position 0): Expecting to match at least one value",
// index: 0,
// data: null
// }
newParser.run('12345')
// -> {
// isError: true,
// error: "ParseError 'many1' (position 0): Expecting to match at least one value",
// index: 0,
// data: null
// }
between
takes 3 parsers, a left parser, a right parser, and a value parser, returning a new parser that matches a value matched by the value parser, between values matched by the left parser and the right parser.
This parser can easily be partially applied with char ('(') and char (')') to create a betweenRoundBrackets parser, for example.
Example
const newParser = between (char ('<')) (char ('>')) (letters);
newParser.run('<hello>')
// -> {
// isError: false,
// result: "hello",
// index: 7,
// data: null
// }
const betweenRoundBrackets = between (char ('(')) (char (')'));
betweenRoundBrackets (many (letters)).run('(hello world)')
// -> {
// isError: true,
// error: "ParseError (position 6): Expecting character ')', got ' '",
// index: 6,
// data: null
// }
Note: Between 2.x and 3.x, the definition of everythingUntil changed. In 3.x, what was previously everythingUntil is now everyCharUntil.
everythingUntil
takes a termination parser and returns a new parser which matches every possible numerical byte up until a value is matched by the termination parser. When a value is matched by the termination parser, it is not "consumed".
Example
everythingUntil (char ('.')).run('This is a sentence.This is another sentence')
// -> {
// isError: false,
// result: [84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 101, 110, 116, 101, 110, 99, 101],
// index: 18,
// data: null
// }
// termination parser doesn't consume the termination value
const newParser = sequenceOf ([
everythingUntil (char ('.')),
str ('This is another sentence')
]);
newParser.run('This is a sentence.This is another sentence')
// -> {
// isError: true,
// error: "ParseError (position 18): Expecting string 'This is another sentence', got '.This is another sentenc...'",
// index: 18,
// data: null
// }
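Because everythingUntil yields byte values rather than characters, the result usually needs decoding before it can be treated as text. A minimal sketch using the standard TextDecoder, assuming the bytes are UTF-8; the matchedBytes array is copied from the example result above:

```javascript
// Byte values taken from the everythingUntil example result
const matchedBytes = [
  84, 104, 105, 115, 32, 105, 115, 32, 97, 32,
  115, 101, 110, 116, 101, 110, 99, 101
];

// Decode the bytes as UTF-8 text
const text = new TextDecoder('utf-8').decode(Uint8Array.from(matchedBytes));

console.log(text); // This is a sentence
```

When text is what you want in the first place, everyCharUntil avoids the decoding step.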
everyCharUntil
takes a termination parser and returns a new parser which matches every possible character up until a value is matched by the termination parser. When a value is matched by the termination parser, it is not "consumed".
Example
everyCharUntil (char ('.')).run('This is a sentence.This is another sentence')
// -> {
// isError: false,
// result: 'This is a sentence',
// index: 18,
// data: null
// }
// termination parser doesn't consume the termination value
const newParser = sequenceOf ([
everyCharUntil (char ('.')),
str ('This is another sentence')
]);
newParser.run('This is a sentence.This is another sentence')
// -> {
// isError: true,
// error: "ParseError (position 18): Expecting string 'This is another sentence', got '.This is another sentenc...'",
// index: 18,
// data: null
// }
Note: Between 2.x and 3.x, the definition of anythingExcept changed. In 3.x, what was previously anythingExcept is now anyCharExcept.
anythingExcept
takes an exception parser and returns a new parser which matches exactly one numerical byte, if it is not matched by the exception parser.
Example
anythingExcept (char ('.')).run('This is a sentence.')
// -> {
// isError: false,
// result: 84,
// index: 1,
// data: null
// }
const manyExceptDot = many (anythingExcept (char ('.')))
manyExceptDot.run('This is a sentence.')
// -> {
// isError: false,
// result: [84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 101, 110, 116, 101, 110, 99, 101],
// index: 18,
// data: null
// }
anyCharExcept
takes an exception parser and returns a new parser which matches exactly one character, if it is not matched by the exception parser.
Example
anyCharExcept (char ('.')).run('This is a sentence.')
// -> {
// isError: false,
// result: 'T',
// index: 1,
// data: null
// }
const manyExceptDot = many (anyCharExcept (char ('.')))
manyExceptDot.run('This is a sentence.')
// -> {
// isError: false,
// result: ['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 'e', 'n', 't', 'e', 'n', 'c', 'e'],
// index: 18,
// data: null
// }
possibly
takes an attempt parser and returns a new parser which tries to match using the attempt parser. If it is unsuccessful, it returns a null value and does not "consume" any input.
Example
const newParser = sequenceOf ([
possibly (str ('Not Here')),
str ('Yep I am here')
]);
newParser.run('Yep I am here')
// -> {
// isError: false,
// result: [ null, "Yep I am here" ],
// index: 13,
// data: null
// }
startOfInput
is a parser that only succeeds when the parser is at the beginning of the input.
Example
const mustBeginWithHeading = sequenceOf([
startOfInput,
str("# ")
]);
const newParser = between(mustBeginWithHeading)(endOfInput)(everyCharUntil(endOfInput));
newParser.run('# Heading');
// -> {
// isError: false,
// result: "Heading",
// index: 9,
// data: null
// }
newParser.run(' # Heading');
// -> {
// isError: true,
// error: "ParseError (position 0): Expecting string '# ', got ' #...'",
// index: 0,
// data: null
// }
endOfInput
is a parser that only succeeds when there is no more input to be parsed.
Example
const newParser = sequenceOf ([
str ('abc'),
endOfInput
]);
newParser.run('abc')
// -> {
// isError: false,
// result: [ "abc", null ],
// index: 3,
// data: null
// }
newParser.run('')
// -> {
// isError: true,
// error: "ParseError (position 0): Expecting string 'abc', but got end of input.",
// index: 0,
// data: null
// }
skip
takes a skip parser and returns a new parser which matches using the skip parser but discards its value, returning instead the value of whatever came before it.
Example
const newParser = pipeParsers ([
str ('abc'),
str('123'),
skip (str ('def'))
])
newParser.run('abc123def')
// -> {
// isError: false,
// result: "123",
// index: 9,
// data: null
// }
pipeParsers
takes an array of parsers and composes them left to right, so each parser's return value is passed into the next one in the chain. The result is a new parser that, when run, yields the result of the final parser in the chain.
Example
const newParser = pipeParsers ([
str ('hello'),
char (' '),
str ('world')
]);
newParser.run('hello world')
// -> {
// isError: false,
// result: "world",
// index: 11,
// data: null
// }
composeParsers
takes an array of parsers and composes them right to left, so each parser's return value is passed into the next one in the chain. The result is a new parser that, when run, yields the result of the final parser in the chain.
Example
const newParser = composeParsers ([
str ('world'),
char (' '),
str ('hello')
]);
newParser.run('hello world')
// -> {
// isError: false,
// result: "world",
// index: 11,
// data: null
// }
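The difference between the two orderings is the same as between generic left-to-right and right-to-left function composition. The sketch below uses plain functions rather than parsers; pipe and compose are hypothetical helpers written for illustration, not arcsecond exports.

```javascript
// Hypothetical helpers operating on plain functions, not parsers
const pipe = fns => input => fns.reduce((value, fn) => fn(value), input);
const compose = fns => input => fns.reduceRight((value, fn) => fn(value), input);

const addOne = n => n + 1;
const double = n => n * 2;

// Left to right: double(addOne(5))
const piped = pipe([addOne, double])(5);

// Right to left: addOne(double(5))
const composed = compose([addOne, double])(5);

// Reversing the array makes compose agree with pipe
const reversed = compose([double, addOne])(5);

console.log(piped);    // 12
console.log(composed); // 11
console.log(reversed); // 12
```

This mirrors why the composeParsers example lists its parsers in the reverse order of the pipeParsers example while parsing the same input.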
takeRight
takes two parsers, left and right, and returns a new parser that first matches the left, then the right, and keeps the value matched by the right.
Example
const newParser = takeRight (str ('hello ')) (str ('world'))
newParser.run('hello world')
// -> {
// isError: false,
// result: "world",
// index: 11,
// data: null
// }
takeLeft
takes two parsers, left and right, and returns a new parser that first matches the left, then the right, and keeps the value matched by the left.
Example
const newParser = takeLeft (str ('hello ')) (str ('world'))
newParser.run('hello world')
// -> {
// isError: false,
// result: "hello",
// index: 11,
// data: null
// }
recursiveParser
takes a function that returns a parser (a thunk), and returns that same parser. This is needed in order to create recursive parsers because JavaScript is not a "lazy" language.
In the following example both the value parser and the matchArray parser are defined in terms of each other, so one of them must be defined using recursiveParser.
Example
const value = recursiveParser (() => choice ([
matchNum,
matchStr,
matchArray
]));
const betweenSquareBrackets = between (char ('[')) (char (']'));
const commaSeparated = sepBy (char (','));
const spaceSeparated = sepBy (char (' '));
const matchNum = digits;
const matchStr = letters;
const matchArray = betweenSquareBrackets (commaSeparated (value));
spaceSeparated(value).run('abc 123 [42,somethingelse] 45')
// -> {
// isError: false,
// result: [ "abc", "123", [ "42", "somethingelse" ], "45" ],
// index: 29,
// data: null
// }
tapParser takes a function and returns a parser that does nothing and consumes no input, but runs the provided function on the current parsing state. This is intended as a debugging tool to see the state of parsing at any point in a sequential operation like sequenceOf or pipeParsers.
Example
const newParser = sequenceOf ([
letters,
tapParser(console.log),
char (' '),
letters
]);
newParser.run('hello world')
// -> [console.log]: Object {isError: false, error: null, target: "hello world", data: null, index: 5, …}
// -> {
// isError: false,
// result: [ "hello", "hello", " ", "world" ],
// index: 11,
// data: null
// }
decide takes a function that receives the last matched value and returns a new parser. It's important that the function always returns a parser. If a valid one cannot be selected, you can always use fail.
decide allows an author to create a context-sensitive grammar.
Example
const newParser = sequenceOf ([
takeLeft (letters) (char (' ')),
decide (v => {
switch (v) {
case 'asLetters': return letters;
case 'asDigits': return digits;
default: return fail(`Unrecognised signifier '${v}'`);
}
})
]);
newParser.run('asDigits 1234')
// -> {
// isError: false,
// result: [ "asDigits", "1234" ],
// index: 13,
// data: null
// }
newParser.run('asLetters hello')
// -> {
// isError: false,
// result: [ "asLetters", "hello" ],
// index: 15,
// data: null
// }
newParser.run('asPineapple wayoh')
// -> {
// isError: true,
// error: "Unrecognised signifier 'asPineapple'",
// index: 12,
// data: null
// }
mapTo
takes a function and returns a parser that does not consume input, but instead runs the provided function on the last matched value and sets the return value as the new last matched value. This function can be used to apply structure or transform the values as they are being parsed.
Example
const newParser = pipeParsers([
letters,
mapTo(x => {
return {
matchType: 'string',
value: x
}
})
]);
newParser.run('hello world')
// -> {
// isError: false,
// result: {
// matchType: "string",
// value: "hello"
// },
// index: 5,
// data: null
// }
errorMapTo is like mapTo, but it transforms the error value. The function passed to errorMapTo gets the current error message as its first argument and the index that parsing stopped at as the second.
Example
const newParser = pipeParsers([
letters,
errorMapTo((message, index) => `Old message was: [${message}] @ index ${index}`)
]);
newParser.run('1234')
// -> {
// isError: true,
// error: "Old message was: [ParseError (position 0): Expecting letters] @ index 0",
// index: 0,
// data: null
// }
fail
takes an error message string and returns a parser that always fails with the provided error message.
Example
fail('Nope').run('hello world')
// -> {
// isError: true,
// error: "Nope",
// index: 0,
// data: null
// }
succeedWith
takes a value and returns a parser that always matches that value and does not consume any input.
Example
succeedWith ('anything').run('hello world')
// -> {
// isError: false,
// result: "anything",
// index: 0,
// data: null
// }
either
takes a parser and returns a parser that will always succeed, but the captured value will be an Either, indicating success or failure.
Example
either(fail('nope!')).run('hello world')
// -> {
// isError: false,
// result: {
// isError: true,
// value: "nope!"
// },
// index: 0,
// data: null
// }
toPromise converts a ParserResult (what is returned from .run) into a Promise.
Example
const parser = str('hello');
toPromise(parser.run('hello world'))
.then(console.log)
.catch(({error, index, data}) => {
console.log(error);
console.log(index);
console.log(data);
});
// -> [console.log] hello
toPromise(parser.run('goodbye world'))
.then(console.log)
.catch(({error, index, data}) => {
console.log('Error!');
console.log(error);
console.log(index);
console.log(data);
});
// -> [console.log] Error!
// -> [console.log] ParseError (position 0): Expecting string 'hello', got 'goodb...'
// -> [console.log] 0
// -> [console.log] null
toValue converts a ParserResult (what is returned from .run) into a regular value, and throws an error if the result contained one.
Example
const result = str ('hello').run('hello world');
try {
const value = toValue(result);
console.log(value);
// -> 'hello'
} catch (parseError) {
console.error(parseError.message)
}
parse takes a parser and input (which may be a string, TypedArray, ArrayBuffer, or DataView), and returns the result of parsing the input using the parser.
Example
parse (str ('hello')) ('hello')
// -> {
// isError: false,
// result: "hello",
// index: 5,
// data: null
// }
If you're parsing a programming language, a configuration, or anything of sufficient complexity, it's likely that you'll need to define some parsers in terms of each other. You might want to do something like:
const value = choice ([
matchNum,
matchStr,
matchArray
]);
const betweenSquareBrackets = between (char ('[')) (char (']'));
const commaSeparated = sepBy (char (','));
const matchNum = digits;
const matchStr = letters;
const matchArray = betweenSquareBrackets (commaSeparated (value));
In this example, we are trying to define value in terms of matchArray, and matchArray in terms of value. This is problematic in a language like JavaScript because it is what's known as an "eager language". Because the definition of value is a function call to choice, the arguments of choice must be fully evaluated, and of course none of them are yet. If we just move the definition below matchNum, matchStr, and matchArray, we'll have the same problem with value not being defined before matchArray wants to use it.
We can get around JavaScript's eagerness by using recursiveParser, which takes a function that returns a parser:
const value = recursiveParser(() => choice ([
matchNum,
matchStr,
matchArray
]));
const betweenSquareBrackets = between (char ('[')) (char (']'));
const commaSeparated = sepBy (char (','));
const matchNum = digits;
const matchStr = letters;
const matchArray = betweenSquareBrackets (commaSeparated (value));
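The eagerness problem and the thunk-based workaround can be reproduced without the library. In the sketch below, char, choice, between, and lazy are small hand-rolled stand-ins written for illustration; they are not arcsecond's implementations, and lazy only approximates what recursiveParser is described as doing.

```javascript
// Hand-rolled stand-ins for illustration only. A parser here is a function
// (input, index) => { value, index } on success, or null on failure.
const char = c => (input, index) =>
  input[index] === c ? { value: c, index: index + 1 } : null;

const choice = parsers => (input, index) => {
  for (const parser of parsers) {
    const outcome = parser(input, index);
    if (outcome) return outcome;
  }
  return null;
};

const between = (left, right) => inner => (input, index) => {
  const opened = left(input, index);
  const matched = opened && inner(input, opened.index);
  const closed = matched && right(input, matched.index);
  return closed ? { value: matched.value, index: closed.index } : null;
};

// Defers evaluation of the thunk until the parser is actually used
const lazy = thunk => (input, index) => thunk()(input, index);

// Eager definition: `nested` is read before it has been initialised
const defineEagerly = () => {
  const value = choice([char('x'), nested]);
  const nested = between(char('['), char(']'))(value);
  return value;
};

let eagerError = null;
try {
  defineEagerly();
} catch (error) {
  eagerError = error;
}
console.log(eagerError instanceof ReferenceError); // true

// Deferred definition: the thunk runs only after `nested` exists
const value = lazy(() => choice([char('x'), nested]));
const nested = between(char('['), char(']'))(value);

const parsed = value('[[x]]', 0);
console.log(parsed); // { value: 'x', index: 5 }
```

The only difference between the two definitions is the function wrapper, which postpones reading nested until parsing begins.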
This library implements the following Fantasy Land (v3) interfaces: Functor, Apply, Applicative, and Chain.
Every parser, or parser made from composing parsers, has a .of, .map, .chain, and .ap method.
Parser.of(42)
// is equivalent to
succeedWith (42)
letters.map (fn)
// is equivalent to
pipeParsers ([ letters, mapTo (fn) ])
letters.chain (x => someOtherParser)
// is equivalent to
pipeParsers ([ letters, decide (x => someOtherParser) ])
letters.ap (Parser.of (fn))
// is equivalent to
pipeParsers ([
sequenceOf ([ succeedWith (fn), letters ]),
mapTo (([fn, x]) => fn(x))
]);
The name is also derived from parsec, which in astronomical terms is an "astronomical unit [that] subtends an angle of one arcsecond".