Closed TwitchBronBron closed 5 years ago
Hiya @TwitchBronBron ! Thanks for the detailed description – I really appreciate it!
There's currently no ability to get the list of lexer or parser errors. Not for any technical reason though – I just hadn't thought of it! The way I see it, there's two options:
The root-export brs
acts like an event emitter. It'd emit an error
event with a payload equivalent to what you listed above. This, coincidentally, would solve another problem I've been struggling with: how to effectively decouple output to stderr
from the runtime!
Similar options include the lexer and parser individually acting like event emitters (handy if you care about errors only from one source and not the other) or exposing a brs.events
property as an emitter.
The root-export brs
adds a getErrors()
method that returns a list of errors. Nice and simple, though I'll need to expose a brs.clearErrors()
to keep things from accumulating indefinitely.
I'm leaning pretty heavily towards something event-emitter-based. You're working on a VSCode extension, right? I'm not sure which option would work better, or if an event-emitter would even be possible within the confines of what VSCode gives you. Got any preferences?
Before I talk about 1 or 2, I'd like to recommend a third option?
Here's how I'm planning on writing my language server. On startup, I have to lex and parse every file in the "project". As such, I'd like to do that as quickly as possible, so running multiple operations each in their own async context allows me to do CPU-bound operations for file1 while file2 is still loading on a node background IO thread.
So assume for a moment that your lexer and parser were implemented as classes and consider this example:
import { Lexer, Parse } from 'brs';
async function processFiles(filePaths: string[]) {
let errorsByFile = <{ [filePath: string]: string[] }>{};
//run each file path async, so we can maximize throughput
//by processing one file while another is waiting on IO, etc...
await Promise.all(filePaths.map(async (filePath) => {
//await on IO-bound operation (allows another file to
//do CPU-bound work while we're waiting)
let text = await getFileAsTextSomehow(filePath);
//make a new instance of a Lexer
var lexer = new Lexer();
let tokens = lexer.scan(text);
//the lexer instance stores all errors about its file on its own `errors` property
let lexErrors = lexer.errors;
//make a new instance of a Parser
let parser = new Parser();
let statements = parser.parse(tokens);
//the parser instance stores all errors about its file on its own `errors` property
let parseErrors = parser.errors;
errorsByFile[filePath] = [...lexErrors, ...parseErrors];
}));
//do something with any file that has errors
}
Option1 and option2 would work as well, but they would mean a little extra thinking to make sure to grab errors immediately after doing the lexing/parsing, and before going into another async context. Something like this might cause unintended side-effects.
await Promise.all(filePaths.map(async (filePath) => {
let text = await getFileAsTextSomehow(filePath);
let tokens = Lexer.scan(text);
//await here to load a config from disc
let appConfig = await getAppConfig();
if (appConfig.showErrors) {
//these will probably be some other file's errors because we awaited
let lexErrors = brs.getErrors();
}
}));
You can convert to classes and still support backwards compatibility by making your static methods instantiate the classes behind the scenes. Something like this:
//Lexer.ts
export function scan(text){
var lexer = new Lexer();
var tokens = lexer.scan(text);
return tokens;
}
Here are my thoughts about your other items:
This would be fine as long as listeners have a way of knowing that all events have been received. Since your parser and lexer are synchronous, I would assume the event emitter would be emitting events synchronously as well, meaning something like this would work. If that's the case, then I'm ok with this approach.
let lexErrors = [];
Lexer.on('error', (err) => {
lexErrors.push(err);
});
let tokens = Lexer.Scan(text);
//since the event emitter methods are called synchronously,
//we know that all lex errors have been received at this point
A potential downside to this approach is performance. Calling a function on every error could add up if there are a lot of errors and/or many files. With option2 and option3, there would only be one method call at the end of the whole process (brs.getErrors()
or lexer.getErrors()
. Since your lexer and parser run synchronously, there isn't much benefit in getting errors immediately when they are encountered, because in most cases, the main thread is blocked until you're done parsing anyway.
This would be acceptable also, and since the main entry points in your app are all static, this lines up with that same approach.
Since you are adding the errors as new functionality, there's no expectation that the errors array should exist across multiple lex/parse calls. So you could easily document that errors are cleared on every lex or parse call, so if you want errors, grab a copy before you run another operation.
So something like this would work for auto-cleanup.
//Lexer.ts
import * as brs from 'brs';
export function scan(text: string){
brs.clearErrors();
return getTokensSomehow();
}
//Parser.ts
import * as brs from 'brs';
export function parse(text: string){
brs.clearErrors();
return getParserResultsSomehow();
}
Consumers of your API would do something like this:
//MyApp.ts
import * as brs from 'brs';
var errors = [];
let tokens = Lexer.scan(someText);
//capture errors before another scan
errors = [...errors, ...brs.getErrors()];
let moreTokens = Lexer.scan(otherText);
//add the new errors to our list of existing errors
errors = [...errors, brs.getErrors()];
Ultimately it's your call. Any of these approaches will work fine within the VSCode extension. :D
I just thought of another option. You probably don't want to break backwards compatibility, so you would need to expose a separate method for this:
//Lexer.ts
export function getScanResults(text: string){
return {
tokens: getTokensSomehow(text),
errors: getErrorsSomehow()
}
}
So then as a consumer of this, I would call your new method, which gets me everything I need
//MyApp.ts
let result = Lexer.getScanResults(fileContents);
let tokens = result.tokens;
let errors = result.errors;
I don't know how you would want to implement this in your library, but from the outside, it's much cleaner and you get everything you need in one place. This also allows you to add more result-like things to the return object in the future. I still like option 3 the best (class instances), but figured I would share this as another option.
Thanks for the ideas dude! I'd briefly thought of something like 4 but shied away from it because of the amount of changes internally that'd be required. But you're right – it's probably a good idea to have things packaged up that way!
Splitting errors to be tracked by the lexer and parser separately definitely makes sense as well. I'll see if I get time to hack around on this over the weekend and see what feels right.
Cool. Let me know if you want to talk through anything else, or if there's something I can do to help.
I'm using the
Lexer
andParser
classes to try and validate a brightscript file. How can I get a list of parse errors?I'm doing the following in TypeScript:
Problem is, it just seems to eat any errors it encounters. For example, based on this line in
Parser.ts
, there should be a parse error thrown whenever you declare a sub with anas
clause.Here's a very simple brightscript file that should have at least the above mentioned parse error
I don't get any exceptions thrown...the statements array contains a single entry, representing the print line. If I remove the
as string
, then the statements array has one entry at the top level represneting theSayHello
function (as expected).What I am hoping to get is a list of parse errors, and their line numbers (which in this case would be 1 parse error, something to the effect of
Is this functionality available? If not, do you think it's do-able?