Closed ratmice closed 11 months ago
So couple of things seem worth noting this morning,
YaccGrammarConstructionFailure
is kind of long, but seemed to convey the meaning?enum warnings + errors
if we want to include warnings into the same channel, or a second channel for those (which seems somewhat unfortunate).Very interesting! I think this PR has a slightly different aim to some of the stuff in #428, so let me see if I can wrap my pea brain around it!
It seems to me the main idea here is that the consumer of errors are "remote" (at least, potentially in another thread) so we use mpsc
to ship existing error structs to those consumer(s)? One advantage of that approach over the "lots of functions" approach in https://github.com/softdevteam/grmtools/pull/428#issuecomment-1862375650 is that we get to make use of the existing enum
s, and that probably makes extending the API less cumbersome in the future. This feels like a better approach to me than "lots of functions". I also think I like the general ASTBuilder
approach.
That said, reading the code made me realise that perhaps baking in mpsc
stuff isn't quite what we want to do. Because mpsc::channel
is slightly clumsy to use (in the sense of having errors etc.) that does muddle the code a bit. And while mpsc::channel
probably works OK for many users (even those not using threads), it feels overkill for some users (e.g. nimbleparse), while for others they'll probably have to translate from channels to some other type (e.g. LSP libraries).
So what I'm now wondering is whether we can break this into perhaps 3 bits:
ASTBuilder
and friends feel like they're probably an API improvement full stop, irrespective of error reporting. We can probably get those in quite quickly on their own?trait ParseErrorReporting{
fn parse_error(YaccGrammarError) -> Result<(), Box<dyn Error>>;
}
channel
implementation and/or "nice pretty printer" implementation of ParseErrorReporting
as standard?Please feel free to tell me that I have missed something important, as you've thought more deeply about this than I have!
Yeah, this definitely has a different aim, as this would mostly be under the hood of something like #428 being the mechanism by which errors could be reported promptly before parsing has fully finished.
I tend to agree about mpsc::channel being clumsy, overkill and unfortunately producing an error on send
. I could kind of live with that last issue due to the existence of the ConstructionFailure
, and mpsc
errors being able to be wrapped up in that same type, but it definitely isn't ideal.
My biggest gripe about the interface is something like the following,
let x = ASTBuilder::new().grammar_src("not valid .y source");
let y = x.error_receiver();
let ast_validity = x.build();
assert_eq!(ast_validity.errors(), vec![])
Because they call to error_receiver
took the receiver that errors()
is implemented in terms of.
What I would like to do is to try and make build()
return something else, ASTValidity
, which merely has bool
, and an GrammarAST
and no fn errors()
, or fn warnings()
and move all that to a trait that gets passed to the builder.
I think we can still implement all the existing API on top of that (including ASTWithValidityInfo
), while dropping the need for mpsc.
Anyhow, I would like to play with that approach before we try to get ASTBuilder
in tree I think, if that sounds okay.
mpsc::channel probably works OK for many users (even those not using threads), it feels overkill for some users (e.g. nimbleparse), while for others they'll probably have to translate from channels to some other type (e.g. LSP libraries).
One thing I liked about this approach is that things like nimbleparse can (and actually do currently on the branch) continue using the existing code-path where it returns a Vec<YaccGrammarError>
, oblivious that there is an mpsc::receiver
under the hood via via error_receiver.iter().collect()
. In that sense, dealing with mpsc receiver is entirely optional for those what want promptness.
I still think we can get this same type of behavior with a trait instead of mpsc
, by slightly different means though.
- We could then provide a channel implementation and/or "nice pretty printer" implementation of ParseErrorReporting as standard?
I currently have one "nice pretty printer", but it depends on an external library that is partially why I have hesitated to include it in any of these PR's so far (I really appriate how few dependencies grmtools has). I assume at they very least it should have a feature which could be at the very least disabled if not disabled by default, any preference here?
Please feel free to tell me that I have missed something important, as you've thought more deeply about this than I have!
I think the only thing that struck me was the above about nimbleparse not really needing to be affected by this kind of change, (or a similar one using a trait). Another bothersome thing about mpsc
is to really utilize it from another thread it seems likely that other thread also wants the grammar source, which it wouldn't inherently have through the channel alone... I think it would be nice if we collect all the stuff needed to make nice errors together at the same level of abstraction. The ParseErrorReporting
also needs a way to get it. Otherwise, spot on.
What I would like to do is to try and make build() return something else, ASTValidity, which merely has bool, and an GrammarAST and no fn errors(), or fn warnings() and move all that to a trait that gets passed to the builder. ... I still think we can get this same type of behavior with a trait instead of mpsc, by slightly different means though.
What I'm now wondering is whether we make the trait a type parameter. So, for example, we might have something like:
trait ASTErrorReporter {
fn error(...);
fn warning(...);
}
struct ASTBuilder<R: ASTErrorReporter> {
reporter: ASTErrorReporter,
any_error_found: bool,
...
}
impl<R: ASTErrorReporter> for ASTBuilder<R> {
fn blah(...) {
if (...) { self.reporter.error(...); self.any_error_found = true; }
}
fn build(self) -> Result<(), ()> {
if self.any_error_found { Err(()) } else { Ok(()) }
}
}
and then we can provide, at least, one impl along the lines of:
struct NullASTErrorReporter;
impl ASTErrorReporter for NullASTErrorReport {
fn error(...) { }
fn warning(...) { }
}
as well as, perhaps, a simple text error reporter.
A neat thing about this general idea is that people can drop in whatever sort of error reporting they might want (mpsc
channels, text, or nothing!) and they pay a performance price proportional to that. For example, if you use NullASTErrorReporter
you can be fairly hopeful that the optimiser will (in release mode) remove the costs of error reporting entirely. But, even if you do that, build
still returns Err
, so we don't allow complete nonsense to appear to succeed.
[BTW: I don't know if we need to split apart error
and warning
. It's just an example!]
I'll definitely give that a try, I had a lot of trouble trying to do so in CTBuilder
, which is why I resorted to dyn Trait
s there, but I think that was mostly to do with all the parameters we need for conflicts
causing bounds requirements on LRLexTypes::StorageT
, which hopefully won't come into play working with just warnings and errors, since none of them involve that.
I'd be fine with starting simple, and doing PRs for where it's easiest, and then trying the big nasty cases later.
One wrinkle I hadn't considered yet, when switching the build
method to return the new type is how to go from the new type to YaccGrammar
. for ASTWithValidityInfo
we curently have new_from_ast_with_validity_info Unfortunately, that function returns errors the new type won't have. So we can't just do the minor change of changing that ast_validation
parameter to a trait.
I suppose my inclination would be to just leave that API unchanged. With the equivalent for the new type becoming part of the builder chain. Like we could have something like ast_builder.build_ast().build_grammar()
where build_grammar
is the replacement for new_from_ast_with_validity_info
. Nothing particularly difficult, but still I'm curious if you have thoughts on how we want to organize this aspect.
One wrinkle I hadn't considered yet, when switching the build method to return the new type is how to go from the new type to YaccGrammar
Oops, my example didn't do that bit correctly! So that should have been:
trait ASTErrorReporter {
fn error(...);
fn warning(...);
}
struct ASTBuilder<R: ASTErrorReporter> {
reporter: ASTErrorReporter,
any_error_found: bool,
...
}
impl<R: ASTErrorReporter> for ASTBuilder<R> {
fn blah(...) {
if (...) { self.reporter.error(...); self.any_error_found = true; }
}
fn build(self) -> Result<AST, ()> {
if self.any_error_found { Err(()) } else { Ok(...) }
}
}
On that basis, I'm not quite sure I see what problem there might be with the various build
functions we have returning a specific type (AST
/YaccGrammar
/etc.) but maybe I'm missing out on something!
On that basis, I'm not quite sure I see what problem there might be with the various build functions we have returning a specific type (AST/YaccGrammar/etc.) but maybe I'm missing out on something!
It really isn't much of a problem, I'm mostly just worring about the ergonomics, due to paradox of choice from an accumulating number of YaccGrammar::new_*
functions.
It is unfortunately a little more complex than your example, currently we have something like:
// In AST
struct ASTWithValidityInfo {
ast: AST, // May contain a partial invalid AST when errors is non-empty.
errors: Vec<YaccGrammarError>,
warnings: Vec<YaccGrammarWarning>,
}
// Maintains the invariant that it returns Err(errors) if errs.len() > 0
fn new_from_ast_with_validity_info(ast: &ASTWithValidityInfo) -> Result<YaccGrammar, YaccGrammarError>
When we sever ownership of errors/warnings of the AST I was imagining we'd have a different structure like
struct ASTValidity{
ast: AST, // May contain a partial invalid AST when errors are found.
any_errors_found: bool,
}
// In YaccGrammar
// Maintains the invariant that it returns Err(()) if any_errors_found == true
fn new_from_valid_ast(ast: &ASTValidity) -> Result<YaccGrammar, ()>
The main thing i'm worried about here is the ergonomics since we are accumulating quite a number of variations of YaccGrammar::new_*
functions. I don't think anything currently uses all of the flexibility that this affords, e.g. nimbleparse_lsp uses the AST
from this, for formatting it's conflict errors, (but we only run checks for conflicts on the grammar after a valid AST).
So the main thought was that if we had ASTBuilder::build_ast() -> ASTValidity
, and ASTValidity::build_grammar() -> Result<YaccGrammar, ()>
, it seemed like a way to avoid an additional pub YaccGrammar::new_*
function, and it and allows having string -> AST -> Grammar
all within one call chain.
Sorry if this was a bit long.
Edit: I wonder if perhaps renaming what i've been calling ASTValidity
to something like GrammarBuilder
,
as in this case it would be kind of both an AST and a builder for grammars :shrug:
so you could have something like ASTBuilder::new().grammar_src(...).grammar_builder().build()
I'm not sure I like it as it loses the idea that it's just a wrapper around an AST, but it seemed worth considering
So the main thought was that if we had ASTBuilder::build_ast() -> ASTValidity, and ASTValidity::buildgrammar() -> Result<YaccGrammar, ()>, it seemed like a way to avoid an additional pub YaccGrammar::new* function, and it and allows having string -> AST -> Grammar all within one call chain.
I agree that this general idea (and GrammarBuilder
is probably a good name!) is the way forward.
Alright, so i'm going to do one more of these big branches, just to make sure everything works end-to-end. Then i'll try and cherry-pick PRs from that to hopefully more easily reviewable bits. I probably will not make a PR for the complete branch, but will link to it from the individual PRs so I can rebase it on master as things land. That will give me some confidence we aren't running into any unforeseen issues.
Anyhow it seems we can probably close this one for now, and I'll try and get something to look at in the next couple of days.
Here is my initial exploration, for replacing
Vec<YaccGrammarError>
with error channels in cfgrammar, as was kind of discussed in #428 I figured at least it was worth a shot, because it could be done matching the current ownership structure and giving a decent facsimile of the current API when owning both a send/receive channel.A few things tripped me up, particularly
parse_token
andparse_name
need to returnYaccGrammarError
because sometimes we replace (or drop) the error generated by these functions with a more specific one. particularly in the%expect-unused
code path. I need to more carefully inspect the rest of the code in case there could be more cases such as these.