ModelDriven / Alf-Reference-Implementation

Open-source implementation of the Action Language for fUML (Alf) specification.
30 stars 2 forks source link

Issue #75 Parser should identify all syntax errors in an Alf text #84

Closed abstratt closed 5 years ago

abstratt commented 5 years ago
abstratt commented 5 years ago

Thanks for the feedback, @seidewitz. I applied the first one, as it was a minor syntactical change, and merged the PR. I added your two other points to the issue itself (#75) so I can address that in one or more follow-up PRs, to make it easier to see those changes separately from what has been done here so far.

abstratt commented 5 years ago

@seidewitz Couple questions for you. Will split into two comments.

The method org.modeldriven.alf.fuml.units.ModelNamespaceImpl.resolveModelFile needs to be updated to report parsing errors based on the new mechanism. (However, TokenMgrError exceptions for lexical errors still need to be caught at this point.)

I can read that two ways:

a) ModelNamespaceImpl.resolveModelFile() should return a (potentially ill-formed/incomplete) UnitDefinition even in the case of parsing errors. It could dump the collection of parsing problems to the console as it currently does when an exception is caught (it would not expose the problems found otherwise), but it would not longer return null in those cases. And in case of other kinds of exceptions/errors being thrown (like a TokenMgrErro), it would continue to do what it does currently (dumps exception message and returns null).

or

b) It should expose some protocol that allowed its clients to enumerate all the problems found.

I am currenly assuming you meant "a". Please let me know if I got it backwards.

Also, after reading your comment, I noticed I need to update AlfOpaqueBehaviorExecution.parse() as well so it no longer expects parsing problems to be reported as a ParseException, but should bail with a FumlException in the presence of any collected problems, just as it does already with ContraintViolations. Correct?

abstratt commented 5 years ago

@seidewitz Part 2:

There needs to be an exception handler in UnitDefinitionEOF, too. If the first token of the unit is bad, then this causes an exception in UnitDefinition, which is caught. However, it seems that trying to then consume the next token, to check for the EOF, causes an exception to be thrown in UnitDefinitionEOF, which is not caught. For example, trying to parse activit A() { } using UnitDefinitionEOF results in a ParseException being thrown. (I have not checked to see if the same problem occurs with the other "EOF" productions.)

We added these new methods to Parser to expose support for error recovery:

  public UnitDefinition parseUnitDefinition(boolean eof);
  public Block parseStatementSequence(boolean eof);
  public Expression parseExpression(boolean eof);

to replace the existing production methods, which do not support error recovery. They were not removed from the interface, as I believe you want to preserve them (something about minimizing the delta to the research branch), however as far as I know, considering I would update AlfOpaqueBehaviorExecution to no longer call StatementSequenceEOF(), they will no longer be invoked (and I think they should eventually be removed from the Parser interface.

Can you still see a problem similar to what you describe if you use exclusively the new parseXYZ methods?

BTW, I would like to add deprecations to those methods. :+1: or :-1:?

  @Deprecated
  /** Will be removed. Call #parseUnitDefinition(false) instead */
  public UnitDefinition UnitDefinition() throws ParseException;
  @Deprecated
  /** Will be removed. Call #parseUnitDefinition(true) instead */
  public UnitDefinition UnitDefinitionEOF() throws ParseException;
  @Deprecated
  /** Will be removed. Call #parseStatementSequence(false) instead */
  public Block StatementSequence() throws ParseException;
  @Deprecated
  /** Will be removed. Call #parseStatementSequence(true) instead */
  public Block StatementSequenceEOF() throws ParseException;
  @Deprecated
  /** Will be removed. Call #parseExpression(false) instead */
  public Expression Expression() throws ParseException;
  @Deprecated
  /** Will be removed. Call #parseExpression(true) instead */
  public Expression ExpressionEOF() throws ParseException;
abstratt commented 5 years ago

Oh, part 3: here are the changes I intend to commit to AlfOpaqueBehaviorExecution:

diff --git a/org.modeldriven.alf.fuml.impl/src/org/modeldriven/alf/fuml/impl/environment/AlfOpaqueBehaviorExecution.java b/org.modeldriven.alf.fuml.impl/src/org/modeldriven/alf/fuml/impl/environment/AlfOpaqueBehaviorExecution.java
index bf07dc9..4f8eabb 100644
--- a/org.modeldriven.alf.fuml.impl/src/org/modeldriven/alf/fuml/impl/environment/AlfOpaqueBehaviorExecution.java
+++ b/org.modeldriven.alf.fuml.impl/src/org/modeldriven/alf/fuml/impl/environment/AlfOpaqueBehaviorExecution.java
@@ -17,10 +17,11 @@
 import org.modeldriven.alf.fuml.mapping.FumlMappingFactory;
 import org.modeldriven.alf.fuml.mapping.units.MemberMapping;
 import org.modeldriven.alf.mapping.MappingError;
-import org.modeldriven.alf.parser.ParseException;
 import org.modeldriven.alf.parser.ParserImpl;
+import org.modeldriven.alf.parser.ParsingProblem;
 import org.modeldriven.alf.parser.TokenMgrError;
 import org.modeldriven.alf.syntax.common.ConstraintViolation;
+import org.modeldriven.alf.syntax.common.SourceProblem;
 import org.modeldriven.alf.syntax.common.impl.ElementReferenceImpl;
 import org.modeldriven.alf.syntax.statements.Block;
 import org.modeldriven.alf.syntax.units.ActivityDefinition;
@@ -115,27 +116,21 @@
        parser.setFileName(behavior.name);

        try {
-           Block body = parser.StatementSequenceEOF();
-           
+           Block body = parser.parseStatementSequence(true);
+
+            Collection<ParsingProblem> parsingProblems = parser.getProblems();
+            checkForProblems("Parsing problems", parsingProblems);
+
            UnitDefinition unit = makeUnitForBehavior(behavior, body);

            NamespaceDefinition modelScope = RootNamespace.getModelScope(unit);
             modelScope.deriveAll();

             Collection<ConstraintViolation> violations = modelScope.checkConstraints();
-            if (violations.isEmpty()) {
-               return unit;
-            } else {
-               StringBuffer msg = new StringBuffer("Constraint violations:\n");
-                for (ConstraintViolation violation: violations) {
-                    msg.append(violation + "\n");
-                }
-                throw new FumlException(msg.toString());
-            }
+            checkForProblems("Constraint violations", violations);
+            return unit;
         } catch (TokenMgrError e) {
             throw new FumlException(e.getMessage());
-        } catch (ParseException e) {
-            throw new FumlException(e.getMessage());
         }
    }

@@ -198,5 +193,14 @@
        return (Element)((org.modeldriven.alf.fuml.impl.uml.Element)mapping.
                getElement()).getBase();
    }
-   
+
+    private static void checkForProblems(String title, Collection<? extends SourceProblem> problems) {
+        if (!problems.isEmpty()) {
+            StringBuffer msg = new StringBuffer(title + ":\n");
+            for (SourceProblem problem: problems) {
+                msg.append(problem + "\n");
+            }
+            throw new FumlException(msg.toString());
+        }
+    }  
 }
seidewitz commented 5 years ago

Part 1

(a) is correct.

Part 2

Sorry, when I reviewed the PR before, I hadn't properly considered that you intended for the old parsing methods to be deprecated and replaced by the new methods. I am actually not comfortable with this. As I mentioned before, I want to minimize changes to the Parser interface. Changing the parsing method calls is just too likely to break things that we don't need to break at this point.

So, please remove the new parse methods from the interface and add exception handlers to the EOF methods. I believe this will then still cover all the cases that had been covered by the methods you added.

For now, I think that the only change needed to the Parser interface is to add the getProblems method. Once we have everything working again with the new error recovery approach, then we can consider potentially refactoring the interface further.

Part 3

I had forgotten about AlfOpaqueExpression. I haven't actually used that in a while. However, you are right, it should be updated.

Your proposed updates look fine (except, per the above, don't change the call to StatementEOF).

abstratt commented 5 years ago

Hi @seidewitz, thanks for the quick reply.

So, please remove the new parse methods from the interface and add exception handlers to the EOF methods. I believe this will then still cover all the cases that had been covered by the methods you added.

There is a challenge here. The production methods are generated by JavaCC, so we cannot control their signatures. They must be declared to throw ParseException, which doesn't make much sense if we agree we don't want them to ever throw ParseExceptions but instead do a best effort in error recovery.

Of course, we could leave the throws ParseException strictly for legacy compatibility purposes but not actually ever throw those exceptions. Is that what you had in mind? To me this has always been an either/or thing - either you use the protocol that supports error recovery and never throws exceptions, or you use the protocol that fails with an exception on first error but don't get error recovery. We can provide both, but their usage models/use cases are not the same.

seidewitz commented 5 years ago

Yes, leave the throw declarations on. That way, some implementations of the interface can use the exception while others can implement getProblems (which should be defaulted to return an empty list). Of course, this means that calling code will still need an exception handler for ParseException, as well as checking getProblems. This is a bit ugly, but, as I noted, we once everything is working we can consider improving this on another step.

seidewitz commented 5 years ago

Actually, let me reconsider that. The new methods really do make more sense for the new approach. But, if we introduce them, I just want to make sure that everything is updated properly and we don't introduce problems that are not flagged statically (the changes needed to both resolveModelFile and AlfOpaqueExpression are cases in point, but there are other places, too, not necessarily all in this code base).

Deprecation of the old methods isn't quite right, because their functionality has changed. Even when called directly, they will (generally) not throw ParseException any more, so code that is not changed to call getProblems will break, even though it will still compile.

So, let's just go all the way and remove the old methods from the interface now. This will mean that any existing code that uses the old methods will get compilation errors from the new interface, which will highlight exactly the places that need to be changed. And it is probably best to just bite the bullet and make the changes sooner rather than later.

Sorry for going back and forth on this. I have other things on my mind, so it took a while to give it an extra think or two!

abstratt commented 5 years ago

@seidewitz See an example of a call site updated to use the new methods in Parser here:

https://github.com/ModelDriven/Alf-Reference-Implementation/commit/d09e8d7c60e29b8be9c1012441dd7c13e41bf632#diff-c6d3d2630dcbe97ae7d867e851395cddR121