Name resolution failures for package names with white space

lwrage commented 4 years ago

Summary

Spaces in package names are not marked as an error in the AADL text editor. In references the spaces are treated as part of the name and must be written exactly the same.

Similarly for other white space, such as tabs and newline.

Steps to Reproduce

The following is valid:

package A:: B
end A:: B;

Environment

OSATE Version: 2.6.0
Operating System: all

AaronGreenhouse commented 4 years ago

Actually it's a bit stranger than described above. If the package declaration contains any amount of whitespace, that is 1 or more of space, newline, tab characters, then references to the package name must also contain 1 or more whitespace characters, but the whitespace doesn't have to match. So, if the package declaration is package A:: B (with three spaces), the references can be A:: B (1 space), A:: B (five spaces),

A::
B

etc. But A::B (with no space) does not match.

What is also strange, is that the closing end for the package doesn't ever seem to match the name, whether the whitespace matches exactly or not.

AaronGreenhouse commented 4 years ago

Furthermore, whitespace before and after the alphanumeric segment are handled separately. Consider the cases:

package A::B::C
package A:: B::C
package A::B ::C 4.. package A:: B ::C

Case (1) requires that there be no whitespace around B in references. Case (2) requires that there be whitespace before B, but not after B in references. Case (3) requires that there be whitespace after B, but not before B in references. Case (4) requires that there be whitespace both before and after B in references.

lwrage commented 4 years ago

Funny. However, for fixing the issue it's sufficient to just validate the package name.

AaronGreenhouse commented 4 years ago

I think the problem is in Aadl2QualifiedNameConverter.toQualifiedName(). The qualified name strings that are passed into it contain spaces if the original source text has whitespace. I think the solution is to remove any leading and trailing whitespace from each segment in the name before creating the actual QualifiedName object.

lwrage commented 4 years ago

That would ignore whitespace. I think it's better to have an error in the editor and have the user fix it.

AaronGreenhouse commented 4 years ago

Update. While the above would work, it's not really the source of the problem. The problem is that the name strings contain spaces in to begin with. The root of that problem seems to be NodeModelUtils.getTokenText(), which inserts one space for each hidden token in the source text. This is deliberate as indicated in the JavaDoc.

    /**
     * This method converts a node to text.
     * 
     * Leading and trailing text from hidden tokens (whitespace/comments) is removed. Text from hidden tokens that is
     * surrounded by text from non-hidden tokens is summarized to a single whitespace.
     * 
     * The preferred use case of this method is to convert the {@link ICompositeNode} that has been created for a data
     * type rule to text.
     * 
     * This is also the recommended way to convert a node to text if you want to invoke
     * {@link org.eclipse.xtext.conversion.IValueConverterService#toValue(String, String, INode)}
     * 
     */
    public static String getTokenText(INode node) {
        if (node instanceof ILeafNode)
            return ((ILeafNode) node).getText();
        else {
            StringBuilder builder = new StringBuilder(Math.max(node.getTotalLength(), 1));
            boolean hiddenSeen = false;
            for (ILeafNode leaf : node.getLeafNodes()) {
                if (!leaf.isHidden()) {
                    if (hiddenSeen && builder.length() > 0)
                        builder.append(' ');
                    builder.append(leaf.getText());
                    hiddenSeen = false;
                } else {
                    hiddenSeen = true;
                }
            }
            return builder.toString();
        }
    }

It seems very strange to me that that Xtext is setup to work like this and yet doesn't handle the spaces that appear in the identifier names.

getTokenText() is being called by LinkingHelper.getCrossRefNodeAsString():

    public String getCrossRefNodeAsString(INode node, boolean convert) {
        String convertMe = NodeModelUtils.getTokenText(node);
        if (!convert)
            return convertMe;
        try {
            String ruleName = getRuleNameFrom(node.getGrammarElement());
            if (ruleName == null)
                return convertMe;
            Object result = valueConverter.toValue(convertMe, ruleName, node);
            return result != null ? result.toString() : null;
        } catch (ValueConverterWithValueException ex) {
            Object result = ex.getValue();
            return result != null ? result.toString() : null;
        } catch (ValueConverterException ex) {
            throw new IllegalNodeException(node, ex);
        }
    }

The convert parameter is true. I still need to see what the "conversion" does. Maybe there is a step here that is supposed to be removing the spaces. I wonder if we are overriding something somewhere that in its default form handles the spaces correctly. It just doesn't make sense to me that by default Xtext would leave spaces in identifier names.

lwrage commented 4 years ago

https://stackoverflow.com/questions/31277134/xtext-controlling-when-whitespace-is-allowed

lwrage commented 4 years ago

Package name (parser rule QPREF in Properties.xtext) is not a terminal but a datatype rule (in Xtext terminology). Same for QCREF.

AaronGreenhouse commented 4 years ago

Tried adding hidden() to the rules PNAME, QPREF, and QCREF.

This works, but the parser error messages are really hard to understand.

I was convinced the Xtext had a mechanism for dealing with the extra whitespace in qualified names, considering that the method getTokenText() deliberately inserts it, and that we must have been overriding that something incorrectly and messing it up. To test this, I build the SmallJava example for the Xtext book. Unfortunately, this example shows that same problems with qualified names as does AADL. There is no default mechanism. Very disappointing and frankly it seems like an oversight in Xtext.

In any case, one clean way to deal with this is through a ValueConvertor. This is what is used by the "convert" process in getCrossRefNodeAsString(). We can install value convertors for rules related to qualified names, and it should solve the problem.

We now have three ways to address this issue:

Update the parse rules to use hidden().
Update the validator to reject qualified names that have spaces in their name segments.
Update the value convertors.

Approach (1) works, but the parser error messages suck. Approach (2) wouldn't be hard, but both (1) and (2) suffer from the problem that they would reject

system S extends x:: -- commment here
   T
end S;

And that seems wrong. This might be avoidable in (2), I'm not sure what information would be available regarding hidden nodes, but it's definitely rejected by (1).

Approach (3) also seems pretty straightforward, but it means that we need special handing of "end X::Y::Z" because that is not handled in the meta model. We also need to check that the rename refactoring works correctly.

Lutz has chosen to go with approach (3).

AaronGreenhouse commented 4 years ago

Need to update the ValueConvertors for

Aadl2.PNAME
Aadl2.FQCREF
Properties.QPREF
Properties.QCREF

AaronGreenhouse commented 4 years ago

I added value convertors that remove spaces for the above rules. This fixed the problem for the most part.

As noted, the end on the package declaration is checked specially. and doesn't work properly yet. Spaces seem to be okay, but not comments. That is,

end x:: 
Y::

 barney;

will correctly match with package x::y::barney, but

end x::
  -- comment
   -- comment
   fred;

does not match with package x::fred.

AaronGreenhouse commented 4 years ago

Need to fix Aadl2JavaValidator.checkEndId(ModelUnit)

AaronGreenhouse commented 4 years ago

Interestingly, Aadl2JavaValidator.checkEndId(Classifier) was already handling this case.

The method checkEndId(ModelUnit) was editing the input name using

        String ss = lln.getText().replaceAll(" ", "").replaceAll("\t", "").replaceAll("\n", "").replaceAll("\r", "");

whereas, checkEndId(Classifier) is using

        String ss = lln.getText().replaceAll("--.*(\\r|\\n)", "").replaceAll(" ", "").replaceAll("\t", "")
                .replaceAll("\n", "").replaceAll("\r", "");

Just need to update the model unit method.

AaronGreenhouse commented 4 years ago

Fixed above.

Added JUnit test for the issue.

osate / osate2