statisticsnorway / java-vtl

An open source Java implementation of the Validation Transformation Language, based on the VTL 1.1 draft specification. The implementation follows the JSR-223 Java Scripting API and exposes a simple connector interface that can be implemented to integrate with any data store. VTL is a standard language for defining validation and transformation rules (a set of operators with their syntax and semantics) for any kind of statistical data.
http://java-vtl.org
Apache License 2.0

Boolean expression in filter should be normalized #107

Closed (hadrienk closed this issue 5 years ago)

hadrienk commented 5 years ago

Filter expressions whose first operand is a literal always evaluate to false, for example: filter "literal" = variable

bjornandre commented 5 years ago

Seems to work on the develop branch. See this test:

    @Test
    public void testFilterOnStringLiteral() throws Exception {
        Dataset ds1 = StaticDataset.create()
                .addComponent("id", Role.IDENTIFIER, String.class)
                .addComponent("text", Role.MEASURE, String.class)
                .addPoints("1", "include")
                .addPoints("2", "ignore")
                .build();

        bindings.put("ds1", ds1);
        engine.eval("ds2 := [ds1] {" +
                "  filter \"include\" = ds1.text" +
                "}"
        );

        DataPoint point = ((Dataset) bindings.get("ds2")).getData().findFirst().get();
        assertThat(point)
                .extracting(VTLObject::get)
                .containsExactly("1", "include");
    }

hadrienk commented 5 years ago

I wasn't clear enough. The problem happens during filter propagation, so a single operation works fine. The problem resides in the FilterSpecification conversion.

The FilterSpecification is modelled as column, operator and operand, where the operand is either another FilterSpecification or a literal. In the VTL parser, however, Boolean expressions have the form operand operator operand (X = Y), where both operands can be literals ("a" = "b" is valid).
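
To illustrate the kind of normalization the issue title refers to, here is a minimal sketch that rewrites "literal OP variable" into "variable OP' literal" so that it fits a column/operator/operand shape. All type and method names below (FilterNormalizer, Op, Variable, Literal, Comparison, normalize) are hypothetical and are not the java-vtl API. Note that ordering operators have to be mirrored when the operands swap sides, while = and <> stay the same.

```java
public class FilterNormalizer {

    /** Comparison operators. Hypothetical names, not taken from java-vtl. */
    enum Op {
        EQ, NE, LT, LE, GT, GE;

        /** Operator to use once the two operands have swapped sides. */
        Op mirrored() {
            switch (this) {
                case LT: return GT;
                case LE: return GE;
                case GT: return LT;
                case GE: return LE;
                default: return this; // EQ and NE are symmetric
            }
        }
    }

    interface Operand {}

    /** A reference to a column, e.g. ds1.text. */
    static final class Variable implements Operand {
        final String name;
        Variable(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    /** A constant value, e.g. "include" or 5. */
    static final class Literal implements Operand {
        final Object value;
        Literal(Object value) { this.value = value; }
        @Override public String toString() { return String.valueOf(value); }
    }

    /** Parser-style Boolean expression: operand operator operand. */
    static final class Comparison {
        final Operand left;
        final Op op;
        final Operand right;

        Comparison(Operand left, Op op, Operand right) {
            this.left = left;
            this.op = op;
            this.right = right;
        }

        @Override public String toString() { return left + " " + op + " " + right; }
    }

    /**
     * Puts the variable on the left when the expression is "literal OP variable".
     * Every other shape is returned unchanged; a caller would have to decide
     * separately whether such an expression can be pushed down at all.
     */
    static Comparison normalize(Comparison c) {
        if (c.left instanceof Literal && c.right instanceof Variable) {
            return new Comparison(c.right, c.op.mirrored(), c.left);
        }
        return c;
    }

    public static void main(String[] args) {
        Comparison eq = new Comparison(new Literal("include"), Op.EQ, new Variable("text"));
        Comparison lt = new Comparison(new Literal(5), Op.LT, new Variable("x"));
        System.out.println(normalize(eq)); // prints: text EQ include
        System.out.println(normalize(lt)); // prints: x GT 5
    }
}
```

With a step like this in front of the conversion, the filter "include" = ds1.text and the filter ds1.text = "include" would produce the same column/operator/operand triple. Expressions with two literals or two variables do not fit that triple and are deliberately left untouched in this sketch.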

bjornandre commented 5 years ago

Hmm. It is still not clear to me. What about the following test case? This still works, and involves FilterSpecification conversion:

    @Test
    public void testFilterOnStringLiteral() throws Exception {
        Dataset ds1 = StaticDataset.create()
                .addComponent("id", Role.IDENTIFIER, String.class)
                .addComponent("text", Role.MEASURE, String.class)
                .addPoints("1", "include")
                .addPoints("2", "ignore")
                .build();

        bindings.put("ds1", ds1);
        engine.eval("ds2 := [ds1] {" +
                "  filter true" +
                "}" +
                " ds3 := [ds2] {" +
                "  filter \"include\" = ds2.text" +
                "}"
        );
        DataPoint point = ((Dataset) bindings.get("ds3")).getData().findFirst().get();
        assertThat(point)
                .extracting(VTLObject::get)
                .containsExactly("1", "include");
    }

hadrienk commented 5 years ago

You are right, my bad. This is handled in the VtlFilteringConverter.

        // We only support filter in the form of variable OP literal.
        if ((leftOperand instanceof VariableExpression && rightOperand instanceof LiteralExpression) ||
                (leftOperand instanceof LiteralExpression && rightOperand instanceof VariableExpression)) {
            VariableExpression variableExpression = (VariableExpression) (leftOperand instanceof VariableExpression ?
                    leftOperand : rightOperand);
            LiteralExpression literalExpression = (LiteralExpression) (leftOperand instanceof LiteralExpression ?
                    leftOperand : rightOperand);