eclipse-ee4j / eclipselink

Eclipselink project
https://eclipse.dev/eclipselink/
Other
200 stars 171 forks source link

NPE in NullPointerException in ExpressionOperator.printCollection since 2.7.11+, 3.0.0+ #2136

Open igormukhin opened 6 months ago

igormukhin commented 6 months ago

Describe the bug

Since switching to EL 2.7.14 we are getting a mysterious NPE at ExpressionOperator.printCollection(). The exception happens only occasionally from time to time (like once a week) on different queries.

It also happend to other people #1867 and #1717

To Reproduce

I don't know how to reproduce it, but it is clearly a bug in the source code. Read further.

UPDATE 10.06.2024 I created a sample project that reproduces the issue: https://github.com/igormukhin/eclipselink-issue-1717-test The problem is when a queries are running in parallel threads and use COALESCE(...) function or CASE WHEN statement.

Expected behavior

Same as before v2.7.11.

UPDATE 10.06.2024 The following problem description and the solution can be ignored as it is invalid.

The Problem

In the class ExpressionOperator there is a field argumentIndices

Up to 2.7.10 the field was used in the method printCollection like this:

    public void printCollection(List<Expression> items, ExpressionSQLPrinter printer) {
        // ... skipped some code ...

        if (this.argumentIndices == null) {
            this.argumentIndices = new int[items.size()];
            for (int i = 0; i < this.argumentIndices.length; i++){
                this.argumentIndices[i] = i;
            }
        }

        // ... skipped some code ...
        for (final int index : argumentIndices) {
            Expression item = items.get(index);
            // ... skipped some code ...
                item.printSQL(printer);
           // ... skipped some code ...
        }
    }

In 2.7.11 the line for (final int index : argumentIndices) was changed to for (int i = 0; i < this.argumentIndices.length; i++) That is causing the exception.

The field argumentIndices can be changed while inside the loop because the loop calls item.printSQL(printer) which in turn can call ExpressionOperator.copyTo where the field argumetIndices on the already used operator instance will be set to null.

I can't explain why it does work most of the time and only occasionally fails but it is clearly the reason for the NPE.

Solution

While it is not the ultimate solution for the underlying problem (the field can be changed), we still can fix it with just by returning back to caching the value of argumentIndices while looping which is already done in 3.1.0-M1 and 4.0.0 but not in 2.7.x, 3.0.x

jnehlmeier commented 5 months ago

The issue doesn't seem to be fixed as noted in latest comment in https://github.com/eclipse-ee4j/eclipselink/issues/1717

2.7.15 has this fix applied applied but is still affected.

igormukhin commented 5 months ago

I was wrong with my PR #2137. My solution helped nothing.

So I just created a sample project that reproduces the issue: https://github.com/igormukhin/eclipselink-issue-1717-test

The problem is the concurrency and COALESCE function or CASE WHEN.

@rfelcman Can you please reopen this issue? I can then try to hunt down the error.

igormukhin commented 5 months ago

The root of all evil here is probably the fact that DatabasePlatform provides the same instance of ExpressionOperator which has a state field (argumentIndexes). This, of course, leads to errors when the same operator is used in different threads.

Currently, that involves ArgumentListExpressionOperator (and its parent ListExpressionOperator) which is used in these expressions:

        addOperator(ExpressionOperator.coalesce());
        addOperator(ExpressionOperator.caseStatement());
        addOperator(ExpressionOperator.caseConditionStatement());

It look like somebody tried to solve the issue years ago (Bug 463042) but not so successfully. We probably should look what happened to these classes in the version 2.7.12 where the problem was introduced. Will take a look later.

igormukhin commented 5 months ago

Looks like it was this commit with changes in ArgumentListFunctionExpression.java that introduced the issues.