HtmlUnit / htmlunit

HtmlUnit is a "GUI-Less browser for Java programs".
https://www.htmlunit.org
Apache License 2.0
875 stars 172 forks source link

AbstractDomNodeList#getLength() throws NPE due to DomHtmlAttributeChangeListenerImpl clearing cachedElements_ during execution #882

Closed CryDeb closed 3 weeks ago

CryDeb commented 1 month ago

In org.htmlunit.html.AbstractDomNodeList, the method getLength() can throw a NullPointerException (NPE) when getNodes() returns null. This happens because DomHtmlAttributeChangeListenerImpl clears the cache by setting cachedElements_ to null in a different thread.

The issue occurs in the following code from the getNodes() method:

private List<E> getNodes() {
    if (cachedElements_ == null) {
        if (node_ == null) {
            cachedElements_ = new ArrayList<>();
        } else {
            cachedElements_ = provideElements();
        }
    }
    return cachedElements_;
}

The DomHtmlAttributeChangeListenerImpl class contains the following method, which is responsible for clearing the cache:

private static final class DomHtmlAttributeChangeListenerImpl {
    private void clearCache() {
        if (nodeList_ != null) {
            final AbstractDomNodeList<?> nodes = nodeList_.get();
            if (nodes != null) {
                nodes.cachedElements_ = null
            }
        }
    }
}

Root Cause: The clearCache() method is called in a different thread, and it sets cachedElements_ to null while another thread is executing getNodes(). This results in a race condition, where the null check in getNodes() passes, but before the return statement is reached, cachedElements_ is set to null by another thread, leading to a NullPointerException in getLength().

Expected Behavior: The getLength() method should handle concurrent modification of cachedElements_ gracefully and not throw a NullPointerException.

Actual Behavior: A NullPointerException is thrown when cachedElements_ is cleared by another thread during the execution of getNodes().

Suggested Fix: Introduce synchronization or use thread-safe mechanisms to manage the cache in AbstractDomNodeList to avoid race conditions between getNodes() and clearCache().

Example getNodes():

private synchronized List<E> getNodes() {
    if (cachedElements_ == null) {
        if (node_ == null) {
            cachedElements_ = new ArrayList<>();
        } else {
            cachedElements_ = provideElements();
        }
    }
    return cachedElements_;
}

Example clearCache():

        private void clearCache() {
            if (nodeList_ != null) {
                final AbstractDomNodeList<?> nodes = nodeList_.get();
                if (nodes != null) {
                    synchronized (nodes) {
                        nodes.cachedElements_ = null;
                    }
                }
            }
        }
rbri commented 1 month ago

Have deployed a new snapshot that might reduce the problem. Will work on that during the next days, but i like to avoid synchronization as much as possible because we already have some risk of deadlocks and maybe also an performance impact.

CryDeb commented 1 month ago

Returning the assigned value directly after initializing it reduces the risk of cachedElements_ being overwritten between the assignment and the return. Once the value is set, it won’t be modified during the method execution.

To prevent cachedElements_ from being cleared (set to null) between the null check and the return, you can store the value of cachedElements_ in a temporary variable. This ensures that you return a valid, non-null reference, even if another thread nullifies cachedElements_ after the check but before the return.

I believe the issue of cachedElements_ being cleared after assigning it to the temporary variable can likely be overlooked, though that's just an assumption on my part.

Let me know if you'd like further tweaks!

Here’s how the modified code could look like:

private List<E> getNodes() {
    List<E> shortLivedCache = cachedElements_;
    if (shortLivedCache == null) {
        return cachedElements_ = provideElements();
    }
    return shortLivedCache;
}
rbri commented 1 month ago

@CryDeb thanks a lot for the detailed issue - it's always fun to work issues like that. Have tried to implement a solution without synchronizing the access. Please have a look at test with the latest snapshot.

rbri commented 1 month ago

@CryDeb looks like we had a race condition with our comments :-D. Will build a new snapshot now.

rbri commented 1 month ago

@CryDeb snapshot is available

CryDeb commented 1 month ago

@rbri Thanks a lot! I tested the snapshot you provided, and it worked as expected. The race condition is tricky to test, but our code, which previously failed due to the issue, now passes with this fix. From my side, everything looks good, and the issue can be closed as far as I'm concerned.

However, I still think there might be a potential risk with the current implementation where getNodes() could return null in rare cases. To fully mitigate this, I'd recommend revisiting the proposed change I mentioned earlier. Specifically, storing the cached value in a local variable could help ensure it never returns null even if another thread modifies cachedElements_. But I do know that it is less performant.

Still, thank you very much for the fix.

rbri commented 1 month ago

@CryDeb M... now i see you point. Will change that... Thanks....

rbri commented 1 month ago

@CryDeb have done a next version (closer to your one) but hopefully i have catched one more case. Please have a look...

Regarding that

return cachedElements_ = provideElements();

i was not sure - my feeling is that the assignment and the return are two separate operations and therefore this is another whole. Do you know more about this?

rbri commented 1 month ago

have updated the snapshot again

CryDeb commented 3 weeks ago

@rbri Thanks for the update! I do like the new solution, and after testing it locally, I can confirm that it seems to work as well.

Regarding the return cachedElements_ = provideElements();, I initially thought the return statement would directly return the assigned value. However, as mentioned in the Specification (Java 23, Java8), it’s actually the value of the variable that gets returned. So, avoiding this approach in this case makes perfect sense to me, especially since it also improves readability.

rbri commented 3 weeks ago

@CryDeb thanks for all the support with this. The fix is part of the latest release already - Enjoy