Azure / azure-storage-java

Microsoft Azure Storage Library for Java
https://docs.microsoft.com/en-us/java/api/overview/azure/storage
MIT License

SAXParser concurrency bug causes list results to appear empty #546

Closed ThomasMarquardt closed 4 years ago

ThomasMarquardt commented 4 years ago

DETAILS: There is a concurrency bug in the XML parsing logic which can cause the results of a List Blobs operation to appear empty.

Please refer to Utility.java#L131. Utility.saxParserThreadLocal is a ThreadLocal with a SAXParserFactory member variable. The factory is assigned in ThreadLocal.initialValue, which then keeps using the member variable reference. initialValue is called once for each thread, but the ThreadLocal instance, and therefore its factory member, is shared by all threads. This makes the following interleaving possible: Thread A has called "factory.setNamespaceAware(true)" but has not yet called "return factory.newSAXParser()" when it yields to Thread B. Thread B calls "factory = SAXParserFactory.newInstance()" and then yields back to Thread A. Thread A now calls "return factory.newSAXParser()", but the factory member variable refers to Thread B's new instance, on which setNamespaceAware(true) has not yet been called. Thread A therefore gets a SAXParser that is not namespace aware.
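One way to close this window is to keep the factory in a local variable, so no other thread can swap it out between configuration and use. The following is only a minimal sketch of that idea, not necessarily the change that was made in Utility.java; the class name SafeSaxParsers and its get() method are hypothetical.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical holder class for illustration; the real field lives in Utility.java.
final class SafeSaxParsers {
    private static final ThreadLocal<SAXParser> PARSER = new ThreadLocal<SAXParser>() {
        @Override
        protected SAXParser initialValue() {
            // The factory is a local variable, so each call configures and uses
            // its own instance; another thread cannot replace it in between.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            try {
                return factory.newSAXParser();
            } catch (SAXException | ParserConfigurationException e) {
                throw new IllegalStateException("Unable to create SAXParser", e);
            }
        }
    };

    private SafeSaxParsers() {
    }

    // Returns the calling thread's parser, creating it on first use.
    static SAXParser get() {
        return PARSER.get();
    }
}
```

With this shape, the only state shared between threads is the ThreadLocal itself, which is designed for concurrent access.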

With a SAXParser that is not namespace aware, the list always appears to contain 0 items. Please refer to BlobListHandler.java#L84. BlobListHandler.startElement compares the localName parameter with "Blob" on line 84 to find each blob item in the list. When a SAXParser is not namespace aware, the localName parameter is always empty. The documentation for DefaultHandler.startElement describes the localName parameter as follows: "The local name (without prefix), or the empty string if Namespace processing is not being performed." Testing confirmed that the parser behaves as documented. Thus, when the SAXParser is not initialized correctly (is not namespace aware), the results of a List Blobs operation will always appear to be empty.
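The effect can be reproduced in isolation. The sketch below counts "Blob" elements by localName, the same way the handler is described above, once with a namespace-aware parser and once without. The class name LocalNameDemo and the XML payload are made up for illustration and are not the actual List Blobs response format.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the storage library.
final class LocalNameDemo {
    // Made-up payload with two Blob elements.
    static final String XML =
            "<Blobs><Blob><Name>a.txt</Name></Blob><Blob><Name>b.txt</Name></Blob></Blobs>";

    // Counts elements whose localName equals "Blob".
    static int countBlobs(boolean namespaceAware) {
        final int[] count = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attributes) {
                            // localName is "" when namespace processing is off.
                            if ("Blob".equals(localName)) {
                                count[0]++;
                            }
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("namespace aware:     " + countBlobs(true));
        System.out.println("not namespace aware: " + countBlobs(false));
    }
}
```

With the parser bundled in the JDK, the namespace-aware run should count 2 and the other run 0, because only qName carries the element name when namespace processing is off.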

TESTS: The test testSAXParserConcurrency was added to validate that the SAXParser returned by Utility.getSAXParser is correctly configured, even when called under a highly concurrent load.

This test will occasionally fail without the fix, but the probability of failure is very low. To increase the likelihood of failure, you can add calls to sleep before and after the call to "factory.setNamespaceAware(true)" in Utility.java, as shown below:

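// Test-only helper used to widen the race window.
private static void sleep(long l) {
    try {
        Thread.sleep(l);
    }
    catch (InterruptedException e) {
        // Preserve the interrupt status instead of silently dropping it.
        Thread.currentThread().interrupt();
    }
}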

/**
 * Thread local for SAXParser.
 */
private static final ThreadLocal<SAXParser> saxParserThreadLocal = new ThreadLocal<SAXParser>() {
    SAXParserFactory factory;
    @Override public SAXParser initialValue() {
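        factory = SAXParserFactory.newInstance();
        // Keep the newly assigned factory unconfigured for a while.
        sleep(100);
        factory.setNamespaceAware(true);
        // Give other threads time to reassign the shared factory field before it is used.
        sleep(10);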
        try {
            return factory.newSAXParser();
        } catch (SAXException e) {
            throw new RuntimeException("Unable to create SAXParser", e);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException("Check parser configuration", e);
        }
    }
};
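
For readers who want to run a similar check outside the library's test suite, a concurrent check along these lines could be used. This is a rough sketch and not the testSAXParserConcurrency test itself: the class ParserConcurrencyCheck and both of its methods are hypothetical, and the parser source is passed in as a Supplier so that any implementation can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical harness for illustration only.
final class ParserConcurrencyCheck {
    // Requests a parser from 'source' in 'threads' concurrent tasks and returns
    // how many of the returned parsers are not namespace aware.
    static int countMisconfigured(Supplier<SAXParser> source, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Boolean> task = () -> source.get().isNamespaceAware();
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(task));
            }
            int misconfigured = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    misconfigured++;
                }
            }
            return misconfigured;
        } catch (Exception e) {
            throw new IllegalStateException("Concurrency check failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    // A race-free parser source: the factory never leaves this method.
    static SAXParser newNamespaceAwareParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser();
        } catch (Exception e) {
            throw new IllegalStateException("Unable to create SAXParser", e);
        }
    }
}
```

Calling ParserConcurrencyCheck.countMisconfigured(ParserConcurrencyCheck::newNamespaceAwareParser, 16) should return 0. A supplier that wraps the ThreadLocal shown above, with the sleeps in place, would be expected to produce a nonzero count on most runs.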