spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0

MimeTypeUtils.parseMimeType returns null MIME type in case of high concurrency #23211

Closed hdeadman closed 5 years ago

hdeadman commented 5 years ago

Affects: 5.2.0.M2+ (problem observed in 5.2.0.M2, probably in any release after 2/5/2019)

I am getting an NPE at line 550 of MediaType.java in spring-web-5.2.0.M2.jar (currently line 563 in master) because the cache is apparently returning a null MIME type from MimeTypeUtils.parseMimeType().

    public static MimeType parseMimeType(String mimeType) {
        return cachedMimeTypes.get(mimeType);
    }

and the method below doesn't account for a null return from the LRU cache cachedMimeTypes

    public static MediaType parseMediaType(String mediaType) {
        MimeType type;
        try {
            type = MimeTypeUtils.parseMimeType(mediaType);
        }
        catch (InvalidMimeTypeException ex) {
            throw new InvalidMediaTypeException(ex);
        }
        try {
            return new MediaType(type.getType(), type.getSubtype(), type.getParameters()); // NPE here when type is null
        }
        catch (IllegalArgumentException ex) {
            throw new InvalidMediaTypeException(mediaType, ex.getMessage());
        }
    }

The ConcurrentLruCache in MimeTypeUtils must have a bug, because that is the only way a null type could be returned. The ConcurrentLruCache.get(key) method returns in two places, and one of them is returning null. The return in the write-lock block looks safe, but the return in the read-lock block can return null if the internal queue contains the same key twice: once the duplicate works its way out of the queue, the internal cache map no longer has the value, and null is returned.

        public V get(K key) {
            this.lock.readLock().lock();
            try {
                if (this.queue.remove(key)) {
                    this.queue.add(key);
                    return this.cache.get(key); // can return null if the key was evicted from the map
                }
            }
            finally {
                this.lock.readLock().unlock();
            }
            this.lock.writeLock().lock();
            try {
                if (this.queue.size() == this.maxSize) {
                    K leastUsed = this.queue.poll();
                    if (leastUsed != null) {
                        this.cache.remove(leastUsed);
                    }
                }
                V value = this.generator.apply(key);
                this.queue.add(key); // adds a second copy if another thread re-added the key under the read lock
                this.cache.put(key, value);
                return value;
            }
            finally {
                this.lock.writeLock().unlock();
            }
        }

Assume two threads go through the read-lock block at the same time: one of them will remove the key and add it back to the front of the queue; the other will fall through and wait for the write lock. Once that thread gets the write lock, it will also add the key to the queue, and now the key is in the queue twice. Eventually the duplicate key may work its way to least-used and get removed from the cache. At that point, all subsequent requests for that key return null, because the queue still has the key but the hash map no longer has the value.
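The corrupted end state described above can be replayed deterministically in a single thread. This is a hypothetical stand-in (a plain ArrayDeque and HashMap play the roles of the cache's internal queue and map, and the two threads' steps are executed in the problematic order):

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class DuplicateKeyDemo {

    static String replayRace() {
        Queue<String> queue = new ArrayDeque<>();    // stands in for this.queue
        Map<String, String> cache = new HashMap<>(); // stands in for this.cache

        // Initial state: "text/html" is cached once.
        queue.add("text/html");
        cache.put("text/html", "parsed(text/html)");

        // Thread A (read block): queue.remove succeeds, key is re-added.
        queue.remove("text/html");
        queue.add("text/html");

        // Thread B (write block): its earlier queue.remove missed, so it
        // fell through and now adds the same key a second time.
        queue.add("text/html");
        cache.put("text/html", "parsed(text/html)");
        // The queue now holds the key twice; the map holds it once.

        // Later, the cache fills up and evicts the least-recently-used key,
        // which happens to be the first copy of the duplicated "text/html".
        String leastUsed = queue.poll();
        cache.remove(leastUsed);

        // Subsequent lookup: the key is still in the queue, so the read
        // block returns cache.get(key) -- which is now null.
        if (queue.contains("text/html")) {
            return cache.get("text/html");
        }
        return "cache miss";
    }

    public static void main(String[] args) {
        System.out.println("lookup result: " + replayRace()); // prints "lookup result: null"
    }
}
```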

This is pretty serious, because once it starts happening for a particular MIME type, I think the application needs to be restarted.

bclozel commented 5 years ago

Reading your analysis, it seems we could fix that problem and even improve the performance by doing this:

public V get(K key) {
    this.lock.readLock().lock();
    try {
        if (this.queue.remove(key)) {
            this.queue.add(key);
            return this.cache.get(key);
        }
    }
    finally {
        this.lock.readLock().unlock();
    }
    this.lock.writeLock().lock();
    try {
        // retrying in case of concurrent reads on the same key
        if (this.queue.remove(key)) {
            this.queue.add(key);
            return this.cache.get(key);
        }
        if (this.queue.size() == this.maxSize) {
            K leastUsed = this.queue.poll();
            if (leastUsed != null) {
                this.cache.remove(leastUsed);
            }
        }
        V value = this.generator.apply(key);
        this.queue.add(key);
        this.cache.put(key, value);
        return value;
    }
    finally {
        this.lock.writeLock().unlock();
    }
}

In this case, we avoid adding the same key again and thus returning null when trying to fetch the value from the cache. As a consequence, we also avoid re-generating and re-caching the same MIME type (which should save CPU cycles). It is true that concurrent reads like this are likely to trigger more write locks and increase contention. I'm wondering if we should replace the LRU strategy with a Least Frequently Used (LFU) strategy to improve throughput.
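Replaying the same interleaving against the patched write block shows the retry absorbing the other thread's re-add instead of inserting a duplicate. Again a hypothetical single-threaded stand-in, not the actual Spring class:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class RetryFixDemo {

    // Returns how many copies of the key the queue holds after the replay.
    static long replayRaceWithRetry() {
        Queue<String> queue = new ArrayDeque<>();
        Map<String, String> cache = new HashMap<>();

        queue.add("text/html");
        cache.put("text/html", "parsed(text/html)");

        // Thread A (read block): remove + re-add, as before.
        queue.remove("text/html");
        queue.add("text/html");

        // Thread B (patched write block): the retry re-checks the queue and
        // finds the key thread A just re-added, so it re-adds exactly one
        // copy and would return the cached value -- no duplicate is created.
        if (queue.remove("text/html")) {
            queue.add("text/html");
            // would return cache.get("text/html") here
        }

        return queue.stream().filter("text/html"::equals).count();
    }

    public static void main(String[] args) {
        System.out.println("copies in queue: " + replayRaceWithRetry()); // prints "copies in queue: 1"
    }
}
```

With a single copy in the queue, a later eviction removes both the queue entry and the map entry together, so the read block can no longer observe a key without a value.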

Thanks for the analysis, it's really useful. More importantly, thanks for using our milestones in a traffic-heavy environment; this really helps the Spring community, especially for this type of issue.

bclozel commented 5 years ago

I've reproduced the issue in a JMH benchmark (increasing the thread count is essential). When concurrency is high, the LRU cache might yield lower throughput when compared to a parser without any cache. On the other hand, the allocation rate is much lower. When the concurrency is low, the cached version is at least 4x faster.

In the various web benchmarks that we previously ran, the cached version had a positive impact on the performance (throughput and especially latency); the contention and concurrency on this specific part is probably not as high as the JMH benchmark.

We could improve this even more by switching to a different parser implementation. I've tested a different parser implementation that's approximately 50% faster. When combined with a cache, there's not much difference anymore.

Since the LRU cache implementation is internal, we can still reconsider and use an LFU cache instead.
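For reference, a minimal LFU cache could look like the sketch below. This is hypothetical (not Spring's implementation) and not thread-safe; a real replacement would still need the locking considerations discussed above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal LFU sketch: evicts the entry with the smallest access count
// when the cache is full. Single-threaded for clarity.
public class SimpleLfuCache<K, V> {

    private final int maxSize;
    private final Function<K, V> generator;
    private final Map<K, V> cache = new HashMap<>();
    private final Map<K, Integer> counts = new HashMap<>();

    public SimpleLfuCache(int maxSize, Function<K, V> generator) {
        this.maxSize = maxSize;
        this.generator = generator;
    }

    public V get(K key) {
        V value = this.cache.get(key);
        if (value != null) {
            this.counts.merge(key, 1, Integer::sum); // bump access count on a hit
            return value;
        }
        if (this.cache.size() == this.maxSize) {
            // Evict the least frequently used key.
            K leastUsed = null;
            int min = Integer.MAX_VALUE;
            for (Map.Entry<K, Integer> entry : this.counts.entrySet()) {
                if (entry.getValue() < min) {
                    min = entry.getValue();
                    leastUsed = entry.getKey();
                }
            }
            this.cache.remove(leastUsed);
            this.counts.remove(leastUsed);
        }
        value = this.generator.apply(key);
        this.cache.put(key, value);
        this.counts.put(key, 1);
        return value;
    }
}
```

Unlike the LRU, a hot key keeps its high count even if another thread briefly touches cold keys, which is why an LFU policy may hold up better under the kind of contention seen in the benchmark.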