GeoWebCache / geowebcache

GeoWebCache is a tile caching server implemented in Java that provides various tile caching services such as WMS-C, TMS, WMTS, Google Maps, MS Bing and more.
https://www.geowebcache.org

SeederThreadPoolExecutor seemingly ignoring "maxPoolSize" setting and is stuck on 16 thread maximum #1220

Open mbosecke opened 4 months ago

mbosecke commented 4 months ago

No matter how many seed tasks I submit to GWC, it will only ever run 16 at a time. After looking at the code, it seems that SeederThreadPoolExecutor is configured with a corePoolSize of 16 and a maxPoolSize of 32; however, it's also using an underlying "LinkedBlockingQueue" to store queued tasks, and the docs say:

Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.)

Effectively, the queue is infinitely large, so it never fills up, and the ThreadPoolExecutor therefore never grows the pool beyond the original 16 core threads.
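The quoted behavior is easy to reproduce in isolation. A standalone sketch (the 16/32/LinkedBlockingQueue values mirror the configuration described above, not the actual GWC class):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {
    public static void main(String[] args) throws Exception {
        // core 16, max 32, unbounded queue -- same shape as SeederThreadPoolExecutor
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                16, 32, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);

        // Submit 100 blocking tasks: far more than corePoolSize.
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> {
                try { release.await(); } catch (InterruptedException e) { }
            });
        }

        // The first 16 submissions created core workers; everything else was
        // queued, so the pool never grows toward maxPoolSize (32).
        System.out.println("pool size = " + pool.getPoolSize()); // prints "pool size = 16"

        release.countDown();
        pool.shutdown();
    }
}
```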

(Bonus: it would also be nice to expose a way for end-users to change the min/max size of the thread pool.)

mbosecke commented 4 months ago

Proposed solution:

Some ideas can be found in how the built-in "Executors" java class constructs various specialised thread pools.
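For reference, the pattern "Executors" uses for a fixed pool is simply core == max over an unbounded queue, which sidesteps the problem above entirely. A minimal check:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class FixedPoolPattern {
    public static void main(String[] args) {
        // newFixedThreadPool(n) constructs a ThreadPoolExecutor with
        // corePoolSize == maximumPoolSize over an unbounded LinkedBlockingQueue,
        // so the "max never kicks in" behavior is harmless there by design.
        ThreadPoolExecutor fixed =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(32);
        System.out.println(
                "core == max: " + (fixed.getCorePoolSize() == fixed.getMaximumPoolSize()));
        fixed.shutdown();
    }
}
```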

mbosecke commented 4 months ago

To provide some context on my use case, I have a machine with 80 cores that I'd like to leverage to seed the hell out of some layers.

aaime commented 4 months ago

@mbosecke at first I was surprised, but it turns out that indeed, you're right: when executing "submit", the code creates a new thread only if there are fewer than "corePoolSize" threads; otherwise it tries to submit to the queue first, and only if the queue refuses is an attempt made to go beyond corePoolSize (with eventual failure if the max pool size is reached).

Given that seeds typically happen in bursts, I agree with your proposal: it's probably better to set core and max to the same value, add a timeout, and let the threads die when unused after some time (e.g., one minute?).

Given that SeederThreadLocalTransferExecutor is used only in GeoWebCache, I'd suggest creating a no-argument constructor that uses a sane default and can be controlled via a system/environment/servlet context variable:

private static final int POOL_SIZE;

static {
    POOL_SIZE =
            Optional.ofNullable(GeoWebCacheExtensions.getProperty("gwc.seeder.pool.size"))
                    .map(Integer::parseInt)
                    .orElse(Runtime.getRuntime().availableProcessors() * 2);
}

public SeederThreadPoolExecutor() {
    super(POOL_SIZE, POOL_SIZE, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), tf);
    allowCoreThreadTimeOut(true);
}

The creation of the pool in the Spring app context would use this default constructor. The SeederThreadLocalTransferExecutor would be modified accordingly, and also use the default constructor in the Spring context.

@smithkm @groldan opinions?

mbosecke commented 4 months ago

In the case where GWC is being used within GeoServer, does this thread pool get implicitly used during normal operations (e.g., a WMS request comes in and a metatile gets cached)? If so, I'd argue that we may want to keep the thread pool primed with some idle threads somehow. However, I suspect that's not the case, and it's only used when someone is explicitly seeding, in which case I think it's reasonable for it to dwindle down to 0 threads over the course of a minute.

aaime commented 4 months ago

Yes, ideally I'd love to have the pool stay at corePoolSize, expand up to the max under load, and start adding to the queue only after all the workers are busy. But I don't see how to get that: the pool offers tasks to the queue first, and only expands the set of workers if the queue refuses, which does not seem that useful to me.
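For the record, one known way to get close to that behavior (this is the trick Apache Tomcat's internal TaskQueue uses, not anything GeoWebCache does today, and the class and names below are hypothetical) is a queue that refuses offers while the pool can still grow, so the executor spawns workers up to maxPoolSize before queueing:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: a queue that rejects offers while the pool has room to grow.
// The executor interprets the rejected offer as "add a worker instead".
public class EagerQueue extends LinkedBlockingQueue<Runnable> {
    private volatile ThreadPoolExecutor pool;

    void setPool(ThreadPoolExecutor pool) { this.pool = pool; }

    @Override
    public boolean offer(Runnable r) {
        if (pool != null && pool.getPoolSize() < pool.getMaximumPoolSize()) {
            return false; // force the executor to spawn a new worker
        }
        return super.offer(r);
    }

    public static void main(String[] args) throws Exception {
        EagerQueue queue = new EagerQueue();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 16, 60, TimeUnit.SECONDS, queue,
                // Rejection here means "pool at max and offer refused in a race":
                // fall back to queueing via put(), which bypasses our offer().
                (r, ex) -> {
                    try { queue.put(r); } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
        queue.setPool(pool);

        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 50; i++) {
            pool.execute(() -> {
                try { release.await(); } catch (InterruptedException e) { }
            });
        }
        // With 50 blocked tasks, the pool grows past core (4) up to max (16)
        // before anything is queued.
        System.out.println("pool size = " + pool.getPoolSize());

        release.countDown();
        pool.shutdown();
    }
}
```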

aaime commented 1 month ago

It seems nobody has so far found a good way out. Given this, I'd be OK with setting the minimum pool size equal to the maximum to better leverage machines with many cores. I've seen it as a common practice to spin up a powerful but short-lived machine to perform seed jobs on a specific layer; this issue is basically at odds with that: the dedicated hardware won't be used properly if there are more than 8 cores (the typical peak-throughput point is concurrent requests at 2-4 times the number of cores).

@smithkm @groldan thoughts?