jetty / jetty.project

Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more
https://eclipse.dev/jetty
Other
3.86k stars 1.91k forks source link

Clarify CyclicTimeouts API #12308

Open gregw opened 1 month ago

gregw commented 1 month ago

Jetty version(s) 12.1.x

Enhancement Description

Currently the API for CyclicTimeouts is used in two different ways.

Some usages are modelled on the CyclicTimeout class, where the schedule method needs to be called to cancel the current timeout and set a new one into the future:

Other usages just call schedule once and then mutate the value returned from getExpireNanoTime in the knowledge that by moving it into the future, it can cancel the current expire, or schedule a new one even if it has expired.

Both usages are problematic. The first usage is most similar to the style of CyclicTimeout and feels natural to use. However, because the Iterator<Expirable> is not controlled by the CyclicTimeouts class itself (other than to call remove), it is entirely possible that an Expirable to be "scheduled" without ever being passed in a call to schedule(Expirable). If it is returned by the iterator, then it can be scheduled by returning a value from getExpireNanoTime.

The second style of usage is also strange, because a call to schedule(Expirable) also doesn't really need the Expirable to be passed. That will provoke a first look at that Expirable, but it was just added to the iterator, then it would eventually be seen and scheduled regardless of any call to schedule .

Perhaps schedule(Expirable) should be renamed add(Expirable)? Or maybe more extensive API changes are necessary?

sbordet commented 1 month ago

The 3 examples you link are examples of a new entity being scheduled, so they are not about cancelling timeouts.

The new entry is scheduled not added because the collection is managed elsewhere, and the scheduling has the only purpose of letting CyclicTimeouts know when this new entity expires -- it may be sooner than any other existing entity.

So the semantic of "schedule when created", with subsequent recalculation of the earliest wakeup based on pushed-to-the-future expireNanoTime is really a schedule semantic, not add.

The other usage is when an existing entity changes its expiration, possibly earlier than expected. In this case the entity is scheduled again to allow for the recalculation of the earliest wakeup. This is done, for example, with HTTP2Session.StreamTimeouts when HTTP2Stream.setIdleTimeout() is called: we do not know when the earliest wakeup is, so we schedule again the entity to allow recalculation.

The collection of entities is managed elsewhere, and CyclicTimeouts just manages their expiration. If an entity goes away, it is removed from the collection, and won't be seen expiring. If it expires, then it should be removed from the collection, which could be done directly (by returning true from onExpired()) or indirectly (by returning false from onExpired() but then arranging for the entity to be removed from the collection).

Since the collection of entities is managed elsewhere, CyclicTimeouts needs to be informed of the entities the first time, and again only if the entity expiration could change to be earlier -- if the entity's expireNanoTime gets pushed into the future there is nothing more to do.

In summary, I think the "add" semantic is not the right one, it is really a "schedule" semantic, and I would not make changes to the APIs.

gregw commented 1 month ago

@sbordet sorry but I totally disagree. I wrote the original CyclicTimeout class and I'm confused by this API. What hope do other have.

The ability for a new entity to be scheduled via inclusion in the iterator, without schedule ever being called is non clear, nor documented nor intuitive. In HttpConnection example, it appears that the scheduled HttpChannel is recycled on the connection, so it is scheduled on every send. The scheduling is in no way associated with the iteration (which is itself creating a singleton list on every call for HTTP/1, which is a worry).

I agree add is not the right name, but neither is schedule. Perhaps reschedule?

At the very least, this class needs more javadoc.

gregw commented 1 month ago

@sbordet @lorban

the following "test" illustrates the confusion with the API:

    {
        long now = NanoTime.now();

        List<CyclicTimeouts.Expirable> entities = new ArrayList<>(List.of(
            new SimpleExpirable("one", now + 1),
            new SimpleExpirable("two", now + 2),
            new SimpleExpirable("three", now + 3)
        ));

        CountDownLatch latch = new CountDownLatch(3);
        AtomicReference<Runnable> update = new AtomicReference<>();

        CyclicTimeouts<CyclicTimeouts.Expirable> timeouts = new CyclicTimeouts<>(scheduler)
        {
            {
                update.set(this::onTimeoutExpired);
            }

            @Override
            protected Iterator<Expirable> iterator()
            {
                return entities.iterator();
            }

            @Override
            protected boolean onExpired(Expirable expirable)
            {
                System.err.println(expirable);
                latch.countDown();
                return true;
            }
        };

        //0
        //1 timeouts.schedule(entities.get(0));
        //2 timeouts.schedule(new SimpleExpirable("fake", now));
        //3 timeouts.schedule(new SimpleExpirable("fake", now + TimeUnit.SECONDS.toNanos(10)));
        //4 update.get().run();

        assertTrue(latch.await(1, TimeUnit.SECONDS));
    }

If schedule is:

So this illustrates that while the class is working, the schedule call is not really correctly named nor formulated:

So it is really hard to know when you should call schedule:

That is hard to explain. Perhaps the method should be called "update(Expirable)" and we say that it should be called whenever an expirable changes when it is scheduled? But then the passed expirable is still mostly ignored.

Or better yet, should we just add a new method called "update()" (//4), that scans the iterator and sets the next expiry time. This method should be called whenever there is a substantive change to the collection that was not done by expiry.

lorban commented 1 month ago

@gregw I agree with you, I find this API definitely confusing and error-prone.

The fact that CyclicTimeouts doesn't own the Expirables (like the java.util collections do) is surprising. And the schedule(Expirable) method has a confusing name, as well as a convoluted contract has too many side effects IMHO.

Personally, I'd rather have an API where once Expirables have been handed to CyclicTimeouts, I'm not supposed to modify them directly. But honestly, I thought I understood what problem this class was solving and how until this issue was raised; now I'm uncertain about both.

sbordet commented 1 month ago

The collection is managed externally, that is the use case of CyclicTimeouts, e.g. exchanges in a connection, etc.

CyclicTimeouts manages the timeouts of those entities.

A reschedule() would be forced to re-iterate the whole collection to find the entity that expire earliest, verify that the current wakeup is up-to-date.

Instead, schedule(entity) tells that there is a new entity, and CyclicTimeouts only has to check that one to verify that it is not the earliest.

What would the same code do with a plain Scheduler?

It would: A) add the entity to the collection (so yes, the collection is external and not managed by the scheduler), and B) call scheduler.schedule(entityJustAdded). That is where the API came from. Just CyclicTimeouts is more efficient than a Scheduler.

Do you get confused by Scheduler too? If not, why not? Just think of CyclicTimeouts as a Scheduler and there is no confusion, it's exactly the same.

sbordet commented 1 month ago

@lorban and I discussed this:

We could rename schedule(entity) to scheduleIfEarliest(entity) or similar, but I am not sure if it's better than just improved javadocs.

CyclicTimeouts is optimized for the use cases we have, hence not that generic.

gregw commented 1 month ago

@sbordet

Do you get confused by Scheduler too? If not, why not?

Because a Scheduler doesn't:

Just think of CyclicTimeouts as a Scheduler and there is no confusion, it's exactly the same. It is absolutely not the same. The current CTs.schedule(Expirable) and S.schedule(Runnable, Duration) have very different semantics. It is also very different to the CyclicTimeout.schedule(Duration) method, as there is one and only one expiry for each non-cancelled call to schedule

The problem is that is very hard to write the javadoc of exactly when CT.schedule(Expirable) must be called.

Currently CT.schedule(Expirable) MUST be called when:

This can be simplified to CT.schedule(Expirable) SHOULD be called when:

But that results in calling CT.schedule(Expirable) from within onExpired(Expirable), which I was doing in my PR and you pointed out that it is not necessary. So perhaps it can be documented as CT.schedule(Expirable) SHOULD be called when:

@sbordet @lorban

What about renaming schedule(Expirable) to update(Expirable) and include an assert in the implementation:


    /**
     * Update the internal schedule of this {@code CyclicTimeouts} if the passed {@link Expirable} is the earliest expiring element of the {@link #iterator()}.
     * Should be called whenever adding a new element or mutating the expiry time outside of the scope of {@link Expirable#onExpired(Expirable)}.
     */
    public void update(T expirable)
    {
        assert StreamSupport.stream(((Iterable<T>)this::iterator).spliterator(), false).anyMatch(e -> e.equals(expirable));