celery / celery

Distributed Task Queue (development branch)
https://docs.celeryq.dev

Celery v5.5.0 Release #9140

Open Nusnus opened 4 months ago

Nusnus commented 4 months ago

Minor Release Overview: v5.5.0

This issue will summarize the status and discussion in preparation for the new release. It will be used to track the progress of the release and to ensure that all the necessary steps are taken. It will serve as a checklist for the release and will be used to communicate the status of the release to the community.

⚠️ Warning: The release checklist is a living document. It will be updated as the release progresses. Please check back often to ensure that you are up to date with the latest information.

Checklist

Release Details

The release manager is responsible for completing the release end-to-end, ensuring that all the necessary steps are taken and that the release is completed in a timely manner. This is usually the owner of the release issue, but the role may be assigned to a different maintainer if necessary.

Release Steps

The release manager is expected to execute the checklist below. The release manager is also responsible for ensuring that the checklist is updated as the release progresses. Any changes or issues should be communicated under this issue for centralized tracking.

Potential Release Blockers

1. Codebase Stability

2. Breaking Changes Validation

A patch release should not contain any breaking changes. The release manager is responsible for reviewing all of the merged PRs since the last release to ensure that there are no breaking changes. If there are any breaking changes, the release manager should discuss with the maintainers to determine the best course of action if an obvious solution is not apparent.

3. Compile Changelog

The release changelog is set in two different places:

  1. The Changelog.rst that uses the RST format.
  2. The GitHub Release auto-generated changelog that uses the Markdown format. This is auto-generated by the GitHub Draft Release UI.

⚠️ Warning: The pre-commit changes should not be included in the changelog.

To generate the changelog automatically, draft a new release on GitHub using a fake new version tag. The tag is only created when the release is published, so we can generate the changelog from the draft and then delete the draft without publishing it, which avoids creating a new tag.
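
As an alternative to the draft release UI (not the procedure described above), GitHub's REST API has a "generate release notes" endpoint that returns the same kind of Markdown without creating a release or a tag. The following is a minimal sketch, assuming a token with sufficient repository permissions; the helper names `build_generate_notes_request` and `fetch_generated_notes` are hypothetical and only for illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_generate_notes_request(owner, repo, tag_name, previous_tag_name, token):
    """Build (but do not send) a request to GitHub's 'generate release notes' endpoint."""
    payload = {"tag_name": tag_name, "previous_tag_name": previous_tag_name}
    return urllib.request.Request(
        url=f"{API_ROOT}/repos/{owner}/{repo}/releases/generate-notes",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def fetch_generated_notes(request):
    """Send the request and return the Markdown body of the generated notes."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["body"]
```

The tag passed as `tag_name` does not need to exist yet, which matches the "fake new version tag" trick used with the draft release.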

3.1 Changelog.rst

Once you have the actual changes, you need to convert them to RST format and add them to the Changelog.rst file. The new version block needs to follow this format:

.. _version-x.y.z:

x.y.z
=====

:release-date: YYYY-MM-DD HH:MM P.M/A.M TimeZone
:release-by: Release Manager Name

Changes list in RST format.

These changes will be reflected in the Change history section of the documentation.
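
The conversion can be scripted. The sketch below is one possible approach, not an official tool: it assumes each auto-generated GitHub entry looks like `* <title> by @<user> in <repo URL>/pull/<number>`, drops pre-commit entries as required by the warning above, and renders the version block in the format shown. The function names `markdown_to_rst_entries` and `render_version_block` are hypothetical.

```python
import re

# Assumed shape of one auto-generated GitHub entry:
#   * <title> by @<user> in https://github.com/<org>/<repo>/pull/<number>
ENTRY_RE = re.compile(
    r"^\*\s+(?P<title>.+?)\s+by\s+@\S+\s+in\s+\S+/pull/(?P<pr>\d+)\s*$"
)


def markdown_to_rst_entries(markdown_notes):
    """Convert Markdown bullets to RST bullets, skipping pre-commit updates."""
    entries = []
    for line in markdown_notes.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # headings, blank lines, footers, and anything unrecognized
        title = match.group("title")
        if "pre-commit" in title.lower():
            continue  # pre-commit changes are excluded from the changelog
        entries.append(f"- {title} (#{match.group('pr')})")
    return entries


def render_version_block(version, release_date, release_by, entries):
    """Render a Changelog.rst version block in the format shown above."""
    lines = [
        f".. _version-{version}:",
        "",
        version,
        "=" * len(version),  # RST underline must match the title length
        "",
        f":release-date: {release_date}",
        f":release-by: {release_by}",
        "",
        *entries,
        "",
    ]
    return "\n".join(lines)
```

Entries that do not match the assumed pattern are skipped silently, so the output should still be compared against the GitHub draft by hand.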

3.2 Changelog PR

The changes to the Changelog.rst file should be submitted as a PR. This PR should be the last PR merged before the release.

4. Release

4.1 Prepare releasing environment

Before moving forward with the release, the release manager should ensure that bumpversion and twine are installed. These are required to publish the release.
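
A quick way to confirm both tools are available is to look them up on PATH. This is a minimal sketch using only the standard library; the function name `missing_release_tools` is hypothetical.

```python
import shutil

REQUIRED_TOOLS = ("bumpversion", "twine")


def missing_release_tools(tools=REQUIRED_TOOLS):
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_release_tools()` means both executables were found; it does not check their versions.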

4.2 Bump version

The release manager should bump the version using the following command:

bumpversion patch

The changes should be pushed directly to main by the release manager.

At this point, the git log should appear somewhat similar to this:

commit XXX (HEAD -> main, tag: vX.Y.Z, upstream/main, origin/main)
Author: Release Manager
Date:   YYY

    Bump version: a.b.c → x.y.z

commit XXX
Author: Release Manager
Date:   YYY

    Added changelog for vX.Y.Z (#1234)

If everything looks good, the bump version commit can be directly pushed to main:

git push origin main --tags

4.3 Publish release to PyPI

The release manager should first build the release by running the following command from the root directory of the repository:

python setup.py clean build sdist bdist_wheel

If the build is successful, the release manager should publish the release to PyPI using the following command:

twine upload dist/celery-X.Y.Z*

⚠️ Warning: The release manager should double check that the release details are correct (project/version) before publishing the release to PyPI.
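
Part of that double check can be automated by inspecting the dist directory before uploading. The sketch below assumes the usual sdist and wheel naming (`<project>-<version>.tar.gz`, `<project>-<version>-<tags>.whl`) and reports anything else, including leftover pre-release builds that a `celery-X.Y.Z*` glob would also match. The function name `unexpected_dist_files` is hypothetical.

```python
from pathlib import Path


def unexpected_dist_files(dist_dir, project, version):
    """Return names of files in dist_dir that are not builds of exactly this version."""
    prefix = f"{project}-{version}"
    unexpected = []
    for path in Path(dist_dir).iterdir():
        if not path.is_file():
            continue
        name = path.name
        remainder = name[len(prefix):]
        # Accept "<prefix>.tar.gz", "<prefix>.zip" and "<prefix>-<wheel tags>.whl";
        # this rejects e.g. "<prefix>rc1.tar.gz", which only shares the prefix.
        if not (name.startswith(prefix) and remainder.startswith(("-", ".tar.gz", ".zip"))):
            unexpected.append(name)
    return sorted(unexpected)
```

If the returned list is not empty, clean the dist directory and rebuild before running twine.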

⚠️ Critical Reminder: Should the released package prove to be faulty or need retraction for any reason, do not delete it from PyPI. The appropriate course of action is to "yank" the release.

Release Announcement

After the release is published, the release manager should create a new GitHub Release and set it as the latest release.


Add Release Notes

On a per-case basis, the release manager may also attach an additional release note to the auto-generated release notes. This is usually done when there are important changes that are not reflected in the auto-generated release notes.

OpenCollective Update

After successfully publishing the new release, the release manager is responsible for announcing it on the project's OpenCollective page. This is to engage with the community and keep backers and sponsors in the loop.

Nusnus commented 4 months ago

Celery v5.5.0b1 released.

JockeTF commented 4 months ago

I've been playing around with quorum queues, and they seem really nice! I had set up a small RabbitMQ 3.13 cluster using a simplified (upgraded, without the entrypoint script, and with manual cluster joining) version of serkodev/rabbitmq-cluster-docker. I threw 65 536 tasks at it over the course of an hour while this little thing was running.

while true; do
  for i in {1..3}; do
    docker kill "rabbitmq$i"
    sleep 30
    docker start "rabbitmq$i"
    sleep 30
  done
done

It seems 16 tasks were lost from two of the kills (I don't know why yet), but the system remained stable. The worst thing I saw beyond that was a handful of our response times slowing down by half a second after a kill. I also noticed that non-default queues for tasks were being created as classic queues instead of quorum queues.

Thanks for this release! It is looking pretty great so far!

Nusnus commented 3 months ago

Celery v5.5.0b1 released.

Celery v5.5.0b2 released.

connorlwilkes commented 3 months ago

What is the current ETA for this release? We are hitting the stuck worker issue with Airflow and celery and would be interested in upgrading to fix that

Nusnus commented 3 months ago

@connorlwilkes

What is the current ETA for this release? We are hitting the stuck worker issue with Airflow and celery and would be interested in upgrading to fix that

Around a few weeks (~4). There's a lot of activity lately so we want all of the new contributions that are being worked on right now ready for the next beta release.

That being said, can you please check how Airflow handles the current beta release? It would be very valuable feedback 🙏

Thanks!

Nusnus commented 2 months ago

Celery v5.5.0b1 released.

Celery v5.5.0b2 released.

Celery v5.5.0b3 released.

Nusnus commented 2 months ago

Celery v5.5.0b4 will be released this week. RC -> Final will be done during October. The due date for Python 3.8/3.13 is the 1st of October, and we will try to support both in v5.5.

Nusnus commented 2 months ago

Celery v5.5.0b1 released.

Celery v5.5.0b2 released.

Celery v5.5.0b3 released.

Celery v5.5.0b4 released.

Nusnus commented 2 months ago

Celery v5.5.0b4 will be released this week. RC -> Final will be done during October. The due date for Python 3.8/3.13 is the 1st of October, and we will try to support both in v5.5.

Unless we receive a significant enough contribution, these are the estimated deadlines for the subsequent release cycles:

So v5.5.0 should be released between the 20th and 26th of October.

That being said, if there will be only very minor changes after RC1 and it won’t make sense to go with RC2, we will consider RC1 to be stable enough and release the last few changes (e.g., English typo fixes in the docs) with the final release, and skip RC2 altogether. The deadline for the final release will remain the same though, and RC1 will have more “air time” instead.

Nusnus commented 2 months ago

What is the current ETA for this release? We are hitting the stuck worker issue with Airflow and celery and would be interested in upgrading to fix that

@connorlwilkes FYI, see my latest updates above.

Nusnus commented 1 month ago

Celery v5.5.0b1 released.

Celery v5.5.0b2 released.

Celery v5.5.0b3 released.

Celery v5.5.0b4 released.

Celery v5.5.0rc1 released.

savvym commented 1 month ago

Error: AttributeError("'gevent._gevent_cgreenlet.Greenlet' object has no attribute 'terminate'"). This error occurs when I terminate a task while using gevent as the concurrency pool, although it doesn't affect the final result.

[2024-10-09 22:02:56,726: ERROR/MainProcess] pidbox command error: AttributeError("'gevent._gevent_cgreenlet.Greenlet' object has no attribute 'terminate'")
Traceback (most recent call last):
  File "/home/savvym/venv/lib/python3.12/site-packages/kombu/pidbox.py", line 102, in dispatch
    reply = handle(method, arguments)
            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/savvym/venv/lib/python3.12/site-packages/kombu/pidbox.py", line 124, in handle_cast
    return self.handle(method, arguments)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/savvym/venv/lib/python3.12/site-packages/kombu/pidbox.py", line 118, in handle
    return self.handlers[method](self.state, **arguments)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/savvym/venv/lib/python3.12/site-packages/celery/worker/control.py", line 149, in revoke
    task_ids = _revoke(state, task_ids, terminate, signal, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/savvym/venv/lib/python3.12/site-packages/celery/worker/control.py", line 224, in _revoke
    request.terminate(state.consumer.pool, signal=signum)
  File "/home/savvym/venv/lib/python3.12/site-packages/celery/worker/request.py", line 423, in terminate
    obj.terminate(signal)
    ^^^^^^^^^^^^^
AttributeError: 'gevent._gevent_cgreenlet.Greenlet' object has no attribute 'terminate'

Reproduction

celery_app.py

from celery import Celery

app = Celery('tasks', 
             broker='redis://127.0.0.1:6379/0')  

app.conf.update(
    task_serializer='json',
    accept_content=['json'],
    result_serializer='json',
    timezone='UTC',
    enable_utc=True,
)

tasks.py

from celery_app import app
import time

@app.task(bind=True)
def long_running_task(self):
    try:
        for i in range(100):
            print(f'Working on {i}...')
            time.sleep(1)  # assumed: keeps the task running long enough to be revoked
    except Exception as e:
        return str(e)
    return 'Task completed'

terminate_task.py

from celery_app import app
from tasks import long_running_task
import time

if __name__ == '__main__':
    result = long_running_task.apply_async()
    time.sleep(5)
    result.revoke(terminate=True, signal='SIGTERM')
    print('Task terminated request sent.')

When I start Celery as follows:
celery -A tasks worker -P gevent -l INFO
and terminate the task by running: python terminate_task.py
this error occurs.

Reason

After debugging Celery, I found that when gevent is used to terminate tasks, pool.terminate_job() kills the greenlet asynchronously. The call obj = self._apply_result() then returns a Greenlet with the dead flag set, and Greenlet has no terminate method.

# celery/worker/request.py:416
def terminate(self, pool, signal=None):
    signal = _signals.signum(signal or TERM_SIGNAME)
    if self.time_start:
        pool.terminate_job(self.worker_pid, signal)
        self._announce_revoked('terminated', True, signal, False)
    else:
        self._terminate_on_ack = pool, signal
    if self._apply_result is not None:
        obj = self._apply_result()  # is a weakref
        if obj is not None:
            obj.terminate(signal)

Solution

Maybe we can add a check to this part to prevent the exception, like this:

# celery/worker/request.py
# Public import path; gevent._gevent_cgreenlet is a private module.
from gevent import Greenlet

def terminate(self, pool, signal=None):
    signal = _signals.signum(signal or TERM_SIGNAME)
    if self.time_start:
        pool.terminate_job(self.worker_pid, signal)
        self._announce_revoked('terminated', True, signal, False)
    else:
        self._terminate_on_ack = pool, signal
    if self._apply_result is not None:
        obj = self._apply_result()  # is a weakref
        if obj is not None:
            if isinstance(obj, Greenlet) and obj.dead: # is a greenlet with dead flag
                return
            obj.terminate(signal)
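
A variant of the same idea that avoids importing gevent in request.py is to check for the method rather than the type. The sketch below is self-contained and only illustrates the guard: the stand-in classes `DeadGreenletStandIn` and `ProcessResultStandIn` and the helper `terminate_if_supported` are hypothetical and do not exist in Celery or gevent.

```python
import weakref


class DeadGreenletStandIn:
    """Hypothetical stand-in for a finished greenlet: has `dead`, lacks `terminate`."""
    dead = True


class ProcessResultStandIn:
    """Hypothetical stand-in for a result object that does support `terminate`."""

    def __init__(self):
        self.signals = []

    def terminate(self, signal):
        self.signals.append(signal)


def terminate_if_supported(apply_result_ref, signal):
    """Call obj.terminate(signal) only if the weakly referenced object provides it.

    Returns True if terminate() was called, False otherwise.
    """
    obj = apply_result_ref() if apply_result_ref is not None else None
    terminate = getattr(obj, "terminate", None)
    if callable(terminate):
        terminate(signal)
        return True
    return False
```

With this guard, a dead greenlet is skipped instead of raising AttributeError, while objects that do implement terminate() are handled as before.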

Nusnus commented 1 month ago

Celery v5.5.0b4 will be released this week. RC -> Final will be done during October. The due date for Python 3.8/3.13 is the 1st of October, and we will try to support both in v5.5.

Unless we receive a significant enough contribution, these are the estimated deadlines for the subsequent release cycles:

  • Release Candidate 1: Between 6th and 12th of October (next week).
  • Release Candidate 2: ~1 week afterward (Mid-October).
  • Final Release: ~1 week afterward.

So v5.5.0 should be released between the 20th and 26th of October.
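
The date arithmetic behind that estimate can be checked with a short sketch. The year 2024 is an assumption taken from the dates that appear elsewhere in this thread, and "~1 week afterward" is treated as exactly seven days.

```python
from datetime import date, timedelta

# Year assumed to be 2024, based on dates mentioned elsewhere in this thread.
RC1_WINDOW = (date(2024, 10, 6), date(2024, 10, 12))
STEP = timedelta(weeks=1)  # "~1 week afterward" between consecutive milestones


def shift(window, steps):
    """Move a (start, end) window forward by a number of one-week steps."""
    start, end = window
    return start + steps * STEP, end + steps * STEP


rc2_window = shift(RC1_WINDOW, 1)    # Release Candidate 2
final_window = shift(RC1_WINDOW, 2)  # Final Release
print(final_window[0].isoformat(), final_window[1].isoformat())
# prints: 2024-10-20 2024-10-26
```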

That being said, if there will be only very minor changes after RC1 and it won’t make sense to go with RC2, we will consider RC1 to be stable enough and release the last few changes (e.g., English typo fixes in the docs) with the final release, and skip RC2 altogether. The deadline for the final release will remain the same though, and RC1 will have more “air time” instead.

RC2 will be delayed to next week. Considering adding RC3, will update next week. This will push the final release to the very end of October.

The motivation is to allow contributors enough space to finalize the current open issues so we can release the next pre-release without any known issues.

Nusnus commented 1 month ago

The motivation is to allow contributors enough space to finalize the current open issues so we can release the next pre-release without any known issues.

Few of these are:

And hopefully but optionally,

Nusnus commented 1 month ago

The motivation is to allow contributors enough space to finalize the current open issues so we can release the next pre-release without any known issues.

Few of these are:

And hopefully but optionally,

@thedrow @auvipy please prioritize your attention to the above issues so I can continue with the release procedure. Also notice my previous comment.

Thanks!

vachillo commented 1 month ago

let me know if this should be logged elsewhere. there is a discussion with the same issue/tech stack described.

we are still seeing the same redis connection issues with the latest rc1 release. The celery worker fails to reconnect to redis after raising SERVER_CLOSED_CONNECTION_ERROR and it doesn't seem to reconnect.

python 3.11 celery 5.5.0rc1 kombu 5.4.2

trace:

Traceback (most recent call last):
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/celery/worker/consumer/consumer.py\", line 340, in start
    blueprint.start(self)
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/celery/bootsteps.py\", line 116, in start
    step.start(parent)
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/celery/worker/consumer/consumer.py\", line 759, in start
    c.loop(*c.loop_args())
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/celery/worker/loops.py\", line 97, in asynloop
    next(loop)
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py\", line 308, in create_loop
    poll_timeout = fire_timers(propagate=propagate) if scheduled else 1
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/asynchronous/hub.py\", line 149, in fire_timers
    entry()
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/asynchronous/timer.py\", line 70, in __call__
    return self.fun(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/asynchronous/timer.py\", line 137, in _reschedules
    return fun(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/transport/redis.py\", line 554, in maybe_restore_messages
    return channel.qos.restore_visible(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/transport/redis.py\", line 410, in restore_visible
    with Mutex(client, self.unacked_mutex_key,
  File \"/usr/local/lib/python3.11/contextlib.py\", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/kombu/transport/redis.py\", line 166, in Mutex
    lock_acquired = lock.acquire(blocking=False)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/lock.py\", line 210, in acquire
    if self.do_acquire(token):
       ^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/lock.py\", line 226, in do_acquire
    if self.redis.set(self.name, token, nx=True, px=timeout):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/commands/core.py\", line 2333, in set
    return self.execute_command(\"SET\", *pieces, **options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/ddtrace/contrib/redis/patch.py\", line 153, in _instrumented_execute_command
    return _run_redis_command(ctx=ctx, func=func, args=args, kwargs=kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/ddtrace/contrib/redis/patch.py\", line 130, in _run_redis_command
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/client.py\", line 548, in execute_command
    return conn.retry.call_with_retry(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/retry.py\", line 65, in call_with_retry
    fail(error)
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/client.py\", line 552, in <lambda>
    lambda error: self._disconnect_raise(conn, error),
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/client.py\", line 538, in _disconnect_raise
    raise error
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/retry.py\", line 62, in call_with_retry
    return do()
           ^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/client.py\", line 549, in <lambda>
    lambda: self._send_command_parse_response(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/client.py\", line 524, in _send_command_parse_response
    conn.send_command(*args)
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/connection.py\", line 476, in send_command
    self.send_packed_command(
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/connection.py\", line 449, in send_packed_command
    self.check_health()
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/connection.py\", line 441, in check_health
    self.retry.call_with_retry(self._send_ping, self._ping_failed)
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/retry.py\", line 67, in call_with_retry
    raise error
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/retry.py\", line 62, in call_with_retry
    return do()
           ^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/connection.py\", line 431, in _send_ping
    if str_if_bytes(self.read_response()) != \"PONG\":
                    ^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/connection.py\", line 512, in read_response
    response = self._parser.read_response(disable_decoding=disable_decoding)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/_parsers/resp2.py\", line 15, in read_response
    result = self._read_response(disable_decoding=disable_decoding)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/_parsers/resp2.py\", line 25, in _read_response
    raw = self._buffer.readline()
          ^^^^^^^^^^^^^^^^^^^^^^^
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/_parsers/socket.py\", line 115, in readline
    self._read_from_socket()
  File \"/opt/webapp/api/.venv/lib/python3.11/site-packages/redis/_parsers/socket.py\", line 68, in _read_from_socket
    raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
redis.exceptions.ConnectionError: Connection closed by server.

Nusnus commented 1 month ago

The motivation is to allow contributors enough space to finalize the current open issues so we can release the next pre-release without any known issues.

Few of these are:

And hopefully but optionally,

Unfortunately, the remaining issues (above and more) were not closed before the expected deadline, so we’ll postpone the release date accordingly.

Fortunately, this was completed on time:

Remaining Steps/Issues:

  1. Kombu: https://github.com/celery/kombu/pull/2167
  2. Kombu: Release v5.5.0rc2
  3. Celery: https://github.com/celery/celery/pull/9374
  4. Celery: https://github.com/celery/celery/pull/9207
  5. Celery: Release v5.5.0rc2
  6. Fix critical issues only.
  7. Kombu: Release v5.5.0
  8. Celery: Release v5.5.0

Estimated New Release Date

2nd week of November.

Nusnus commented 1 month ago

Kombu: Release v5.5.0rc2

@thedrow please check the current main of Kombu against the latest main of Celery and run all of the unit, integration and smoke tests. Once you confirm here that both of our main branches work together, I’ll release Kombu v5.5.0rc2 and complete my own release checklist to make sure the version is viable for the next RC pre-release.

Once this is done, you (@thedrow) can update #9207 and we can focus on getting it merged so I can complete the rest of the release steps.

@auvipy FYI

Thanks! 🙌

Nusnus commented 1 month ago

@Nusnus All tests are passing. You can release 5.5.0rc2.

Thank you! Done.

I also found a new bug in Celery main/rc1 during my release checklist, compared to v5.4.0. See #9383 for more info.

EDIT: Finished my checklist and found another bug: #9385

Nusnus commented 1 month ago

Remaining Release Blockers (29 Oct 2024)

  1. Celery: https://github.com/celery/celery/pull/9383
  2. Celery: https://github.com/celery/celery/issues/9385
  3. Celery: https://github.com/celery/celery/pull/9207
  4. Celery: Release v5.5.0rc2
  5. Fix critical issues only.
  6. Kombu: Release v5.5.0
  7. Celery: Release v5.5.0

Estimated New Release Date

~2nd~ 4th week of November.

@thedrow @auvipy FYI

Nusnus commented 2 weeks ago

Remaining Release Blockers (29 Oct 2024)

  1. Celery: Missing CPendingDeprecationWarning from Celery v5.4.0 #9383
  2. Celery: Bug: Log "Global QoS is disabled. Prefetch count in now static.” appears for non-amqp transports #9385
  3. Celery: Native Delayed Delivery in RabbitMQ #9207
  4. Celery: Release v5.5.0rc2
  5. Fix critical issues only.
  6. Kombu: Release v5.5.0
  7. Celery: Release v5.5.0

Estimated New Release Date

~2nd~ 4th week of November.

@thedrow @auvipy FYI

All of the release blockers are gone!! Woohoo! 🥳

New release schedule

@thedrow @auvipy I want to declare a complete code freeze from tomorrow’s RC2 release except for bug/doc fixes. The two weeks between RC2 and the release should be strictly reviewed (for both Kombu and Celery). Anything noncritical should be postponed to the next version using the v5.5.1 milestone. Dependency upgrades can only be accepted for patch releases, etc.

If there’s any exception, I’d appreciate it if you could tag me for review so I can be on top of the entire release process. This is one hell of a version :)

Nusnus commented 2 weeks ago

Celery v5.5.0b1 released.

Celery v5.5.0b2 released.

Celery v5.5.0b3 released.

Celery v5.5.0b4 released.

Celery v5.5.0rc1 released.

Celery v5.5.0rc2 released.

Nusnus commented 3 days ago

Due to a new release blocker (#9433), a few more minor issues that need to be resolved, and my being unavailable for a few days to fix the blocker, I am postponing the release by another week, from the 2nd to the 9th of Dec.

@thedrow @auvipy FYI

Nusnus commented 9 hours ago

Celery v5.5.0b1 released.

Celery v5.5.0b2 released.

Celery v5.5.0b3 released.

Celery v5.5.0b4 released.

Celery v5.5.0rc1 released.

Celery v5.5.0rc2 released.

Celery v5.5.0rc3 released.