boto / botocore

The low-level, core functionality of boto3 and the AWS CLI.
Apache License 2.0

Memory leak tests use substantially more memory on Python 3.13 #3205

Open AdamWill opened 1 week ago

AdamWill commented 1 week ago

Describe the bug

We're landing Python 3.13 in Fedora Rawhide. As part of this, awscli2 (which has a vendored botocore) was rebuilt, but the build failed because the memory leak tests for the vendored botocore failed. (The memory leak tests are disabled on the botocore package for unrelated reasons, so we didn't see this there.) On further investigation, it seems memory isn't exactly leaking, but the 'ceiling' memory consumption is much higher than on Python 3.12: it levels out at ~10.75 MiB vs. ~1.75 MiB, about 6 times as much.

Expected Behavior

The tests in leak/test_resource_leaks.py should pass.

Current Behavior

The test_create_single_waiter_memory_constant and test_create_single_paginator_memory_constant tests fail because they see more than 10 MiB of memory usage, which is the ceiling configured in the test code.

Reproduction Steps

Run the leak/test_resource_leaks.py tests on Python 3.13. I'm fairly sure it's Python 3.13 that causes the issue, because I ran the same builds and tests against a Fedora Rawhide environment from two days ago - just before Python 3.13 landed - and saw substantially lower memory usage. Python 3.13 and associated rebuilds should be the only significant difference there.

Possible Solution

No response

Additional Information/Context

I hacked up the tests a bit so I could pass in the number of runs for each of the two failing tests (normally 100) as an environment variable and have the tests report the memory usage. Here's what I found:

Python 3.12 (memory used, in bytes)

   20 tries - 1048576
  100 tries - 1703936
 1000 tries - 1835008
10000 tries - 1835008

Python 3.13 (memory used, in bytes)

   20 tries - 2883584
   50 tries - 7340032
  100 tries - 10616832
 1000 tries - 11272192
10000 tries - 11272192
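
For anyone wanting to try something similar, here is a minimal sketch of the kind of hack described above: read the run count from an environment variable and report peak memory afterwards. Everything in it is an assumption for illustration. The variable name `LEAK_TEST_RUNS` and the helpers `runs_from_env` and `peak_bytes` are hypothetical and are not part of botocore's test suite, and the workload is a stand-in for creating a waiter or paginator. It uses `tracemalloc`, which only tracks Python-level allocations, so its numbers would not be comparable to the figures above.

```python
import os
import tracemalloc


def runs_from_env(var="LEAK_TEST_RUNS", default=100):
    """Read the iteration count from a (hypothetical) environment variable."""
    raw = os.environ.get(var)
    return int(raw) if raw else default


def peak_bytes(action, runs):
    """Call action() `runs` times and return the peak traced allocation size."""
    tracemalloc.start()
    try:
        for _ in range(runs):
            action()
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


if __name__ == "__main__":
    runs = runs_from_env()
    # Stand-in workload; the real tests create a waiter or paginator instead.
    peak = peak_bytes(lambda: [0] * 1000, runs)
    print(f"{runs:>5} tries - {peak}")
```

Running it as `LEAK_TEST_RUNS=1000 python sketch.py` (file name assumed) would print one line in the same "tries - bytes" shape as the lists above.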

I initially ran the tests with awscli2's vendored botocore, which is quite old, but ripping that out and replacing it with 1.34.121 or 1.34.125 makes no difference: I still see 11272192 bytes used at 10,000 tries. So we level out at 11272192 bytes on Python 3.13 but 1835008 bytes on Python 3.12, which is about 6.14 times as much on 3.13.
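
To make the arithmetic easy to check, this short snippet converts the two plateau figures from bytes to MiB and computes their ratio. The variable names are just for illustration; the byte counts are the ones reported above.

```python
MIB = 1024 * 1024

plateau_312 = 1_835_008   # bytes at 10,000 tries on Python 3.12
plateau_313 = 11_272_192  # bytes at 10,000 tries on Python 3.13

print(plateau_312 / MIB)                     # 1.75
print(plateau_313 / MIB)                     # 10.75
print(round(plateau_313 / plateau_312, 2))   # 6.14
```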

I'll try to run this through a profiler so we can see what's actually using the additional memory, but I can't do that immediately because memray has various deps that failed the Python 3.13 rebuild 😅. I figured it was worth flagging, at least, even if it's not fixable or the issue ultimately lies outside botocore. For practical purposes in Fedora, I will patch the test's ceiling to 20 MiB, since clearly we're not really leaking memory here, just hitting a higher ceiling than the test expects.
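
As a rough sanity check on the ceiling change, the sketch below compares the 100-try figures reported above against a 10 MiB and a 20 MiB ceiling. It assumes the test simply compares one measured byte count against the ceiling; `within_ceiling` is a hypothetical helper, not the check the test code actually performs.

```python
MIB = 1024 * 1024

# Bytes measured at 100 tries (the normal run count), from the lists above.
measured = {"3.12": 1_703_936, "3.13": 10_616_832}


def within_ceiling(used_bytes, ceiling_mib):
    """Return True if the measured usage fits under the ceiling (in MiB)."""
    return used_bytes <= ceiling_mib * MIB


for version, used in measured.items():
    print(version, within_ceiling(used, 10), within_ceiling(used, 20))
# 3.12 True True
# 3.13 False True
```

Under that assumption, the Python 3.13 figure at 100 tries is only 131072 bytes (0.125 MiB) over the 10 MiB ceiling, and the 11272192-byte plateau sits comfortably under 20 MiB.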

SDK version used

1.34.125

Environment details (OS name and version, etc.)

Fedora Rawhide

tim-finnigan commented 1 week ago

Thanks for reaching out. The team is aware of the increased memory usage in Python 3.13, and is tracking discussion around testing here: https://github.com/boto/botocore/pull/3185. There is a strong possibility that there is a bug in 3.13 that is causing the 5-10x memory usage increase you observed. The team will continue researching and likely report their findings to the Python maintainers.