Closed e17ae51f-bf3c-4ea4-be2c-67503164f70f closed 15 years ago
Python's garbage collector holds the GIL during collection and doesn't provide any method of interruption or concurrency with other Python threads within a single Python VM. This can be a problem for realtime applications. In the worst case, the garbage collector takes time linear in the number of Python objects that could potentially be involved in a garbage cycle. I've attached timings taken on a Core 2 Quad 2.4 GHz (WinXP Pro, 3GB RAM), with ever-increasing numbers of objects. The gc at worst takes upwards of 3 seconds before the process runs out of memory.
If gc periodically released the GIL it would allow it to be put in a separate thread, but as it stands it blocks the Python VM for periods of time that are too long for realtime interactive applications. Alternatively a gc that is incremental by default would eliminate the need for a second thread.
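A measurement of the kind described in the report can be approximated with a short script. This is only a sketch, not the attachment from the original report: it assumes Python 3, the `Node` class and `time_full_collection` helper are hypothetical names, and the object counts are arbitrary.

```python
import gc
import time


class Node:
    """Hypothetical container object tracked by the cyclic gc."""

    def __init__(self):
        self.ref = self  # self-reference puts every node in a cycle


def time_full_collection(n_objects):
    """Return the seconds spent in one full gc.collect() with n_objects alive."""
    gc.disable()  # keep automatic collections out of the measurement
    try:
        live = [Node() for _ in range(n_objects)]
        start = time.perf_counter()
        gc.collect()  # full collection; all nodes are still reachable
        elapsed = time.perf_counter() - start
        del live
        gc.collect()  # reclaim the now-unreachable cycles before returning
        return elapsed
    finally:
        gc.enable()


if __name__ == "__main__":
    for n in (10_000, 50_000, 200_000):
        print(f"{n:>7} objects: {time_full_collection(n) * 1000:.1f} ms")
```

Absolute numbers will depend on the machine and interpreter version; the point is only to see how the pause grows with the number of tracked objects.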
The garbage collector will never be able to run in a second thread because it manipulates Python objects, which the GIL is supposed to protect!
As for non-linear complexity, see bpo-4688 and bpo-4074 for some attempts to smooth this problem over.
A 'stop-the-world' garbage collector that periodically released the GIL could be run in a second thread, allowing the main thread to break in and do some processing. However, the nature of a stop-the-world collector means that it probably would not easily be able to deal with changes made by other threads in the middle of a collection.
My concern is that the Python process blocks and is unresponsive due to garbage collection for periods of time that are not acceptable for realtime interactive applications. Are there any plans to add an incremental collector to Python?
Hard real-time applications written in Python should not rely on the cyclic garbage collector. They should call gc.disable at startup, and completely rely on reference counting for releasing memory. Doing so might require rewrites to the application, explicitly breaking cycles so that reference counting can release them.
Even with cyclic gc disabled, applications worried about meeting hard deadlines need to look even more into memory allocation and deallocation; e.g. releasing a single object may cause a chained release of many objects, which can affect worst case execution times. There are more issues to consider (which are all out of scope of the bug tracker).
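The advice above (disable the cyclic collector and avoid or break cycles yourself) might look roughly like the following. This is a sketch under assumptions: it relies on CPython's reference counting, and `Parent`, `Child`, and `Session` are hypothetical classes invented for illustration.

```python
import gc
import weakref

gc.disable()  # no automatic cyclic collection; reference counting still runs


class Child:
    def __init__(self, parent):
        # A weak back-reference avoids creating a parent <-> child cycle.
        self._parent = weakref.ref(parent)

    @property
    def parent(self):
        return self._parent()  # None once the parent has been freed


class Parent:
    def __init__(self):
        self.children = []

    def add_child(self):
        child = Child(self)
        self.children.append(child)
        return child


class Session:
    """Hypothetical object that holds a strong cycle and breaks it explicitly."""

    def __init__(self):
        self.handler = self.on_event  # bound method refers back to self: a cycle

    def on_event(self):
        pass

    def close(self):
        self.handler = None  # drop the cycle so the refcount can reach zero
```

With gc disabled, a `Session` that is dropped without calling `close()` is never reclaimed, so this approach trades collector pauses for the discipline of breaking every cycle by hand.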
OK cool, that's the development strategy we've already adopted. Is this limitation of Python's garbage collector in relation to real-time applications documented anywhere?
Why do you ask? (this is OT for the bug tracker) It's not in the Python documentation. However, any good book on real-time systems will tell you that garbage collection is a problem.
I ask because in my opinion a three-second pause on a modern machine is significant for any program with any sort of interactivity--significant enough to warrant a warning in the documentation. Python is a great language and I think it deserves an incremental implementation of garbage collection.
Python's cyclic garbage collector is incremental. If you can provide a specific patch to replace it with something "better", please submit it as a separate issue to the tracker. If you can hire somebody to implement such a thing, go right ahead. Otherwise, I think there is little that can be done. Python gets all of its contributions from volunteers.
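For readers who want to see what the collector exposes, the standard `gc` module lets a program inspect the generation thresholds and request a collection of only the youngest generation. The sketch below assumes CPython and Python 3; reading "incremental" above as referring to this generational behaviour is an assumption, not something stated in the thread.

```python
import gc

# Allocation thresholds that trigger a collection (values depend on the build).
print("thresholds:", gc.get_threshold())

# Current allocation counters for each generation.
print("counts:", gc.get_count())

# Collect only the youngest generation; this examines far fewer objects
# than a full collection.
young = gc.collect(0)

# A full collection examines every tracked container object.
full = gc.collect()

print("unreachable found: young =", young, "full =", full)
```

An application could call `gc.collect(0)` at moments it chooses, but a full collection still scans every tracked object, which is the worst case the report is concerned with.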
Regardless of the type of algorithm used by the garbage collector that currently exists in Python, its worst-case performance is undesirable. I have some interest in implementing a garbage collector for Python with improved performance characteristics but don't really have the time for it. Hopefully someone does. In any case thanks for hearing me out.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['interpreter-core', 'performance']
title = 'garbage collector blocks and takes worst-case linear time wrt number of objects'
updated_at =
user = 'https://bugs.python.org/darrenr'
```
bugs.python.org fields:
```python
activity =
actor = 'darrenr'
assignee = 'none'
closed = True
closed_date =
closer = 'benjamin.peterson'
components = ['Interpreter Core']
creation =
creator = 'darrenr'
dependencies = []
files = ['12510']
hgrepos = []
issue_num = 4794
keywords = []
message_count = 9.0
messages = ['78646', '78649', '78662', '78825', '78926', '78935', '79186', '79189', '79277']
nosy_count = 4.0
nosy_names = ['loewis', 'benjamin.peterson', 'LambertDW', 'darrenr']
pr_nums = []
priority = 'normal'
resolution = 'duplicate'
stage = None
status = 'closed'
superseder = None
type = 'resource usage'
url = 'https://bugs.python.org/issue4794'
versions = ['Python 2.6', 'Python 2.4', 'Python 3.0']
```