Closed: aboutaaron closed this issue 10 years ago
Also worth checking: http://blog.gingerlime.com/2011/django-memory-leaks-part-ii/
Also worth mentioning, it looks like the commands finished for me without any problems.
I'm using gc.collect(), calling django.db.reset_queries(), and iterating over querysets with this snippet: https://djangosnippets.org/snippets/1949/. Not sure why my process gets killed; I just started running it again. Fingers crossed. Glad it worked for you, though, Aaron.
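For reference, the snippet linked above iterates a large queryset in primary-key-ordered chunks instead of loading everything at once. Here is a framework-free sketch of that pattern; `fetch_chunk` is a hypothetical callable standing in for the Django query (something like `qs.filter(pk__gt=last_pk).order_by('pk')[:chunksize]`), so the idea can be shown without a database:

```python
import gc

def chunked_iterator(fetch_chunk, chunksize=1000):
    """Yield items in pk-ordered chunks, releasing references between chunks.

    ``fetch_chunk(last_pk, chunksize)`` is a hypothetical callable that
    returns up to ``chunksize`` (pk, item) pairs with pk > last_pk,
    in ascending pk order.
    """
    last_pk = -1
    while True:
        chunk = fetch_chunk(last_pk, chunksize)
        if not chunk:
            break
        for pk, item in chunk:
            yield item
        # Remember where we stopped, then drop the chunk and collect,
        # so memory use stays bounded by one chunk at a time.
        last_pk = chunk[-1][0]
        del chunk
        gc.collect()
```

Note that `django.db.reset_queries()` mainly matters with `DEBUG = True`, where Django records every executed query in memory; clearing that list is a common fix for "leaks" in long-running commands.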
I think you solved this @armendariz. Feel free to reopen the issue if you think otherwise.
@aboutaaron Thanks! It is solved. Slowly figuring out the GitHub thing.
Problem
According to @armendariz, the management command takes forever to run and terminates when it hits about 4 million records. So, we need to find a way to reduce the bottleneck.
Hypothesis
My initial thought is that the management command runs several functions in succession that shove a ton of data into MySQL via the Django ORM. As a result, these processes are costly and end up eating a ton of memory, since MySQL, Django, and Python are all firing on all cylinders at once.
Solutions
Python is supposed to be garbage collected automatically, but perhaps the functions aren't giving it a chance to reclaim memory. One hacky way to get around this is to use `gc` to force Python to release objects from memory once each function ends. More on StackOverflow: How can I explicitly free memory in Python?.
This may be incorrect, but this was my initial thought.
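A minimal sketch of that idea, assuming the command's work can be split into batches (the function names here are hypothetical, not from the actual management command). Python frees most objects via reference counting as soon as they go out of scope; an explicit `gc.collect()` only forces the cycle collector to run, which can help when batches create reference cycles:

```python
import gc

def process_batch(rows):
    # Hypothetical stand-in for one of the command's expensive steps.
    results = [r * 2 for r in rows]
    return len(results)

def run(batches):
    counts = []
    for rows in batches:
        counts.append(process_batch(rows))
        # Once process_batch returns, its locals are unreferenced;
        # collect() forces the cycle collector to sweep them up now
        # instead of whenever it would run on its own.
        gc.collect()
    return counts
```

Whether this actually fixes the 4-million-record failure depends on where the memory is really held (Python objects, Django's query log, or MySQL itself), so it is worth profiling before and after.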