ifduyue / xxtea

Python extension module xxtea
https://pypi.org/project/xxtea/
BSD 2-Clause "Simplified" License

Massive CPU load on heavy-usage sites #6

Closed SimonSteinberger closed 8 years ago

SimonSteinberger commented 9 years ago

Used on a Django site powered by NGINX/uWSGI with massive traffic (> 60,000 requests per second), the new C-based XXTEA version caused an extremely high server load: CPUs were constantly pinned at nearly 100%. Reverting to the non-C version of XXTEA solved the problem.

Not being a C developer, I cannot debug the XXTEA code, but this is basically the error from our server log. I hope it helps track down the culprit:

!!! uWSGI process 16809 got Segmentation Fault !!!

We kept getting lots of these segmentation faults, which pointed to incorrect memory allocation in a C library. Without having any clue what the C code does exactly, my guess is simply that XXTEA might not be thread-safe as it is.

Let me know if I can give you some more information concerning the error(s).

ifduyue commented 9 years ago

Thanks for reporting. I will look into this.

ifduyue commented 9 years ago

It's weird that the pure Python implementation works OK. I benchmarked the pure Python implementation against the C extension module before, and the results showed the C version was faster. However, I didn't measure CPU utilization.
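
For what it's worth, a comparison along these lines can be sketched with timeit. This is only illustrative: the payload size, key, and iteration count are placeholders rather than the ones used in the original benchmark, and it assumes the (data, key) call signature of the xxtea extension.

import timeit

# Time 100,000 encryptions with the C extension. To compare against a
# pure-Python implementation, swap the statement for that module's call.
setup = "import xxtea; data = b'x' * 256; key = b'0123456789abcdef'"
print(timeit.timeit("xxtea.encrypt(data, key)", setup=setup, number=100000))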

ifduyue commented 9 years ago

About the segmentation errors: could you send me the list of packages required by the Django site (pip freeze)? If it's sensitive, send me an email in private.

ifduyue commented 9 years ago

It's confirmed that there are memory leaks. Trying to solve them...
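
For anyone wanting to verify a fix locally, one rough way to observe a leak in a C extension from Python is to call the suspect function in a tight loop and watch the process's memory. A minimal sketch, assuming a Unix system (for the resource module) and an arbitrary 16-byte key:

import resource
import xxtea

key = b"0123456789abcdef"  # placeholder 16-byte key
for i in range(1000000):
    xxtea.encrypt_hex(b"payload", key)
    if i % 100000 == 0:
        # ru_maxrss only ever increases; without a leak it should
        # plateau quickly, with a leak it keeps climbing.
        print(i, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)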

SimonSteinberger commented 9 years ago

I'm not sure if what we observed was caused by a memory leak ... Anyway, here's our list of installed packages - pretty long :-P

Djalog==0.9.4
Django==1.7.4
MarkupSafe==0.23
Pillow==2.7.0
UniConvertor==1.1.4
argparse==1.2.1
chardet==2.0.1
colorama==0.2.5
decorator==3.4.0
django-jinja==1.1.1
django-ratelimit==0.5.0
docopt==0.6.2
duplicity==0.6.23
html5lib==0.999
ipython==1.2.1
lockfile==0.8
lxml==3.4.2
numpy==1.8.2
pexpect==3.1
polib==1.0.4
psd-tools==1.2
psycopg2==2.6
pyenchant==1.6.5
python-apt==0.9.3.5ubuntu1
python-debian==0.1.21-nmu2ubuntu2
python-memcached==1.53
reportlab==3.0
requests==2.2.1
simplegeneric==0.8.1
six==1.5.2
ssh-import-id==3.21
uWSGI==2.0.9
ujson==1.33
urllib3==1.7.1
wsgiref==0.1.2

ifduyue commented 9 years ago

I have fixed some memory leaks and made a new release, v0.2.1. Please try it and see if the problem still exists.

SimonSteinberger commented 9 years ago

We've been running it for about an hour now without seeing any problems. So it looks very good and we'll keep it running - if we notice anything, I'll let you know!

Thanks for fixing the issue!!

SimonSteinberger commented 8 years ago

Hi Yue Du,

I'm afraid the bug may never have been fixed. We updated our system today, made some changes, and bam, the issue was back. Well, it turns out that, by accident, we had been importing the pure Python version instead of the C version.

So, here we go again:

!!! uWSGI process 11807 got Segmentation Fault !!!

Within seconds, our server crashes under an incredible number of segmentation faults. The same issue was reported on ServerFault a while back: http://serverfault.com/questions/670380/uwsgi-backtrace-semgentation-fault-errors

We would be so, so grateful if you could fix this. Not being a C developer, I have no clue where to start ...

Best, Simon

ifduyue commented 8 years ago

OK, I'll look into it.

ifduyue commented 8 years ago

Hi, Simon

Can you provide a code snippet that reproduces the segmentation fault? That would make it much easier to locate the bug.

ifduyue commented 8 years ago

And which version of Python is running on your production server?

SimonSteinberger commented 8 years ago

It's a Django app running on Python 2.7.6. The error doesn't occur on development systems - at least I've never noticed it. We're only using encrypt_hex and decrypt_hex, but we handle a lot of requests per second. ... My pure guess is that it has something to do with simultaneously running requests.
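
For context, a minimal round trip with those two functions looks roughly like this, assuming the (data, key) signature of the xxtea extension and a placeholder 16-byte key:

import xxtea

key = b"0123456789abcdef"  # placeholder key
token = xxtea.encrypt_hex(b"some payload", key)
assert xxtea.decrypt_hex(token, key) == b"some payload"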

ifduyue commented 8 years ago

Using Python 2.7.5, with a single thread and with multiple threads, I still can't reproduce it.
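
A plausible shape for such a test (the thread count, iteration count, and key below are arbitrary) is to hammer encrypt/decrypt from several threads at once:

import threading
import xxtea

key = b"0123456789abcdef"  # arbitrary 16-byte key

def worker():
    # Each thread round-trips many payloads concurrently with the others.
    for _ in range(100000):
        token = xxtea.encrypt_hex(b"payload", key)
        assert xxtea.decrypt_hex(token, key) == b"payload"

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()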

ifduyue commented 8 years ago

Going to test it in an uwsgi environment.

SimonSteinberger commented 8 years ago

Great - thanks for looking into it. I wish I could help, but C is another world compared to Python :)

ifduyue commented 8 years ago

Here is the test project: https://github.com/ifduyue/nginx-uwsgi-django-xxtea-test - but still no luck reproducing it.

ifduyue commented 8 years ago

@SimonSteinberger I think I've found the problem. It happens when decrypting some invalid input.
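
The exact trigger isn't shown in the thread, but a reproduction along the reported lines would be feeding decrypt_hex bytes that were never produced by encrypt_hex; the key and input below are arbitrary:

import xxtea

key = b"0123456789abcdef"  # arbitrary key
# "deadbeef" is valid hex but not real ciphertext. In the affected
# versions this kind of call could crash the process; a fixed version
# should raise or return an error instead.
try:
    print(xxtea.decrypt_hex(b"deadbeef", key))
except Exception as exc:
    print("rejected:", exc)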

SimonSteinberger commented 8 years ago

It's definitely a possibility, although I'd guess the number of invalid strings being decrypted in our app should be rather small.

ifduyue commented 8 years ago

Bug located, will fix it tomorrow. Going to sleep right now.

ifduyue commented 8 years ago

Fixed in v1.0: https://pypi.python.org/pypi/xxtea/1.0

ifduyue commented 8 years ago

@SimonSteinberger Please note that the pure Python version and the C extension version are not compatible with each other. If you've saved any results produced by the pure Python version, you will want to migrate them.
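
If a migration is needed, one hedged sketch is to decrypt each stored value with the old pure-Python implementation and re-encrypt it with the C extension. pure_decrypt_hex below is a hypothetical stand-in for whichever pure-Python function was actually in use:

import xxtea

def pure_decrypt_hex(token, key):
    # Hypothetical stand-in: replace with the real decrypt function from
    # the pure-Python module that produced the stored values.
    raise NotImplementedError

def migrate(old_token, key):
    plaintext = pure_decrypt_hex(old_token, key)
    return xxtea.encrypt_hex(plaintext, key)  # re-encrypt with the C extension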

SimonSteinberger commented 8 years ago

Thanks ifduyue :+1: