haiwen / seafile

High performance file syncing and sharing, with also Markdown WYSIWYG editing, Wiki, file label and other knowledge management features.
http://seafile.com/

Seafile 6.0.4, 6.0.5 and 6.0.6, Ubuntu 14.04 and Apache 2.4 Uses all Memory #1778

Closed caniwi closed 7 years ago

caniwi commented 7 years ago

We have successfully used Seafile for a number of releases, but after upgrading to 6.0.4 on our production server we found that seaf-server processes seemed to spawn indefinitely and used up all the RAM on our virtual server within a couple of hours. We have reverted to 5.1.4 and the RAM usage is back to normal. The virtual server has 2 CPUs and 4 GB of RAM. We have about 160 GB in Seafile storage. Some users were also complaining that, in the web interface, they could not see files that other people had uploaded to a RW share.

killing commented 7 years ago

It may be related to PostgreSQL support. In the 6.0 release we changed our database code (we no longer use libzdb). The PostgreSQL support is less tested and may have bugs. Could you post controller.log and seafile.log?

caniwi commented 7 years ago

Hi ... please find attached. Please let me know if you need anything else.

seafile.log.3.gz extract_controller.log.txt

seafile.log.4.gz

killing commented 7 years ago

I cannot see anything unusual from the logs, except that the PG client library fails to allocate memory. Can you tell which processes use the most memory?

caniwi commented 7 years ago

It looks like seaf-server from htop. Postgres failed to allocate memory because all the memory was consumed.

caniwi commented 7 years ago

I have just installed 6.0.5 and am wondering if I can expect any different behaviour.

caniwi commented 7 years ago

I think I have the answer already... nope. Attached is a memory graph that shows seafile-server 6.0.5 going in at 10:43. Prior to that we were running 5.1.4, which was very stable.

[Image: cat-prod-seafile catalyst net nz memory usage]

I have also collated some process stats, which seem to show postgres on the seafile db increasing the most: psauxsortedonrss.txt

And a top -c listing sorted by memory, taken just before I reverted to seafile-server 5.1.4. Does this help you at all? top-csortm.txt

caniwi commented 7 years ago

@killing: you mention you are moving away from libzdb, can you tell me what you are using instead? I am still interested in debugging this issue.

killing commented 7 years ago

Hi, we wrote the database layer ourselves instead. You can find the code here: https://github.com/haiwen/seafile-server/blob/master/common/db-wrapper/pgsql-db-ops.c

I'll also debug this problem in a few days, when time permits.

caniwi commented 7 years ago

Hi, Seems like a big ask, writing your own database layer but if you are up for it..... Do you need me to gather any information to assist your debugging?

lins05 commented 7 years ago

> writing your own database layer but if you are up for it

We still use libmysql/libpq etc.; we only implemented a wrapper layer to hide the differences between the libraries (e.g. seaf_db_commit calls either mysql_commit or pg_commit depending on the db currently in use), to simplify the application code.
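
A wrapper of the kind described here is often built as a table of function pointers, one table per backend. The sketch below only illustrates that idea and is not Seafile's actual code: `DbOps`, `SketchDb`, `sketch_select_ops` and `sketch_db_commit` are hypothetical names, and the two backend functions are stand-ins that record which one ran instead of calling a real client library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-backend operation table (not Seafile's real struct). */
typedef struct {
    const char *name;
    int (*commit)(void *conn);
} DbOps;

/* Records which stand-in backend ran last, so the dispatch is observable. */
const char *last_backend = "";

/* Stand-ins for the real client-library commit calls. */
int mysql_backend_commit(void *conn) { (void)conn; last_backend = "mysql"; return 0; }
int pgsql_backend_commit(void *conn) { (void)conn; last_backend = "pgsql"; return 0; }

const DbOps mysql_ops = { "mysql", mysql_backend_commit };
const DbOps pgsql_ops = { "pgsql", pgsql_backend_commit };

/* A database handle: the chosen backend table plus its connection. */
typedef struct {
    const DbOps *ops;
    void *conn;
} SketchDb;

/* Pick the operation table from a configuration string; NULL if unknown. */
const DbOps *sketch_select_ops(const char *db_type) {
    if (strcmp(db_type, "mysql") == 0) return &mysql_ops;
    if (strcmp(db_type, "pgsql") == 0) return &pgsql_ops;
    return NULL;
}

/* Application code calls this and never names a specific library. */
int sketch_db_commit(SketchDb *db) {
    assert(db != NULL && db->ops != NULL);
    return db->ops->commit(db->conn);
}

const char *sketch_last_backend(void) { return last_backend; }
```

With this layout, supporting another database means adding one more table; callers of `sketch_db_commit` stay unchanged.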

caniwi commented 7 years ago

I upgraded to 6.0.6, as there was mention of a fix for postgres in the change log, and was hopeful it might have fixed this memory leak. Unfortunately this is not the case, as per the graph. Seafile 5.1.4 was restarted at approx 22:30 and the upgrade to Seafile 6.0.6 happened at approx 08:30. If you have any particular diagnostics you would like me to capture, then let me know. Once again, it appears to be PostgreSQL related.

[Image: seafile memory usage]

caniwi commented 7 years ago

Some memory stats....

seafile:~# ps aux --sort=-%mem | awk 'NR<=10{print $0}'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 28007  0.2 15.7 876876 636988 ?       Ss   08:21   0:34 postgres: seafile seafile_db 127.0.0.1(34392) idle                                                                          
postgres 28003  0.0  6.2 490724 254356 ?       Ss   08:21   0:07 postgres: seafile ccnet_db 127.0.0.1(34390) idle                                                                            
root     28781  0.0  1.5 194736 64684 ?        Ssl  08:23   0:01 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/puppet agent
seafile   2428  0.1  1.5 207836 62896 ?        S    09:29   0:17 python2.7 /data/seafile/seafile-server-6.0.6/seahub/manage.py runfcgi host=127.0.0.1 port=8000 pidfile=/data/seafile/seafile-server-6.0.6/runtime/seahub.pid outlog=/data/seafile/seafile-server-6.0.6/runtime/access.log errlog=/data/seafile/seafile-server-6.0.6/runtime/error.log
seafile   2569  0.1  1.5 206932 62572 ?        S    09:32   0:17 python2.7 /data/seafile/seafile-server-6.0.6/seahub/manage.py runfcgi host=127.0.0.1 port=8000 pidfile=/data/seafile/seafile-server-6.0.6/runtime/seahub.pid outlog=/data/seafile/seafile-server-6.0.6/runtime/access.log errlog=/data/seafile/seafile-server-6.0.6/runtime/error.log
seafile   1489  0.1  1.5 206612 61256 ?        S    09:22   0:17 python2.7 /data/seafile/seafile-server-6.0.6/seahub/manage.py runfcgi host=127.0.0.1 port=8000 pidfile=/data/seafile/seafile-server-6.0.6/runtime/seahub.pid outlog=/data/seafile/seafile-server-6.0.6/runtime/access.log errlog=/data/seafile/seafile-server-6.0.6/runtime/error.log
seafile   2536  0.1  1.5 205672 61044 ?        S    09:31   0:19 python2.7 /data/seafile/seafile-server-6.0.6/seahub/manage.py runfcgi host=127.0.0.1 port=8000 pidfile=/data/seafile/seafile-server-6.0.6/runtime/seahub.pid outlog=/data/seafile/seafile-server-6.0.6/runtime/access.log errlog=/data/seafile/seafile-server-6.0.6/runtime/error.log
seafile   2568  0.1  1.4 204616 60220 ?        S    09:32   0:17 python2.7 /data/seafile/seafile-server-6.0.6/seahub/manage.py runfcgi host=127.0.0.1 port=8000 pidfile=/data/seafile/seafile-server-6.0.6/runtime/seahub.pid outlog=/data/seafile/seafile-server-6.0.6/runtime/access.log errlog=/data/seafile/seafile-server-6.0.6/runtime/error.log
postgres 28111  0.0  1.4 294408 59464 ?        Ss   08:21   0:02 postgres: seafile seafile_db 127.0.0.1(34422) idle

caniwi commented 7 years ago

and this from pg_stat_activity...

datid |  datname   |  pid  | usesysid | usename | application_name | client_addr | client_hostname | client_port |         backend_start         |          xact_start           |          query_start          |         state_change          | waiting | state  |                                       query                                        
-------+------------+-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+---------+--------+------------------------------------------------------------------------------------
 21681 | ccnet_db   | 28003 |    21680 | seafile |                  | 127.0.0.1   |                 |       34390 | 2016-11-21 19:21:43.291357+00 |                               | 2016-11-21 23:56:28.689436+00 | 2016-11-21 23:56:28.689469+00 | f       | idle   | SELECT role FROM UserRole WHERE email=$1
 21682 | seafile_db | 28007 |    21680 | seafile |                  | 127.0.0.1   |                 |       34392 | 2016-11-21 19:21:44.454436+00 |                               | 2016-11-21 23:56:29.583555+00 | 2016-11-21 23:56:29.583605+00 | f       | idle   | SELECT commit_id FROM Branch WHERE name='master' AND repo_id=$1
 21682 | seafile_db | 28111 |    21680 | seafile |                  | 127.0.0.1   |                 |       34422 | 2016-11-21 19:21:53.864965+00 |                               | 2016-11-21 23:56:17.360019+00 | 2016-11-21 23:56:17.36006+00  | f       | idle   | SELECT commit_id FROM Branch WHERE name='master' AND repo_id=$1
 21682 | seafile_db | 28112 |    21680 | seafile |                  | 127.0.0.1   |                 |       34426 | 2016-11-21 19:21:53.876944+00 |                               | 2016-11-21 23:56:16.343734+00 | 2016-11-21 23:56:16.34378+00  | f       | idle   | SELECT commit_id FROM Branch WHERE name='master' AND repo_id=$1
 21682 | seafile_db | 28113 |    21680 | seafile |                  | 127.0.0.1   |                 |       34428 | 2016-11-21 19:21:53.889255+00 |                               | 2016-11-21 23:55:43.244232+00 | 2016-11-21 23:55:43.24427+00  | f       | idle   | SELECT commit_id FROM Branch WHERE name='master' AND repo_id=$1
 21682 | seafile_db |  1438 |    21680 | seafile |                  | 127.0.0.1   |                 |       37716 | 2016-11-21 20:20:16.220491+00 |                               | 2016-11-21 23:13:21.104937+00 | 2016-11-21 23:13:21.105041+00 | f       | idle   | SELECT 1 FROM Repo WHERE repo_id=$1
 21682 | seafile_db | 21409 |    21680 | seafile | psql             |             |                 |          -1 | 2016-11-21 23:55:41.524557+00 | 2016-11-21 23:56:30.113529+00 | 2016-11-21 23:56:30.113529+00 | 2016-11-21 23:56:30.113535+00 | f       | active | select * from pg_stat_activity;
 21681 | ccnet_db   | 29598 |    21680 | seafile |                  | 127.0.0.1   |                 |       34754 | 2016-11-21 19:26:07.858929+00 |                               | 2016-11-21 23:54:52.023143+00 | 2016-11-21 23:54:52.023178+00 | f       | idle   | SELECT role FROM UserRole WHERE email=$1
 21681 | ccnet_db   | 14064 |    21680 | seafile |                  | 127.0.0.1   |                 |       49308 | 2016-11-21 22:43:21.935942+00 |                               | 2016-11-21 23:02:26.963693+00 | 2016-11-21 23:02:26.963769+00 | f       | idle   | SELECT id, email, is_staff, is_active, ctime, passwd FROM EmailUser WHERE email=$1
(9 rows)

So this seems to point at the following query, since its pid (28007) matches the postgres process using the most memory: SELECT commit_id FROM Branch WHERE name='master' AND repo_id=$1

david-barbion commented 7 years ago

Hi

I'm also impacted by this issue. I think it's related to the way the new wrapper handles PQclear. It should be called for every PQexec or PQexecPrepared (once the data has been processed).

Looking at common/seaf-db.c, there is a db_connection_execute_query() but no result_set_free(). But there is a db_connection_close() which calls PQfinish()

This call (db_connection_execute_query()) is used in seaf_db_check_for_existence(), seaf_db_foreach_selected_row(), seaf_db_get_int(), seaf_db_get_int64() and seaf_db_get_string(). Every time, no PQclear() but a PQfinish()

I will try to make a patched version which will call PQclear every time needed.

Hope this helps.
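
The rule referred to above (every result returned by PQexec or PQexecPrepared must eventually be released with PQclear) can be illustrated without linking against libpq. In the sketch below, `FakeResult`, `fake_exec` and `fake_clear` are hypothetical stand-ins for `PGresult`, a query call and `PQclear`; `sketch_check_for_existence` is an illustrative helper, not the Seafile function of a similar name. A counter tracks how many results are still alive, so a missing release shows up as a non-zero count.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a libpq result object (hypothetical, not PGresult). */
typedef struct {
    int rows;
} FakeResult;

/* Number of results allocated but not yet released. */
int live_results = 0;

/* Plays the role of a query call: allocates a result the caller owns. */
FakeResult *fake_exec(int rows) {
    FakeResult *r = malloc(sizeof *r);
    if (r == NULL) return NULL;
    r->rows = rows;
    live_results++;
    return r;
}

/* Plays the role of PQclear: releases a result; NULL is accepted. */
void fake_clear(FakeResult *r) {
    if (r == NULL) return;
    free(r);
    live_results--;
}

/* Returns 1 if the query produced at least one row, 0 if it produced
 * none, and -1 on failure. The result is released before returning. */
int sketch_check_for_existence(int rows_in_table) {
    FakeResult *r = fake_exec(rows_in_table);
    if (r == NULL) return -1;
    int exists = r->rows > 0;
    fake_clear(r);   /* without this line, live_results grows on every call */
    return exists;
}
```

Calling `sketch_check_for_existence` in a loop and checking that `live_results` returns to zero is a cheap way to confirm that the exec/clear pairing holds.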

cuihaikuo commented 7 years ago

Fixed in https://github.com/haiwen/seafile-server/pull/17

@david-barbion db_connection_close() will call db_connection_clear() first, and there is a PQclear() in it.

Actually, we need to explicitly deallocate the prepared statement in pgsql_db_stmt_free().
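
In PostgreSQL, a prepared statement stays allocated on the server for the life of the session unless a `DEALLOCATE` command is issued for it, which fits the symptom of long-lived idle backends growing in memory. The sketch below is not the code from the pull request. It only builds the SQL text for such a command, and `sketch_build_deallocate` is a hypothetical helper name. The statement name is double-quoted so that case is preserved and a name such as `all` is not read as the `ALL` keyword. The helper rejects any name containing characters outside letters, digits and underscore, because the name is spliced into the SQL text.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Writes: DEALLOCATE "<stmt_name>" into buf.
 * Returns 0 on success, or -1 if the name is NULL or empty, contains a
 * character outside [A-Za-z0-9_], or the text does not fit into buf.
 * Assumption: the statement was prepared under exactly this name. */
int sketch_build_deallocate(char *buf, size_t len, const char *stmt_name) {
    if (stmt_name == NULL || *stmt_name == '\0') return -1;
    for (const char *p = stmt_name; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p) && *p != '_') return -1;
    }
    int n = snprintf(buf, len, "DEALLOCATE \"%s\"", stmt_name);
    if (n < 0 || (size_t)n >= len) return -1;   /* error or truncated */
    return 0;
}
```

With libpq, the resulting string would be passed to PQexec on the connection that prepared the statement, and the returned result released with PQclear.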