rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com

rethinkdb dump crashed server #5108

Closed springrider closed 8 years ago

springrider commented 8 years ago

I restarted it and got a report like this:

report_fatal_error(char const*, int, char const*, ...) at 0xd45794 (/usr/bin/rethinkdb)\n3 [0xd3384d]: extproc_spawner_t::spawn(object_buffer_t<socket_stream_t>*, int*) at 0xd3384d (/usr/bin/rethinkdb)\n4 [0xd39b70]: extproc_worker_t::acquired(signal_t*) at 0xd39b70 (/usr/bin/rethinkdb)\n5 [0xd32547]: extproc_job_t::extproc_job_t(extproc_pool_t*, bool (*)(read_stream_t*, write_stream_t*), signal_t*) at 0xd32547 (/usr/bin/rethinkdb)\n6 [0xd38204]: http_runner_t::http(http_opts_t const&, http_result_t*, signal_t*) at 0xd38204 (/usr/bin/rethinkdb)\n7 [0x88b386]: ql::dispatch_http(ql::env_t*, http_opts_t const&, http_runner_t*, http_result_t*, ql::bt_rcheckable_t const*) at 0x88b386 (/usr/bin/rethinkdb)\n8 [0xc3b8c0]: version_checker_t::do_check(bool, auto_drainer_t::lock_t) at 0xc3b8c0 (/usr/bin/rethinkdb)\n9 [0xc3c486]: callable_action_instance_t<std::_Bind<std::_Mem_fn<void (version_checker_t::*)(bool, auto_drainer_t::lock_t)> (version_checker_t*, bool, auto_drainer_t::lock_t)> >::run_action() at 0xc3c486 (/usr/bin/rethinkdb)\n10 [0x9bf51e]: coro_t::run() at 0x9bf51e (/usr/bin/rethinkdb)

2015-11-08T11:22:26.530315728 1382400.547204s error: Exiting.
2015-11-08T11:27:26.731887427 0.035626s notice: Running rethinkdb 2.1.5+2~0precise (GCC 4.6.3)...
2015-11-08T11:27:26.734386880 0.038127s notice: Running on Linux 3.2.0-67-generic x86_64
2015-11-08T11:27:26.734427244 0.038165s notice: Loading data from directory xxx
2015-11-08T11:27:26.868758618 0.172497s info: Automatically using cache size of 100 MB
2015-11-08T11:27:26.871112803 0.174851s warn: Cache size does not leave much memory for server and query overhead (available memory: 729 MB).
2015-11-08T11:27:26.871118966 0.174857s warn: Cache size is very low and may impact performance.
2015-11-08T11:27:26.871621078 0.175359s notice: Listening for intracluster connections on port 29015

Maybe it's caused by running out of memory? What should I do? The machine has only 1 GB of RAM; should I increase the memory? Every time I try to dump, it crashes the server. I have about 4 million documents in one table.
Maybe it's because of the data size? It was okay when I did not have so much data.

danielmewes commented 8 years ago

@springrider Can you try passing the --clients 1 parameter to rethinkdb dump? You might also want to upgrade to the latest release of the Python driver (sudo pip install --upgrade rethinkdb). It has some improvements to rethinkdb dump's memory consumption.
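A minimal sketch of the suggested invocation, written as a small Python helper that assembles the command line. The `--clients` value comes from the comment above; the `localhost:28015` address and the optional output file are assumptions for illustration, so substitute your own server address. The helper only builds and prints the command; it does not run it.

```python
import shlex


def build_dump_command(clients=1, connect="localhost:28015", out_file=None):
    """Return the argv list for a `rethinkdb dump` run.

    `connect` and `out_file` are illustrative assumptions; only
    `--clients 1` is taken from the suggestion in this thread.
    """
    cmd = ["rethinkdb", "dump", "--connect", connect, "--clients", str(clients)]
    if out_file is not None:
        cmd += ["--file", out_file]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand
    # (or passed to subprocess.run on a machine with the driver installed).
    print(shlex.join(build_dump_command(clients=1)))
```

Limiting the dump to a single client exports tables one at a time, which should lower the peak memory used on a small machine.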

Could you also check the RethinkDB log file and see if there's another line above the report_fatal_error line?
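One way to do that check is a short Python sketch like the one below, which collects the lines that appear just before each `report_fatal_error` entry. The function name and the `context` parameter are hypothetical, and the log file location depends on your installation, so it is left as a placeholder in the usage comment.

```python
def lines_before_fatal_error(log_lines, marker="report_fatal_error", context=3):
    """For each line containing `marker`, return up to `context` preceding lines."""
    hits = []
    for i, line in enumerate(log_lines):
        if marker in line:
            # Slice the lines just above the match, clamped at the file start.
            hits.append(log_lines[max(0, i - context):i])
    return hits


# Usage (the path is a placeholder, not a known location):
#   with open("/path/to/rethinkdb/log_file") as f:
#       for block in lines_before_fatal_error(f.read().splitlines()):
#           print("\n".join(block))
```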

Are you using r.http in your queries (just asking because the crash appeared to happen when RethinkDB tried to spawn a process to process an HTTP request)?

danielmewes commented 8 years ago

...actually --clients 1 might not help if you only have one table. Is that the case or do you also have other tables besides the 4 million doc one?

danielmewes commented 8 years ago

The backtrace with newlines:

report_fatal_error(char const*, int, char const*, ...) at 0xd45794 (/usr/bin/rethinkdb)
3 [0xd3384d]: extproc_spawner_t::spawn(object_buffer_t<socket_stream_t>*, int*) at 0xd3384d (/usr/bin/rethinkdb)
4 [0xd39b70]: extproc_worker_t::acquired(signal_t*) at 0xd39b70 (/usr/bin/rethinkdb)
5 [0xd32547]: extproc_job_t::extproc_job_t(extproc_pool_t*, bool (*)(read_stream_t*, write_stream_t*), signal_t*) at 0xd32547 (/usr/bin/rethinkdb)
6 [0xd38204]: http_runner_t::http(http_opts_t const&, http_result_t*, signal_t*) at 0xd38204 (/usr/bin/rethinkdb)
7 [0x88b386]: ql::dispatch_http(ql::env_t*, http_opts_t const&, http_runner_t*, http_result_t*, ql::bt_rcheckable_t const*) at 0x88b386 (/usr/bin/rethinkdb)
8 [0xc3b8c0]: version_checker_t::do_check(bool, auto_drainer_t::lock_t) at 0xc3b8c0 (/usr/bin/rethinkdb)
9 [0xc3c486]: callable_action_instance_t<std::_Bind<std::_Mem_fn<void (version_checker_t::*)(bool, auto_drainer_t::lock_t)> (version_checker_t*, bool, auto_drainer_t::lock_t)> >::run_action() at 0xc3c486 (/usr/bin/rethinkdb)
10 [0x9bf51e]: coro_t::run() at 0x9bf51e (/usr/bin/rethinkdb)

danielmewes commented 8 years ago

Also opened https://github.com/rethinkdb/rethinkdb/issues/5110 to make us survive failures to spawn an external helper process (we use those for r.http and for r.js). However, this will not be part of the server before at least RethinkDB 2.3, so we should try to find another way until then.

springrider commented 8 years ago

Thanks a lot! The --clients 1 option works. I do have more than one table.