vatesfr / xen-orchestra

The global orchestration solution to manage and backup XCP-ng and XenServer.
https://xen-orchestra.com

Lot of xo:perf leading to XO crash #575

Closed olivierlambert closed 8 years ago

olivierlambert commented 8 years ago

After some time, or after an event (xo.getAllObjects?), xo-server seems to use all the CPU and memory:

Dec 10 16:28:58 xoa xo-server[18308]: xo:api foobaruser | xo.getAllObjects(...) [11ms] ==> object
Dec 10 16:28:58 xoa xo-server[18308]: xen-api root@server: event.from(...) [172ms] ==> object
Dec 10 16:28:58 xoa xo-server[18308]: xo:perf blocked for 15ms
Dec 10 16:28:58 xoa xo-server[18308]: xen-api root@server: event.from(...) [99ms] ==> object
Dec 10 16:28:59 xoa xo-server[18308]: xo:perf blocked for 60ms
Dec 10 16:28:59 xoa xo-server[18308]: xo:perf blocked for 59ms
Dec 10 16:28:59 xoa xo-server[18308]: xo:perf blocked for 41ms
Dec 10 16:28:59 xoa xo-server[18308]: xo:perf blocked for 76ms
Dec 10 16:28:59 xoa xo-server[18308]: xo:perf blocked for 10ms
Dec 10 16:28:59 xoa xo-server[18308]: xo:perf blocked for 104ms

Until it crashes:

Dec 10 16:27:27 xoa xo-server[16406]: xo:perf blocked for 98548ms
Dec 10 16:28:38 xoa xo-server[16406]: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Dec 10 16:28:38 xoa xo-server[16406]: <--- Last few GCs --->
Dec 10 16:28:38 xoa xo-server[16406]: 2903447 ms: Mark-sweep 1367.9 (1456.2) -> 1373.7 (1456.2) MB, 2700.3 / 0 ms [allocation failure] [$
Dec 10 16:28:38 xoa xo-server[16406]: 2906084 ms: Mark-sweep 1373.7 (1456.2) -> 1374.8 (1456.2) MB, 2636.6 / 0 ms [allocation failure] [$
Dec 10 16:28:38 xoa xo-server[16406]: 2908759 ms: Mark-sweep 1374.8 (1456.2) -> 1365.9 (1456.2) MB, 2674.8 / 0 ms [last resort gc].
Dec 10 16:28:38 xoa xo-server[16406]: 2911434 ms: Mark-sweep 1365.9 (1456.2) -> 1366.9 (1456.2) MB, 2674.9 / 0 ms [last resort gc].
Dec 10 16:28:38 xoa xo-server[16406]: <--- JS stacktrace --->
Dec 10 16:28:38 xoa xo-server[16406]: ==== JS stack trace =========================================
Dec 10 16:28:38 xoa xo-server[16406]: Security context: 0x35e6aae37399 <JS Object>
Dec 10 16:28:38 xoa xo-server[16406]: 1: nextTick [/usr/local/lib/node_modules/xo-server/node_modules/trace/async_wrap.js:37] [pc=0x2ff4$
Dec 10 16:28:38 xoa xo-server[16406]: 2: arguments adaptor frame: 1->0
Dec 10 16:28:38 xoa xo-server[16406]: 3: maybeReadMore [/usr/local/lib/node_modules/xo-server/node_modules/level-sublevel/node_modules/r$
Dec 10 16:28:38 xoa systemd[1]: xo-server.service: main process exited, code=killed, status=6/ABRT
Dec 10 16:28:38 xoa systemd[1]: Unit xo-server.service entered failed state.
Dec 10 16:28:38 xoa systemd[1]: xo-server.service holdoff time over, scheduling restart.
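To gauge how bad the stalls are before a crash, the `blocked for …ms` entries can be summarized from the log. The following is a minimal sketch, assuming each xo:perf line reports how long the Node.js event loop was stalled (the thread does not confirm this); `summarize_blocked` is a hypothetical helper name, not part of XO.

```shell
# Hypothetical helper: read log lines on stdin and summarize the
# "blocked for <N>ms" entries (count, longest stall, total stalled time).
summarize_blocked() {
  grep -oE 'blocked for [0-9]+ms' |
    awk '{
      v = $3 + 0            # "15ms" -> 15 (awk uses the leading numeric prefix)
      n++
      total += v
      if (v > max) max = v
    }
    END { printf "count=%d max_ms=%d total_ms=%d\n", n, max, total }'
}

# Example use against the systemd journal (unit name taken from the log above):
#   journalctl -u xo-server | summarize_blocked
```

The pattern only matches on the `blocked for …ms` fragment, so it should work with both the older and the newer, timestamped log format seen in this thread.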

Possible leads:

olivierlambert commented 8 years ago

No further reports about this. Closing it for now.

gerard-kanters commented 5 years ago

I am experiencing the same issue.

Jan  6 11:59:42 xen-orchestra xo-server[728]: 2019-01-06T10:59:42.779Z - xo:perf - [INFO] blocked for 1823ms
Jan  6 11:59:45 xen-orchestra xo-server[728]: 2019-01-06T10:59:45.699Z - xo:perf - [INFO] blocked for 2421ms
Jan  6 11:59:48 xen-orchestra xo-server[728]: 2019-01-06T10:59:48.130Z - xo:perf - [INFO] blocked for 2130ms
Jan  6 11:59:50 xen-orchestra xo-server[728]: 2019-01-06T10:59:50.853Z - xo:perf - [INFO] blocked for 2012ms
Jan  6 11:59:52 xen-orchestra xo-server[728]: 2019-01-06T10:59:52.327Z - xo:perf - [INFO] blocked for 1175ms

This continues until the server runs out of memory. Adding more memory to the VM keeps the process up longer, but I cannot even back up a 50 GB VM with 32 GB of memory allocated to xo-server.

I built xo-server from source, and it is fully updated.

gerard-kanters commented 5 years ago

While this is still an issue, I have worked around it by creating an NFS mount on the xo-server host and adding that mount as a remote (Settings -> Remotes).

That is very fast and is a perfectly acceptable workaround. I still hope the project will fix the SMB performance issue; if not, please remove the feature and document how to make fast backups using NFS.

Using SMB mounts works for Linux systems, but Windows systems seem to produce another, unclear error (a write error at the end of the backup sequence).

A local NFS mount seems to be the only way to use XO-SERVER for remote backups.

On Ubuntu, you need to install the NFS support files:

apt-get install nfs-common

Then mount the NFS share, e.g.:

mount -t nfs {hostname}:/Snapshots /mnt/nfs
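A mount made this way does not survive a reboot, and the mount point has to exist before `mount` is called. The sketch below shows one way to make it persistent through `/etc/fstab`. It rests on assumptions: `nas.example` stands in for `{hostname}`, `nfs_fstab_line` is a hypothetical helper, and the `defaults,_netdev` options are a common choice rather than something XO requires.

```shell
# Hypothetical helper: print an /etc/fstab line for an NFS share.
# $1 = NFS server, $2 = exported path, $3 = local mount point
nfs_fstab_line() {
  # _netdev tells the system to wait for the network before mounting
  printf '%s:%s %s nfs defaults,_netdev 0 0\n' "$1" "$2" "$3"
}

# Typical use (as root), left commented out because it changes the system:
#   mkdir -p /mnt/nfs
#   mount -t nfs nas.example:/Snapshots /mnt/nfs
#   mountpoint -q /mnt/nfs && echo "mounted"
#   nfs_fstab_line nas.example /Snapshots /mnt/nfs >> /etc/fstab
```

It is worth checking the generated line before appending it, since a malformed `/etc/fstab` entry can cause problems at boot.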

julien-f commented 5 years ago

Hi @gerard-kanters, indeed, we have had some (more) SMB issues since the last release, and we are working on them. Note that starting now, XO will use mount.cifs if it is present on your system, for better performance and stability (similar to NFS) :slightly_smiling_face:
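
A quick way to check whether `mount.cifs` is available on the XO host is sketched below. How XO itself detects the binary is not stated in this thread, so this only tests whether the command is on the current `PATH`; `have_cmd` is a hypothetical helper name. `mount.cifs` usually lives in `/sbin`, so a non-root shell may not find it.

```shell
# Hypothetical helper: succeed if the given command is found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd mount.cifs; then
  echo "mount.cifs found"
else
  echo "mount.cifs not found (on Ubuntu it ships in the cifs-utils package)"
fi
```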