jlesage / docker-crashplan-pro

Docker container for CrashPlan PRO (aka CrashPlan for Small Business)
MIT License
297 stars 38 forks source link

2.16.0 worked - 2.16.1 is failing #341

Closed dav1n closed 2 years ago

dav1n commented 2 years ago

I just upgraded docker-crashplan-pro on my QNAP from 2.16.0 (which was running for few several days) to 2.16.1 - and now it's not working.

The processes do start, and are using resources on the QNAP. But connecting to the admin interface on port 5800, times out and gives "Code42 cannot connect to its background service". And on the Code42 website, it shows that this device has now not connected since I did the upgrade.

jlesage commented 2 years ago

What do you have in the container's log (docker logs <container name>) ?

dav1n commented 2 years ago

[s6-init] making user provided files available at /var/run/s6/etc...exited 0. [s6-init] ensuring user provided files have correct perms...exited 0. [fix-attrs.d] applying ownership & permissions fixes... [fix-attrs.d] done. [cont-init.d] executing container initialization scripts... [cont-init.d] 00-app-niceness.sh: executing... [cont-init.d] 00-app-niceness.sh: exited 0. [cont-init.d] 00-app-script.sh: executing... [cont-init.d] 00-app-script.sh: exited 0. [cont-init.d] 00-app-user-map.sh: executing... [cont-init.d] 00-app-user-map.sh: exited 0. [cont-init.d] 00-clean-logmonitor-states.sh: executing... [cont-init.d] 00-clean-logmonitor-states.sh: exited 0. [cont-init.d] 00-clean-tmp-dir.sh: executing... [cont-init.d] 00-clean-tmp-dir.sh: exited 0. [cont-init.d] 00-set-app-deps.sh: executing... [cont-init.d] 00-set-app-deps.sh: exited 0. [cont-init.d] 00-set-home.sh: executing... [cont-init.d] 00-set-home.sh: exited 0. [cont-init.d] 00-take-config-ownership.sh: executing... [cont-init.d] 00-take-config-ownership.sh: exited 0. [cont-init.d] 00-xdg-runtime-dir.sh: executing... [cont-init.d] 00-xdg-runtime-dir.sh: exited 0. [cont-init.d] 10-certs.sh: executing... [cont-init.d] 10-certs.sh: exited 0. [cont-init.d] 10-cjk-font.sh: executing... [cont-init.d] 10-cjk-font.sh: exited 0. [cont-init.d] 10-nginx.sh: executing... [cont-init.d] 10-nginx.sh: exited 0. [cont-init.d] 10-vnc-password.sh: executing... [cont-init.d] 10-vnc-password.sh: exited 0. [cont-init.d] 10-web-index.sh: executing... [cont-init.d] 10-web-index.sh: exited 0. [cont-init.d] crashplan-pro.sh: executing... [cont-init.d] crashplan-pro.sh: setting CrashPlan Engine maximum memory to 8192m [cont-init.d] crashplan-pro.sh: exited 0. [cont-init.d] validate_max_mem.sh: executing... [cont-init.d] validate_max_mem.sh: exited 0. [cont-init.d] done. [services.d] starting services [services.d] starting s6-fdholderd... [services.d] starting CrashPlanEngine... [CrashPlanEngine] starting... [services.d] starting certsmonitor... [services.d] starting nginx... [certsmonitor] starting... [nginx] starting... [services.d] starting xvfb... [xvfb] starting... The XKEYBOARD keymap compiler (xkbcomp) reports:

Internal error: Could not resolve keysym XF86FullScreen Errors from xkbcomp are not fatal to the X server [services.d] starting logmonitor... [services.d] starting openbox... [logmonitor] starting... [openbox] starting... The XKEYBOARD keymap compiler (xkbcomp) reports: Internal error: Could not resolve keysym XF86FullScreen Errors from xkbcomp are not fatal to the X server [services.d] starting x11vnc... [services.d] starting statusmonitor... [x11vnc] starting... 21/12/2021 21:25:15 passing arg to libvncserver: -rfbport 21/12/2021 21:25:15 passing arg to libvncserver: 5900 21/12/2021 21:25:15 passing arg to libvncserver: -rfbportv6 21/12/2021 21:25:15 passing arg to libvncserver: -1 21/12/2021 21:25:15 passing arg to libvncserver: -httpportv6 21/12/2021 21:25:15 passing arg to libvncserver: -1 21/12/2021 21:25:15 passing arg to libvncserver: -desktop 21/12/2021 21:25:15 passing arg to libvncserver: CrashPlan for Small Business 21/12/2021 21:25:15 Setting -localhost in -stunnel mode. 21/12/2021 21:25:15 x11vnc version: 0.9.14 lastmod: 2015-11-14 pid: 5386 21/12/2021 21:25:15 Using X display :0 21/12/2021 21:25:15 rootwin: 0x43 reswin: 0x400001 dpy: 0x17676b40 21/12/2021 21:25:15 21/12/2021 21:25:15 ------------------ USEFUL INFORMATION ------------------ [services.d] starting app... [statusmonitor] starting... 21/12/2021 21:25:15 X DAMAGE available on display, using it for polling hints. 21/12/2021 21:25:15 To disable this behavior use: '-noxdamage' 21/12/2021 21:25:15 21/12/2021 21:25:15 Most compositing window managers like 'compiz' or 'beryl' 21/12/2021 21:25:15 cause X DAMAGE to fail, and so you may not see any screen 21/12/2021 21:25:15 updates via VNC. Either disable 'compiz' (recommended) or 21/12/2021 21:25:15 supply the x11vnc '-noxdamage' command line option. 21/12/2021 21:25:15 X COMPOSITE available on display, using it for window polling. 21/12/2021 21:25:15 To disable this behavior use: '-noxcomposite' 21/12/2021 21:25:15 21/12/2021 21:25:15 Wireframing: -wireframe mode is in effect for window moves. 21/12/2021 21:25:15 If this yields undesired behavior (poor response, painting 21/12/2021 21:25:15 errors, etc) it may be disabled: 21/12/2021 21:25:15 - use '-nowf' to disable wireframing completely. 21/12/2021 21:25:15 - use '-nowcr' to disable the Copy Rectangle after the 21/12/2021 21:25:15 moved window is released in the new position. 21/12/2021 21:25:15 Also see the -help entry for tuning parameters. 21/12/2021 21:25:15 You can press 3 Alt_L's (Left "Alt" key) in a row to 21/12/2021 21:25:15 repaint the screen, also see the -fixscreen option for 21/12/2021 21:25:15 periodic repaints. 21/12/2021 21:25:15 GrabServer control via XTEST. [app] starting CrashPlan for Small Business... 21/12/2021 21:25:15 21/12/2021 21:25:15 Scroll Detection: -scrollcopyrect mode is in effect to 21/12/2021 21:25:15 use RECORD extension to try to detect scrolling windows 21/12/2021 21:25:15 (induced by either user keystroke or mouse input). 21/12/2021 21:25:15 If this yields undesired behavior (poor response, painting 21/12/2021 21:25:15 errors, etc) it may be disabled via: '-noscr' 21/12/2021 21:25:15 Also see the -help entry for tuning parameters. 21/12/2021 21:25:15 You can press 3 Alt_L's (Left "Alt" key) in a row to 21/12/2021 21:25:15 repaint the screen, also see the -fixscreen option for 21/12/2021 21:25:15 periodic repaints. 21/12/2021 21:25:15 21/12/2021 21:25:15 XKEYBOARD: number of keysyms per keycode 7 is greater 21/12/2021 21:25:15 than 4 and 51 keysyms are mapped above 4. 21/12/2021 21:25:15 Automatically switching to -xkb mode. 21/12/2021 21:25:15 If this makes the key mapping worse you can 21/12/2021 21:25:15 disable it with the "-noxkb" option. 21/12/2021 21:25:15 Also, remember "-remap DEAD" for accenting characters. 21/12/2021 21:25:15 21/12/2021 21:25:15 X FBPM extension not supported. Xlib: extension "DPMS" missing on display ":0". 21/12/2021 21:25:15 X display is not capable of DPMS. 21/12/2021 21:25:15 -------------------------------------------------------- 21/12/2021 21:25:15 21/12/2021 21:25:15 Default visual ID: 0x21 21/12/2021 21:25:15 Read initial data from X display into framebuffer. 21/12/2021 21:25:15 initialize_screen: fb_depth/fb_bpp/fb_Bpl 24/32/5120 21/12/2021 21:25:15 21/12/2021 21:25:15 starting ssl tunnel: /usr/bin/stunnel 5900 -> 5950 2021.12.21 21:25:15 LOG5[ui]: stunnel 5.44 on x86_64-pc-linux-gnu platform 2021.12.21 21:25:15 LOG5[ui]: Compiled/running with OpenSSL 1.1.1g 21 Apr 2020 2021.12.21 21:25:15 LOG5[ui]: Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,FIPS,OCSP,PSK,SNI 2021.12.21 21:25:15 LOG5[ui]: Reading configuration from descriptor 3 2021.12.21 21:25:15 LOG5[ui]: UTF-8 byte order mark not detected 2021.12.21 21:25:15 LOG5[ui]: FIPS mode disabled 2021.12.21 21:25:15 LOG5[ui]: Configuration successful [services.d] done. 21/12/2021 21:25:16 stunnel pid is: 5481 21/12/2021 21:25:16 21/12/2021 21:25:16 X display :0 is 32bpp depth=24 true color 21/12/2021 21:25:16 21/12/2021 21:25:16 Listening for VNC connections on TCP port 5950 21/12/2021 21:25:16 21/12/2021 21:25:16 Xinerama is present and active (e.g. multi-head). 21/12/2021 21:25:16 Xinerama: number of sub-screens: 1 21/12/2021 21:25:16 Xinerama: no blackouts needed (only one sub-screen) 21/12/2021 21:25:16 21/12/2021 21:25:16 fb read rate: 754 MB/sec 21/12/2021 21:25:16 fast read: reset -wait ms to: 10 21/12/2021 21:25:16 fast read: reset -defer ms to: 10 21/12/2021 21:25:16 The X server says there are 10 mouse buttons. 21/12/2021 21:25:16 screen setup finished. 21/12/2021 21:25:16

The VNC desktop is: localhost:50

The SSL VNC desktop is: e0772760419d:0 PORT=5950 SSLPORT=5900


Have you tried the x11vnc '-ncache' VNC client-side pixel caching feature yet?

The scheme stores pixel data offscreen on the VNC viewer side for faster retrieval. It should work with any VNC viewer. Try it by running:

x11vnc -ncache 10 ...

One can also add -ncache_cr for smooth 'copyrect' window motion. More info: http://www.karlrunge.com/x11vnc/faq.html#faq-client-caching

[CrashPlanEngine] starting... [CrashPlanEngine] starting... [CrashPlanEngine] starting... [CrashPlanEngine] starting...


But it also outputs to stderr:

docker logs crashplan-pro > logs-debug

s6-svwait: fatal: timed out #

On Tue, Dec 21, 2021 at 7:53 PM Jocelyn Le Sage @.***> wrote:

What do you have in the container's log (docker logs ) ?

— Reply to this email directly, view it on GitHub https://github.com/jlesage/docker-crashplan-pro/issues/341#issuecomment-999197471, or unsubscribe https://github.com/notifications/unsubscribe-auth/AP4ZFY5KX25I3I5AAT2XYW3USEOSLANCNFSM5KQTLVGA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

dav1n commented 2 years ago

I rebooted the QNAP, in case this was somehow a transient issue (which it should not be able to be, because it's a container) - and it made no difference. Same breakage and same log output.

dav1n commented 2 years ago

Any suggestions? I hate leaving my QNAP not backing up.

Alternatively, since I'm fairly docker-naive, is there a way to force it to re-pull 2.16.0, so I can just run that version for now?

Thanks!!

charlesdunbar commented 2 years ago

I'm running into this too, but with a TrueNAS virtual machine running rancherOS (old kernel fun, 4.14.73-rancher is what I have).

Biggest thing I'm seeing is in engine_error.log:

terminate called after throwing an instance of 'boost::filesystem::filesystem_error'
  what():  boost::filesystem::create_directory: File exists: "/usr/local/crashplan/log/native/logs"

Deleting the native folder or changing its permissions don't help, and there may be no real good solution besides somehow updating - https://github.com/boostorg/filesystem/issues/172 and https://github.com/boostorg/filesystem/issues/173

jlesage commented 2 years ago

@dav1n, can you look at engine_error.log (located at log/engine_error.log under the directory you mapped to /config) to see if you have the same error messages?

jlesage commented 2 years ago

Can you try to manually create the directory to see if it helps ?

docker exec <container name> mkdir -p /config/log/native/logs
dav1n commented 2 years ago

Here's what I see in engine_error.log:

terminate called after throwing an instance of 'boost::filesystem::filesystem_error' what(): boost::filesystem::create_directory: File exists: "/usr/local/crashplan/log/native/logs" [main] INFO org.reflections.Reflections - Reflections took 1535 ms to scan 40 urls, producing 1917 keys and 12072 values WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by com.code42.crypto.jce.ec.EcCurveLookup (file:/usr/local/crashplan/lib/c42-crypto-impl-15.2.3.jar) to method sun.security.util.CurveDB.lookup(java.security.spec.ECParameterSpec) WARNING: Please consider reporting this to the maintainers of com.code42.crypto.jce.ec.EcCurveLookup WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release

jlesage commented 2 years ago

If you do a tail -f /config/log/engine_error.log, you should see recurring messages. Do you see that the boost error comes last ? If not, it may not be the reason why the engine is not starting. Also, note that the 2.16.0 version has the same boost library version.

dav1n commented 2 years ago

tail -f seems to loop that same chunk as I posted:

... terminate called after throwing an instance of 'boost::filesystem::filesystem_error' what(): boost::filesystem::create_directory: File exists: "/usr/local/crashplan/log/native/logs" [main] INFO org.reflections.Reflections - Reflections took 1567 ms to scan 40 urls, producing 1917 keys and 12072 values WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by com.code42.crypto.jce.ec.EcCurveLookup (file:/usr/local/crashplan/lib/c42-crypto-impl-15.2.3.jar) to method sun.security.util.CurveDB.lookup(java.security.spec.ECParameterSpec) WARNING: Please consider reporting this to the maintainers of com.code42.crypto.jce.ec.EcCurveLookup WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release terminate called after throwing an instance of 'boost::filesystem::filesystem_error' what(): boost::filesystem::create_directory: File exists: "/usr/local/crashplan/log/native/logs" [main] INFO org.reflections.Reflections - Reflections took 1480 ms to scan 40 urls, producing 1917 keys and 12072 values WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by com.code42.crypto.jce.ec.EcCurveLookup (file:/usr/local/crashplan/lib/c42-crypto-impl-15.2.3.jar) to method sun.security.util.CurveDB.lookup(java.security.spec.ECParameterSpec) WARNING: Please consider reporting this to the maintainers of com.code42.crypto.jce.ec.EcCurveLookup WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations WARNING: All illegal access operations will be denied in a future release terminate called after throwing an instance of 'boost::filesystem::filesystem_error' what(): boost::filesystem::create_directory: File exists: "/usr/local/crashplan/log/native/logs" ...

dav1n commented 2 years ago

And I did try that docker exec mkdir, and it made no difference.

jlesage commented 2 years ago

Yes, tail -f will periodically display the same block of messages, but what is the last message before you get a "pause" in the output?

Also, please provide the output of:

docker exec <container name> /config/log/native/
dav1n commented 2 years ago

Ok.... so what I did was to mv the existing logs directory aside. (Because the error was not that it could not create the dir, but that the dir already existed.) And I did it directly in the filesystem, and not via docker exec:

So in /share/CACHEDEV1_DATA/Container/appdata/crashplan-pro/config/log/native I did: mv logs logs-old

And then the errors stopped, and the engine must have started, because Crashplan is now running. And it did, as expected, create a new logs directory, and inside it is: [/share/CACHEDEV1_DATA/Container/appdata/crashplan-pro/config/log/native/logs] # cat 20211223_00.log [2021-12-23 11:17:30.990590] [00:00:05.238263] [INFO ] (UnknownModule) Initializing uaw module, rootInstall=true [2021-12-23 11:17:31.014027] [00:00:05.261702] [INFO ] (UnknownModule) Initial Configuration:device_uaw_inactiveDurationInSec=608s. [2021-12-23 11:17:31.063791] [00:00:05.311465] [INFO ] (UnknownModule) Configuration:device_uaw_inactiveDurationInSec is now 60s [2021-12-23 11:17:31.064292] [00:00:05.312035] [INFO ] (uaw) Starting the uaw module. [2021-12-23 11:17:31.064623] [00:00:05.312294] [WARN ] (uaw) Unable to watch user activity. filesystem error: directory iterator cannot open directory: No such file or directory [/dev/input/by-path] [2021-12-23 11:17:31.079938] [00:00:05.327612] [ERROR] (uaw) Stopping the uaw module early, failed to start the watcher. (In case that matters/helps)

However, I expect/fear that next time this restarts, it's going to fail in the same way, until I again mv that directory aside (or rm it).

dav1n commented 2 years ago

(So it seems that boost::filesystem::create_directory is missing the equivalent of a "-p" on its call to mkdir)

jlesage commented 2 years ago

Or it may be that there was a file at the location of the directory.

Do you still have your old log directory? Can you do a ls -Ral on it ?

dav1n commented 2 years ago

Here is the full listing from "native" down:

ls -lRa

.: total 16 drwxr-xr-x 4 admin administrators 4096 2021-12-23 11:17 ./ drwxr-xr-x 4 admin administrators 4096 2021-12-23 11:07 ../ drwxr-xr-x 2 admin administrators 4096 2021-12-23 11:17 logs/ drwxr-xr-x 2 admin administrators 4096 2021-12-18 12:34 logs-old/

./logs: total 12 drwxr-xr-x 2 admin administrators 4096 2021-12-23 11:17 ./ drwxr-xr-x 4 admin administrators 4096 2021-12-23 11:17 ../ -rw-r--r-- 1 admin administrators 780 2021-12-23 11:17 20211223_00.log

./logs-old: total 12 drwxr-xr-x 2 admin administrators 4096 2021-12-18 12:34 ./ drwxr-xr-x 4 admin administrators 4096 2021-12-23 11:17 ../ -rw-r--r-- 1 admin administrators 1006 2021-12-18 12:34 20211218_00.log

I'll note again that the logged failure was on the mkdir, not on the file creation.

jlesage commented 2 years ago

I wanted to verify that /usr/local/crashplan/log/native/logs was not existing as a file. Because if it was the case, performing mkdir on it would produce the same error.

Can you try to restart the container to see if you will produce the same error ?

dav1n commented 2 years ago

Destroyed and restarted the container. And it is again erroring in the same way: what(): boost::filesystem::create_directory: File exists: "/usr/local/crashplan/log/native/logs"

So I will manually mv that dir aside again, so my backups can run.

Thanks!!

jlesage commented 2 years ago

Ok, I will implement a workaround (i.e. automatically remove /usr/local/crashplan/log/native/logs before starting the engine).

charlesdunbar commented 2 years ago

Awesome! I didn't try moving the whole thing 😂

Can confirm I'm also able to get a login prompt after doing the following (from the /path/to/crashplan/log/native folder):

$ mv logs logs-old

With the container still running, it immediately made a new logs folder and I was able to get a prompt.

jlesage commented 2 years ago

I implemented the workaround in the latest Docker image. Let me know if you still see this issue.

dav1n commented 2 years ago

Thanks! I pulled the latest version, 2.16.3, and the engine now starts correctly, without me needing to mv that dir aside.