iv-org / inv_sig_helper

GNU Affero General Public License v3.0
19 stars 4 forks source link

inv_sig_helper sometimes deadlocks or crashes due to malloc corruption #12

Open delgh1 opened 1 month ago

delgh1 commented 1 month ago

After running for a few hours, it stops printing out logs and cause invidious "502 gateway timeout" when loading the video page. Loading things like trending, popular, thumbnails seems to be fine.

Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Decoder length: 0
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Signature to be decrypted: hvEiC7wLuPImbdjKN
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Decoder length: 24
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Received job: DecryptNSignature
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Decoder length: 0
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Signature to be decrypted: hvEiC7wLuPImbdjKN
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Decoder length: 24
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Received job: DecryptNSignature
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Decoder length: 0
Aug 12 08:17:01 invidious2 inv_sig_helper[65568]: Signature to be decrypted: hvEiC7wLuPImbdjKN
Aug 12 14:19:47 invidious2 systemd[1]: Stopping inv_sig_helper.service - inv_sig_helper...
Aug 12 14:19:47 invidious2 systemd[1]: inv_sig_helper.service: Deactivated successfully.
Aug 12 14:19:47 invidious2 systemd[1]: Stopped inv_sig_helper.service - inv_sig_helper.
Aug 12 14:19:47 invidious2 systemd[1]: inv_sig_helper.service: Consumed 4min 1.999s CPU time.
Aug 12 14:19:47 invidious2 systemd[1]: Started inv_sig_helper.service - inv_sig_helper.
Aug 12 14:19:47 invidious2 inv_sig_helper[69579]: Fetching player
Aug 12 14:19:48 invidious2 inv_sig_helper[69579]: Fetching player JS URL: https://www.youtube.com/s/player/28fd7348/player_ias.vflset/en_US/base.js
Aug 12 14:19:48 invidious2 inv_sig_helper[69579]: nsig function ending did not work: =\s*function(\([\w]+\)\{\s*var\s+[\w\s]+=[\w\.\s]+?\.call\s*\([\w\s$]+?,[\(\)\",\s]+\)[\S\s]*?\}\s*return [\w\.\s$]+?\.call\s*\([\w\s$]+?\s*,[\(\)\",\s]+\)\s*\}\s*;)
Aug 12 14:19:48 invidious2 inv_sig_helper[69579]: sig code: var $Ba;var OK={ZE:function(a,b){a.splice(0,b)},
Aug 12 14:19:48 invidious2 inv_sig_helper[69579]: Oy:function(a){a.reverse()},
Aug 12 14:19:48 invidious2 inv_sig_helper[69579]: fn:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b%a.length]=c}};$Ba=function(a){a=a.split("");OK.Oy(a,33);OK.ZE(a,1);OK.fn(a,10);OK.fn(a,7);OK.Oy(a,37);OK.fn(a,43);return a.join("")}
Aug 12 14:19:48 invidious2 inv_sig_helper[69579]: Successfully fetched player
Aug 12 14:20:34 invidious2 inv_sig_helper[69579]: Decoder length: 5
Aug 12 14:20:34 invidious2 inv_sig_helper[69579]: Received job: GetSignatureTimestamp
Aug 12 14:20:34 invidious2 inv_sig_helper[69579]: Decoder length: 0
Aug 12 14:20:35 invidious2 inv_sig_helper[69579]: Decoder length: 108
Aug 12 14:20:35 invidious2 inv_sig_helper[69579]: Received job: DecryptSignature
Aug 12 14:20:35 invidious2 inv_sig_helper[69579]: Decoder length: 0
Aug 12 14:20:35 invidious2 inv_sig_helper[69579]: Decoder length: 24

If I grep "Started" in the log, there was also a restart before:

Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: Signature to be decrypted: AyU-XOU6YY4GjBTMp
Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: Signature to be decrypted: AyU-XOU6YY4GjBTMp
Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: Signature to be decrypted: AyU-XOU6YY4GjBTMp
Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: Signature to be decrypted: AyU-XOU6YY4GjBTMp
Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: Signature to be decrypted: AyU-XOU6YY4GjBTMp
Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: Signature to be decrypted: AyU-XOU6YY4GjBTMp
Aug 12 05:39:02 invidious2 inv_sig_helper[56683]: malloc(): smallbin double linked list corrupted
Aug 12 05:39:02 invidious2 systemd[1]: inv_sig_helper.service: Main process exited, code=killed, status=6/ABRT
Aug 12 05:39:02 invidious2 systemd[1]: inv_sig_helper.service: Failed with result 'signal'.
Aug 12 05:39:02 invidious2 systemd[1]: inv_sig_helper.service: Consumed 47min 2.388s CPU time.
Aug 12 05:39:04 invidious2 systemd[1]: inv_sig_helper.service: Scheduled restart job, restart counter is at 1.
Aug 12 05:39:04 invidious2 systemd[1]: Stopped inv_sig_helper.service - inv_sig_helper.
Aug 12 05:39:04 invidious2 systemd[1]: inv_sig_helper.service: Consumed 47min 2.388s CPU time.
Aug 12 05:39:04 invidious2 systemd[1]: Started inv_sig_helper.service - inv_sig_helper.
Aug 12 05:39:04 invidious2 inv_sig_helper[65568]: Fetching player
Aug 12 05:39:07 invidious2 inv_sig_helper[65568]: Fetching player JS URL: https://www.youtube.com/s/player/28fd7348/player_ias.vflset/en_US/base.js
Aug 12 05:39:09 invidious2 inv_sig_helper[65568]: nsig function ending did not work: =\s*function(\([\w]+\)\{\s*var\s+[\w\s]+=[\w\.\s]+?\.call\s*\([\w\s$]+?,[\(\)\",\s]+\)[\S\s]*?\}\s*return [\w\.\s$]+?\.call\s*\([\w\s$]+?\s*,[\(\)\",\s]+\)\s*\}\s*;)
Aug 12 05:39:09 invidious2 inv_sig_helper[65568]: sig code: var $Ba;var OK={ZE:function(a,b){a.splice(0,b)},
Aug 12 05:39:09 invidious2 inv_sig_helper[65568]: Oy:function(a){a.reverse()},
Aug 12 05:39:09 invidious2 inv_sig_helper[65568]: fn:function(a,b){var c=a[0];a[0]=a[b%a.length];a[b%a.length]=c}};$Ba=function(a){a=a.split("");OK.Oy(a,33);OK.ZE(a,1);OK.fn(a,10);OK.fn(a,7);OK.Oy(a,37);OK.fn(a,43);return a.join("")}
Aug 12 05:39:09 invidious2 inv_sig_helper[65568]: Successfully fetched player
Aug 12 05:40:45 invidious2 inv_sig_helper[65568]: Decoder length: 27
Aug 12 05:40:45 invidious2 inv_sig_helper[65568]: Received job: DecryptNSignature
Aug 12 05:40:45 invidious2 inv_sig_helper[65568]: Decoder length: 0
Aug 12 05:40:45 invidious2 inv_sig_helper[65568]: Signature to be decrypted: G1Q0HL-KEBSPJw8nsJzf
Aug 12 05:40:48 invidious2 inv_sig_helper[65568]: Decoder length: 5
Aug 12 05:40:48 invidious2 inv_sig_helper[65568]: Received job: GetSignatureTimestamp
Aug 12 05:40:48 invidious2 inv_sig_helper[65568]: Decoder length: 0
Aug 12 05:40:48 invidious2 inv_sig_helper[65568]: Decoder length: 24
Aug 12 05:40:48 invidious2 inv_sig_helper[65568]: Received job: DecryptNSignature

log1.gz log2.gz

It's built using rustc 1.80 and stripped:

$ file /usr/local/bin/inv_sig_helper
/usr/local/bin/inv_sig_helper: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=fa32acfabd0101754e57f2f4dbaf4ceafca115ac, for GNU/Linux 3.2.0, stripped

systemd unit /etc/systemd/system/inv_sig_helper.service:

[Unit]
Description=inv_sig_helper
After=syslog.target
After=network.target

[Service]
RestartSec=2s
Type=simple
User=invidious
Group=invidious
ExecStart=inv_sig_helper --tcp 127.0.0.1:8080
Restart=always

[Install]
WantedBy=multi-user.target

The setup is all manual in a LXC container, running 5 invidious processes behind nginx. invidious config:

visitor_data: secret
po_token: secret2
signature_server: 127.0.0.1:8080

nginx error log, just one line for example:

2024/08/12 14:19:35 [error] 63264#63264: *54914581 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 2400::, server: invidious.jing.rocks, request: "GET /watch?v=RsbDEyfC_UU&list=PL1329E2FD4B3D9A34&index=12 HTTP/1.1", upstream: "http://127.0.0.1:3003/watch?v=RsbDEyfC_UU&list=PL1329E2FD4B3D9A34&index=12", host: "invidious.jing.rocks"
techmetx11 commented 1 month ago

Thank you for stress-testing my program like this. I'll see how I can fix it

techmetx11 commented 1 month ago

I don't actually know what's causing either the deadlock or the malloc header corruption, there isn't much for me to go off of. Do you have a coredump from when the malloc header corruption happened at least? Assuming that the program is stripped, i'd guess it would be pretty unlikely

delgh1 commented 1 month ago

I don't actually know what's causing either the deadlock or the malloc header corruption, there isn't much for me to go off of. Do you have a coredump from when the malloc header corruption happened at least? Assuming that the program is stripped, i'd guess it would be pretty unlikely

Hmmm. I don't have a coredump (don't know how to get one). It's been ~6 hours since I restarted the service, and it's stable for now. Let's wait for now. If it happens again I will run the unstripped binary and see if I can get a coredump.

SamantazFox commented 1 month ago

@delgh1 can you run inv_sig_helper with RUST_BACKTRACE=full, so that we can hope to get a nice stack trace the next time it crashes?

delgh1 commented 1 month ago

RUST_BACKTRACE=full didn't provide anything useful... But here is the log and coredump

log1.gz log2.gz core.257225.tar.gz inv_sig_helper_rust_amd64.tar.gz

iamDonkey commented 4 weeks ago

I encountered similar situation after hours/several thousands of decryption , the TCP server become non responsive, I had to restart the server.

delgh1 commented 4 weeks ago

recaps from matrix: statically linked binaries seem to work without this issue.

build-deps on debian: libssl-dev, pkgconf, libzstd-dev, cargo ( >= 1.77 ) | rustup

RUSTFLAGS='-C target-feature=+crt-static -lz -lzstd' cargo build --release --target x86_64-unknown-linux-gnu

delgh1 commented 3 weeks ago

recaps from matrix: statically linked binaries seem to work without this issue.

Nope, spoke too soon. Using the static build can only reduce the frequency sig helper crashed.

delgh1 commented 3 weeks ago

Also observed a segfault on both amd64 and arm64, but no rust backtrace output even though RUST_BACKTRACE=full:

Aug 15 21:42:13 invidious1 inv_sig_helper[2684010]: Successfully fetched player
Aug 16 10:46:09 invidious1 systemd[1]: inv_sig_helper.service: Main process exited, code=killed, status=11/SEGV
Aug 16 10:46:09 invidious1 systemd[1]: inv_sig_helper.service: Failed with result 'signal'.
Aug 16 10:46:09 invidious1 systemd[1]: inv_sig_helper.service: Consumed 6min 10.884s CPU time.
Aug 16 10:46:11 invidious1 systemd[1]: inv_sig_helper.service: Scheduled restart job, restart counter is at 1.
Aug 16 10:46:11 invidious1 systemd[1]: Stopped inv_sig_helper.service - inv_sig_helper.

on arm64 it was killed twice by 11 SIGSEGV and once by 7 SIGBUS

delgh1 commented 3 weeks ago

new core dump:

x86_64: coredump1.tar.gz inv_sig_helper_x86_64_static_glibc.tar.gz

Note: when gdb detached the process, inv_sig_helper output these lines and the process died:

malloc_consolidate(): invalid chunk size
Aborted

aarch64: coredump2.tar.gz inv_sig_helper_aarch64_static_glibc.tar.gz

Note: errh, looking at the size of the binary, I think I built it with cargo --release when gdb detached the process, inv_sig_helper output these lines and the process died:

corrupted double-linked list (not small)
Aborted
delgh1 commented 2 weeks ago

aarch64: SIGABRT after two minutes

Aug 27 19:44:51 invidious5 inv_sig_helper[8968]: Successfully fetched player
Aug 27 19:46:43 invidious5 inv_sig_helper[8968]: malloc_consolidate(): unaligned fastbin chunk detected
Aug 27 19:46:43 invidious5 systemd[1]: inv_sig_helper.service: Main process exited, code=killed, status=6/ABRT
Aug 27 19:46:43 invidious5 systemd[1]: inv_sig_helper.service: Failed with result 'signal'.
Aug 27 19:46:43 invidious5 systemd[1]: inv_sig_helper.service: Consumed 18.524s CPU time.
delgh1 commented 1 week ago

per our discussion on matrix:

core.inv_sig_helper.100108.641bc88e05c44dbdade085dd786b3abc.446859.1725370540000000.reallyzst.gz

Remember to change the file extension to .zst because this crappy site doesn't allow zst files