hoehermann / purple-gowhatsapp

Pidgin/libpurple plugin for WhatsApp Web.
GNU General Public License v3.0
276 stars 34 forks source link

Frequent re-pairing needed when using spectrum2 (was: SIGSEGV after "503 Stream error") #160

Closed heeplr closed 1 year ago

heeplr commented 1 year ago

With revision 60b73987 I get these

[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x7f8b94b238]

goroutine 17 [running, locked to thread]:
go.mau.fi/whatsmeow.(*Client).SubscribePresence(0x4000298800, {{0x4000f448e0, 0xc}, 0x0, 0x0, {0x4000f448ed, 0xe}, 0x0})
        /home/user/purple-gowhatsapp/build/src/go/go/pkg/mod/go.mau.fi/whatsmeow@v0.0.0-20230414083403-821b3b80925a/presence.go:91 +0x58
main.(*Handler).subscribe_presence(0x4000318540, {0x4000f448e0, 0x1b})
        /home/user/purple-gowhatsapp/src/go/presence.go:63 +0x168
main.gowhatsapp_go_subscribe_presence(0x55a91bdd01?, 0x4000002680?)
        /home/user/purple-gowhatsapp/src/go/bridge.go:136 +0x6c
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x7f8f66b238]

goroutine 17 [running, locked to thread]:
go.mau.fi/whatsmeow.(*Client).SubscribePresence(0x400023a800, {{0x40000e06e0, 0xc}, 0x0, 0x0, {0x40000e06ed, 0xe}, 0x0})
        /home/user/purple-gowhatsapp/build/src/go/go/pkg/mod/go.mau.fi/whatsmeow@v0.0.0-20230414083403-821b3b80925a/presence.go:91 +0x58
main.(*Handler).subscribe_presence(0x40002a4e70, {0x40000e06e0, 0x1b})
        /home/user/purple-gowhatsapp/src/go/presence.go:63 +0x168
main.gowhatsapp_go_subscribe_presence(0x55c09d3d01?, 0x4000002680?)
        /home/user/purple-gowhatsapp/src/go/bridge.go:136 +0x6c

After every

May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,278 INFO  libpurple: whatsmeow:[Client/Recv] <stream:error code="503"/>
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,278 INFO  libpurple: whatsmeow:[Client/Recv] <xmlstreamend/>
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,278 INFO  libpurple: whatsmeow:[Client] Received stream end frame
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,278 INFO  libpurple: whatsmeow:[Client] Got 503 stream error, assuming automatic reconnect will handle it
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: whatsmeow:recieved error (subtype 0) for account 0x55c09d3d90 remote (null) (isGroup 0) sender (null) (alias (null), isOutgoing 0) sent 0: [Client/Socket] Error reading from websocket: websocket: close 1006 (abnormal closure): unexpected EOF
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: connection:Connection error on 0x55c09fb970 (reason: 0 description: [Client/Socket] Error reading from websocket: websocket: close 1006 (abnormal closure): unexpected EOF
)
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: whatsmeow:[Client/Socket] Frame websocket read pump exiting 0x4000344580
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: whatsmeow:[Client] Emitting Disconnected event
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: whatsmeow:[Client] Automatically reconnecting after 0s
May  9 22:54:09 server spectrum2[27433]: 27433: 2023-05-09 22:54:09,483 INFO  User: user@domain.com: Disconnecting from legacy network [Client/Socket] Error reading from websocket: websocket: close 1006 (abnormal closure): unexpected EOF
, trying to reconnect automatically.
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: whatsmeow:[Client/Socket] Dialing wss://web.whatsapp.com/ws/chat
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: whatsmeow:[Handler] Event type not handled: &events.Disconnected{}
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: account:Disconnecting account 4912345678@s.whatsapp.net (0x55c09d3d90)
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: connection:Disconnecting connection 0x55c09fb970
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,483 INFO  libpurple: connection:Destroying connection 0x55c09fb970
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,484 ERROR backend: g_log purple_account_disconnect: assertion '!purple_account_is_disconnected(account)' failed
May  9 22:54:09 server spectrum2_libpurple_backend[29619]: 29619: 2023-05-09 22:54:09,484 INFO  libpurple: whatsmeow:[Handler/Profile] WhatsApp session disconnected. Profile picture downloader is shutting down.
hoehermann commented 1 year ago

Thank you for the report. Just yesterday, I encountered a segmentation fault during disconnect. Might be the same issue. It looks like a race-condition where Spectrum, purple and whatsmeow are not fully agreeing on whether the connection is still active or not. Also, I should probably handle that &events.Disconnected{}. I hope I can look into it this Sunday.

hoehermann commented 1 year ago

A bit of research shows that there have been changes related to 503 errors in https://github.com/tulir/whatsmeow/commit/ec411f5. I prepared a190d69, but I had no time to test it, yet.

heeplr commented 1 year ago

Testing dev branch now.

heeplr commented 1 year ago

SIGSEGV and 503 errors are gone, but disconnects still happen 2-3x per day so I have to link again by scanning the QR code:

May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,927 INFO  libpurple: whatsmeow:recieved error (subtype 1) for account 0x557b094d80 remote (null) (isGroup 0) sender (null) (alias (null), isOutgoing 0) sent 0: [Client/Socket] Error reading from websocket: websocket: close 1006 (abnormal closure): unexpected EOF
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,927 INFO  libpurple: connection:Connection error on 0x557b11f5f0 (reason: 16 description: [Client/Socket] Error reading from websocket: websocket: close 1006 (abnormal closure): unexpected EOF)
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,927 INFO  libpurple: whatsmeow:[Client/Socket] Frame websocket read pump exiting 0x4001262c60
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: whatsmeow:[Client] Emitting Disconnected event
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: whatsmeow:recieved error (subtype 0) for account 0x557b094d80 remote (null) (isGroup 0) sender (null) (alias (null), isOutgoing 0) sent 0: Disconnected.
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: account:Disconnecting account 0123456789@s.whatsapp.net (0x557b094d80)
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: connection:Disconnecting connection 0x557b11f5f0
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: connection:Destroying connection 0x557b11f5f0
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: whatsmeow:[Handler/Profile] WhatsApp session disconnected. Profile picture downloader is shutting down.
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: whatsmeow:[Client] Automatically reconnecting after 0s
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,928 INFO  libpurple: whatsmeow:[Client/Socket] Dialing wss://web.whatsapp.com/ws/chat
May 15 20:16:46 server spectrum2[7214]: 7214: 2023-05-15 20:16:46,929 INFO  User: user@example.com: Disconnecting from legacy network [Client/Socket] Error reading from websocket: websocket: close 1006 (abnormal closure): unexpected EOF, trying to reconnect automatically.
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,932 ERROR backend: g_log purple_account_disconnect: assertion '!purple_account_is_disconnected(account)' failed
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,975 INFO  backend: Using previously created account with name '0123456789@s.whatsapp.net' and protocol 'prpl-hehoe-whatsmeow'
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,975 INFO  libpurple: account:Connecting to account 0123456789@s.whatsapp.net.
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,975 INFO  libpurple: connection:Connecting. gc = 0x557b14bdc0
May 15 20:16:46 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:46,975 INFO  libpurple: whatsmeow:No proxy set in purple. The go runtime might pick up the https_proxy environment variable regardless.
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,240 INFO  libpurple: whatsmeow:[Database] sqlite3 connecting to file:/var/lib/spectrum2/whatsuck.example.com/whatsmeow.db?_foreign_keys=on&_busy_timeout=3000
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,240 INFO  libpurple: whatsmeow:[Client/Socket] Dialing wss://web.whatsapp.com/ws/chat
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,240 INFO  libpurple: whatsmeow:[Client/Socket] Frame websocket read pump starting 0x400020c9a0
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,240 INFO  libpurple: whatsmeow:[Client/Socket] Frame websocket read pump starting 0x4001327130
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,437 INFO  libpurple: whatsmeow:[Client/Recv] <iq from="s.whatsapp.net" id="585916416" type="set" xmlns="md"><pair-device><ref>...</ref><ref>...</ref><ref>...</ref><ref>...</ref><ref>...</ref><ref>...</ref></pair-device></iq>
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,437 INFO  libpurple: whatsmeow:[Client/Send] <iq id="585916416" to="s.whatsapp.net" type="result"/>
May 15 20:16:47 server spectrum2_libpurple_backend[7872]: 7872: 2023-05-15 20:16:47,469 INFO  libpurple: whatsmeow:recieved login (subtype 0) for account 0xffffffffff remote (null) (isGroup 0) sender (null) (alias Scan this code to log in:<br/>...<br/>

That renders it unusable for me but I suppose it's not related to purple-gowhatsapp and most probably not to this issue, so closing this.

hoehermann commented 1 year ago

Disconnects happen to me, too. However, automatic re-connection works for me. The credentials are stored. Neither Pidgin nor Spectrum should ask for the QR code scan over and over again. I hope we can find out what is going on.

heeplr commented 1 year ago

If you have any hint on how to further debug this, I'm all ears.

heeplr commented 1 year ago

btw. I also get a unpaired & a new pairing QR code when I simply restart spectrum2. When I use pidgin without spectrum2, I can reconnect without re-pairing.

Maybe that gives additional clues.

hoehermann commented 1 year ago

To purple (and Spectrum) WhatsApp (and other mobile-first messengers) do something weird: They do not ask the user for a password. Instead, they supply the credentials during the first login procedure. So instead of reading the credentials, the plug-in must set them. That has caused quite a bit of headache. In Spectrum more than Pidgin since Spectrum is a multi-user service. The way Spectrum handles the account information has been changed in https://github.com/SpectrumIM/spectrum2/commit/ee89203. From the commit message, I can tell that purple's file accounts.xml was previously disregarded (and deleted). But now is important. Does the accounts.xml exist in your set-up? After pairing, there should be an additional line containing the WhatsApp specific credentials.

heeplr commented 1 year ago

Thank you for the hint. I think we're on to something here.

Yes, the file exists and the <setting name='credentials' type='string'> ... </setting> line aswell. It changes everytime I'm having to pair again. I'm using the latest spectrum release (2.2.0) so I believe the file doesn't get deleted by spectrum.

I found something interesting: Credentials are in the form 49xxxxxx.0:nn@s.whatsapp.net|hhhhhhhh nn seems to be an incremental number corresponding to the amounts I had to repair. (you most probably know, just for the record)

Could it be that it get's initialized somewhere during startup, overwriting that nn number with 1? That would render the credentials invalid and could explain a failed login and thus triggering the repairing?

EDIT:

i found this in the logs:

May 22 09:15:50 server spectrum2_libpurple_backend[28403]: 28403: 2023-05-22 09:15:50,260 INFO  libpurple: util:Writing file accounts.xml to directory /var/lib/spectrum2/whatsapp.example.com
May 22 09:15:50 server spectrum2_libpurple_backend[28403]: 28403: 2023-05-22 09:15:50,261 INFO  libpurple: util:Writing file /var/lib/spectrum2/whatsapp.example.com/accounts.xml

Where do the credentials come from, that are written in the first line of the logs? (the one with nn=1)

EDIT2: This would also explain why many people don't experience my problem: If you never have to repair after the first time, nn stays 1 and the credentials remain valid.

heeplr commented 1 year ago

Ok, I found the problem: The credentials are not updated in the sqlite database.

When I manually update it with my current credentials from the accounts.xml file:

sqlite /var/lib/spectrum2/whatsapp.example.com/database.sql
sqlite> DELETE from users_settings where var = 'credentials';
sqlite> INSERT INTO users_settings VALUES(1,'credentials',13,'49xxxxxx.0:nn@s.whatsapp.net|hhhhhhhh');

I can restart spectrum2 willy-nilly without having to scan the QR code again. \o/ *yay*

I hope this is easy to fix. (Not creating a new issue since you most probably read this.) Thanks again for helping me to find the issue.

heeplr commented 1 year ago

(reopening, so I get notified about the fix)

hoehermann commented 1 year ago

Good to hear you found a workaround. 👍

The bad news is that I do not interfere with the database file explicitly – hence I cannot really do anything about the behaviour.

This comment says, I do not need to save the credentials: https://github.com/tulir/whatsmeow/blob/ba5da01/store/sqlstore/container.go#LL228C60-L228C60 Indeed, the function is called when pairing succeeds: https://github.com/tulir/whatsmeow/blob/ba5da01/pair.go#L158

I have no idea why it works in Pidgin, but does not work in Spectrum.

heeplr commented 1 year ago

I just chatted with the spectrum people. Turns out, the "credentials" setting in the database most probably was a stale artefact from pairing with an older spectrum version.

When I just delete it from the db, everything works as expected. So closing this.

Thanks again!