42wim / matterircd

Connect to your mattermost or slack using your IRC-client of choice.
MIT License
294 stars 61 forks source link

Messages still showing out of order #426

Closed hloeung closed 1 year ago

hloeung commented 3 years ago

Hi,

Thought it was fixed with https://github.com/42wim/matterircd/pull/419 but it looks like it's still happening (as @axinojolais mentioned a few days ago):

Test one
```
Test two
Test three
```
Test four
Test five

With that, you can see that messages are received out of order below:

MAV2eqW7tg

Even without code blocks, for example:

Test one
Test two
Test three
Test four
Test five

Produces this:

U4OghjXOga

42wim commented 3 years ago

Possible to run with debug and paste the output?

hloeung commented 3 years ago

Sure, planning to try reproduce it when I have time and see if it's easily fixed. It may even be to do with the introduction of an additional IRC client/proxy (for me, it's irssi client -> irssi proxy -> matterircd. For @axinojolais, it's irssi client -> bip (?!?) -> matterircd).

FWIW, I've been running matterircd reverting commit 2c3883a69606e2c0e0d9cc3d22f06fae14d5050d and so far haven't noticed out-of-order messages. Anyways, will keep digging.

guilhermepiccoli commented 3 years ago

I am also facing this issue. My setup is hexchat <-> ZNC <-> matterircd. I do have debug enabled...will paste a "redacted" example below of the issue

guilhermepiccoli commented 3 years ago

ORIGINAL / Web-client

ONE TWO THREE FOUR FIVE SIX

MATTERIRCD

[1d4] ONE [1d5] TWO [1d6] THREE [1d7] FIVE [1d8] FOUR [1d9] SIX

Debug logs:

The issues seems to be that messages FOUR and FIVE got "glued" together, it shows as "FOUR\nFIVE" in the debug logs. Then some parsing split them, but throw them in reverse order.

I hope this is useful for the issue debugging! Cheers

redacted_dbg_logs.txt

hloeung commented 3 years ago

So I guess this is just fallout from switching to using buffered channels?

With the fix switching back to unbuffered channels, u.eventChan = make(chan *bridge.Event) or u.eventChan = make(chan *bridge.Event, 0)?

hloeung commented 3 years ago

IIUC, in handleWsActionPost() (https://github.com/42wim/matterircd/blob/master/bridge/mattermost/mattermost.go#L813), a single message with multiple lines is split into multiple messages and put into the buffered channel (https://github.com/42wim/matterircd/pull/383). Unfortunately, per https://stackoverflow.com/a/25795236, with buffered channels, we can't guarantee order of messages (the delivery?).

Correct me if I'm wrong.

42wim commented 3 years ago

Well, we have 1 goroutine handling them and the get put in in order, so that's why I don't understand it or I must be missing something.

@guilhermepiccoli logs above show the issue, a LIFO problem :-) I'm wondering why the FIVE and FOUR got glued together though.

hloeung commented 3 years ago

Well, we have 1 goroutine handling them and the get put in in order, so that's why I don't understand it or I must be missing something

Oh, just edited my previous comment as you were replying :)

I checked and we do indeed have only the 1 goroutine.

guilhermepiccoli commented 3 years ago

Thanks @42wim and @hloeung ! Is there anything else I can provide in order to get us to a resolution? Unfortunately my time is really full the next 2 weeks, or else I'd try to fix myself, hopefully not too complex

42wim commented 3 years ago

@hloeung are you sure you were testing with the version that has #419 in it ? I talked to @guilhermepiccoli and he was using an older version that didn't had the fix.

I also can not reproduce it with current master

hloeung commented 3 years ago

Yeah, definitely using a version with https://github.com/42wim/matterircd/pull/419 applied.

I wasn't able to easily reproduce it locally either. I'll have another go at reproducing it with debug/trace enabled this weekend.

hloeung commented 3 years ago

An update here, still haven't come across this yet and wasn't able to reproduce it myself.

I did stumble on https://github.com/42wim/matterircd/issues/420 with a PR to fix that though so all is not in vein.

42wim commented 3 years ago

Ok, i'm going to close this, if someone can reproduce it, feel free to reopen.

hloeung commented 3 years ago

Seen it today, with the latest matterircd. Sadly, while running without debug/trace logging.

@guilhermepiccoli did you end up sanitising the logs you had?

guilhermepiccoli commented 3 years ago

Hey @42wim and @hloeung ! Finally I was able to sanitize the logs I have, which shows the issue with 0.24.1, trace enabled - hope this is helpful.

ooo_msg_0.24.1-sanitized.txt

guilhermepiccoli commented 3 years ago

Also, can we reopen the issue @42wim ? Saw this issue a lot past few days, but only have the above log collection with trace enabled. Have a bunch with just the debug logs though...

Cheers!

42wim commented 3 years ago

@guilhermepiccoli and closed again.. no idea why it happens, I've now removed buffered channels for mattermost/slack. master should fix it

guilhermepiccoli commented 3 years ago

Thanks a bunch @42wim , gonna try now (will run the artifact generated by the test build of this commit).

axinojolais commented 3 years ago

I'm running with the patch above, and so far, no out-of-order messages.

Thanks !

furai commented 3 years ago

@guilhermepiccoli and closed again.. no idea why it happens, I've now removed buffered channels for mattermost/slack. master should fix it

I'll give it a try as well. I've been experiencing a lot of out of order messages lately.

As to why it got closed (if you were actually asking about it, if not, I'm sorry for providing explanation anyway) - it's because of the commit message. If you put in it keywords like Fixes #21312312 then when it's gets included in the main branch - it will automatically close the mentioned issue. Source.

guilhermepiccoli commented 3 years ago

Unfortunately, I'm still seeing the issues even with @42wim 's patch =/

furai commented 3 years ago

Yeah, I'm also experiencing out of order messages. It's like if they get grouped by mattermost app into one then they appear in wrong order in matterircd. Probably enough to reproduce is to type 2 messages in quick succession on mattermost web or maybe inside the mattermost app. If they get grouped then they appear in IRC in reversed order.

furai commented 3 years ago

In web UI: image

On IRC: image

I'm using latest release build

furai commented 3 years ago

Adding to the above - the messages sent by this user were most likely sent using desktop mattermost app.

42wim commented 3 years ago

Can the people (@hloeung, @guilhermepiccoli, @furai ) that still have the issue download this special build: https://github.com/42wim/matterircd/pull/445/checks?check_run_id=2985395608 (and select your correct artifact)

https://github.com/42wim/matterircd/suites/3155567672/artifacts/72567237 should be a direct link to the 64-bit linux one.

Now add ootrace=true (same way as trace=true) in your matterircd.toml (so not under the mattermost header) You don't need to run with full --debug, the ootrace option will show info messages with timestamps of every (multiline) message (and content) going to the eventchan and received from the eventchan on the irc side.

Hopefully this way we can see if there's a channel out of order issue or something else, if you get hit by the issue, paste the last OOTRACE lines here.

hloeung commented 3 years ago

Per IRC, I haven't come across or noticed it since the latest patch. I've pulled in the latest with OOTRACE logging anyways. Will update if I notice.

Thanks!

hloeung commented 3 years ago

Okay, here it is:

INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ${MSG1}  to eventChan on 1625646007267419568: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ``` to eventChan on 1625646007267530504: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ${MSG2} to eventChan on 1625646007267579861: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ``` [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007268263014  module=matterircd
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ${MSG1} [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007268319657  module=matterircd
INFO[2021-07-07T08:20:07Z] OOTRACE: sending msg ``` to eventChan on 1625646007272937234: 0
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ${MSG2} [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007273574391  module=matterircd
INFO[2021-07-07T08:20:07Z] OOTRACE: ${USER} sent ${MSG1}
```
${MSG2}
```
INFO[2021-07-07T08:20:07Z] OOTRACE: got msg ``` [@@mpdc9qt9zi88jjqnuartbkc8ec] on 1625646007275923369  module=matterircd
hloeung commented 3 years ago

Another:

INFO[2021-07-07T05:31:43Z] OOTRACE: sending msg ${MSG1} to eventChan on 1625635903100315921: 0
INFO[2021-07-07T05:31:43Z] OOTRACE: sending msg ${MSG2} to eventChan on 1625635903100436639: 0
INFO[2021-07-07T05:31:43Z] OOTRACE: sending msg ${MSG3} to eventChan on 1625635903100489927: 0
INFO[2021-07-07T05:31:43Z] OOTRACE: got msg ${MSG2} [↪@@57z8xex4m78ijfuqke6yk6ha4e] on 1625635903101329460  module=matterircd
INFO[2021-07-07T05:31:43Z] OOTRACE: got msg ${MSG1} [↪@@57z8xex4m78ijfuqke6yk6ha4e] on 1625635903101347667  module=matterircd
INFO[2021-07-07T05:31:43Z] OOTRACE: ${USER} sent ${MSG1}

${MSG2}

${MSG3}

INFO[2021-07-07T05:31:43Z] OOTRACE: got msg ${MSG3} [↪@@57z8xex4m78ijfuqke6yk6ha4e] on 1625635903107285778  module=matterircd
42wim commented 3 years ago

@hloeung does MSG1 MSG2 or MSG3 contain ``` ? or a reaction?

I have no idea btw how this is possible, msg's get set on the channel in order but get read out of order.

42wim commented 3 years ago

@hloeung and other peeps, can you try https://github.com/42wim/matterircd/pull/445/checks?check_run_id=3041968404 ? https://github.com/42wim/matterircd/suites/3211116698/artifacts/74279412 direct linux 64-bit build.

The change here is to remove the buffered channel completely although this shouldn't make any difference because it's getting set to 0 when initializing mattermost/slack but maybe I'm missing something :)

If you're seeing OOTRACE: creating non-buffered eventchan on start-up when connected to mattermost you're running the correct version.

hloeung commented 3 years ago

Yep, upgraded yesterday and am running with the latest patch since then.

[ubuntu@irc-proxy mattermost]$ ./matterircd
INFO[2021-07-12T10:16:20Z] Running version 0.24.3-dev                    module=matterircd                                                                                                                                                              
INFO[2021-07-12T10:16:20Z] WARNING: THIS IS A DEVELOPMENT VERSION. Things may break.  module=matterircd
INFO[2021-07-12T10:16:20Z] Listening on 127.0.0.1:6697                   module=matterircd                                                                                                                                                              
INFO[2021-07-12T10:17:11Z] New connection: 127.0.0.1:44932               module=matterircd
INFO[2021-07-12T10:17:11Z] OOTRACE: creating non-buffered eventchan on start-up  module=matterircd                                                                                                                                                      
slack-go/slack2021/07/12 10:17:11 websocket_managed_conn.go:181: Connecting to RTM
...

I can see a difference in the logs already. Seems to be got, sending, got, sending, got, sending vs. previously it's bunch of gots followed by bunch of sendings.

Will keep watching.

42wim commented 3 years ago

so fixed now? I'll do a release tomorrow then.

hloeung commented 3 years ago

Yeah, I think so. Not noticed any more OOO messages.

hloeung commented 3 years ago

Spoke too soon.

On IRC:

|20:30 <user1> ```bash [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> mount error 110 = Connection timed out [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> msg1 [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> ``` [ui1pfumdkf8nmkwd8zwc46zqch]
|20:30 <user1> $ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest [ui1pfumdkf8nmkwd8zwc46zqch]

Matterircd OOTRACE logs:

INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg msg1 to eventChan on 1626813013329295391: 0
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg ```bash to eventChan on 1626813013329423442: 0                                                                                                                                                          
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg $ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest to eventChan on 1626813013329487053: 0                                                                                                 
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg mount error 110 = Connection timed out to eventChan on 1626813013329544542: 0                                                                                                                           
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg ```bash [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013331397999 module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg msg1 [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013331004892 module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg $ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013331507745  module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg mount error 110 = Connection timed out [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013341611377  module=matterircd
INFO[2021-07-20T20:30:13Z] OOTRACE: sending msg ``` to eventChan on 1626813013345213396: 0
INFO[2021-07-20T20:30:13Z] OOTRACE: user1 sent msg1
```bash
$ sudo mount -t ceph :/testdir /mnt/cephfs -o name=bootstacktest                                                                                                                                                                                        
mount error 110 = Connection timed out
```
INFO[2021-07-20T20:30:13Z] OOTRACE: got msg ``` [@@ui1pfumdkf8nmkwd8zwc46zqch] on 1626813013348128027  module=matterircd

So wtf?!?!?

hloeung commented 3 years ago

Still happening, a lot less frequent, but still seeing it. Most recent being:

INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 39       2 ftpuser to eventChan on 1628810281942359650: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sent     msg 39       2 ftpuser to eventChan on 1628810281948257769: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 40       2 deploy to eventChan on 1628810281948595596: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: got      msg 40       2 deploy [@@3sbkx9gjypgg5d1f7frf79xfxc] on 1628810281953273716  module=matterircd
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: got      msg 39       2 ftpuser [@@3sbkx9gjypgg5d1f7frf79xfxc] on 1628810281954677351  module=matterircd
WARN[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: Message out of order, got msg 39       2 ftpuser [@@3sbkx9gjypgg5d1f7frf79xfxc] but seen 40  module=matterircd
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sent     msg 40       2 deploy to eventChan on 1628810281955665837: 0                                                                                                     INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 41       2 chia to eventChan on 1628810281955869657: 0
INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sent     msg 41       2 chia to eventChan on 1628810281955993984: 0                                                                                                       INFO[2021-08-12T23:18:01Z] OOTRACE: #town-square: 3sbkx9gjypgg5d1f7frf79xfxc: sending  msg 42       2 centos to eventChan on 1628810281956149911: 0

Weird, I'm at a loss at where this would be.

furai commented 3 years ago

Still no fix for this?

42wim commented 2 years ago

@furai well, if we don't know the root cause we can't fix it :(

42wim commented 1 year ago

482 should fix this as we're not splitting the lines from mattermost anymore before sending it to irc

So anyone still interested, please try this PR :)