processone / ejabberd

Robust, Ubiquitous and Massively Scalable Messaging Platform (XMPP, MQTT, SIP Server)
https://www.process-one.net/en/ejabberd/
Other
6.01k stars 1.5k forks source link

mod_mam_sql - Malformed 'peer' field #4083

Closed licaon-kter closed 9 months ago

licaon-kter commented 10 months ago

Environment

[error] <0.XXX.0>@mod_mam_sql:make_archive_el/9:597 Malformed 'peer' field with value '' detected for user MUCNAME@conference.mydomain.tld in table 'archive': invalid JID

Did I miss any new sql change?

Parts of the config:

default_db: sql
sql_type: pgsql
...
new_sql_schema: true
sql_pool_size: 2
...
modules:
  mod_offline:
    access_max_user_messages: max_user_offline_messages
    store_groupchat: true
    use_mam_for_storage: true
...
append_host_config:
  "mydomain.tld":
    modules:
      mod_mam:
        assume_mam_usage: true
        default: always
        clear_archive_on_room_destroy: false
        compress_xml: true
      mod_muc:
        host: "conference.@HOST@"
        access:
          - allow
        access_admin:
          - allow: admin
        access_create: muc_create
        access_persistent: muc_create
        default_room_options:
          mam: true
          allow_subscription: true
        history_size: 1000
prefiks commented 10 months ago

Strange, we do convert those from jid record before storing in sql db, so shouldn't be empty. Would you be able to run query against db and see what select bare_peer from archive where peer = '' limit 10 returns? You probably could also make it work from ejabberd debug console by executing ejabberd_sql:sql_query(<<"my.domain">>, <<"select bare_peer from archive where peer = '' limit 10">>).

licaon-kter commented 10 months ago
{selected,[<<"bare_peer">>],
          [[<<>>],
           [<<>>],
           [<<>>],
           [<<>>],
           [<<>>],
           [<<>>],
           [<<>>],
           [<<>>],
           [<<>>],
           [<<>>]]}
prefiks commented 10 months ago

Hm, so is not that peer column got corrupted somehow but it looks like as both peer and bare_peer were probably set at this same time. Could you also run ejabberd_sql:sql_query(<<"my.domain">>, <<"select * from archive where peer = '' limit 5">>). . This will give message payload, so probably not that safe to share, but if you could even show me that with removed text content, maybe it will help with figuring out where this may be generated.

licaon-kter commented 10 months ago
{selected,[<<"username">>,<<"server_host">>,<<"timestamp">>,
           <<"peer">>,<<"bare_peer">>,<<"xml">>,<<"txt">>,<<"id">>,
           <<"kind">>,<<"nick">>,<<"created_at">>],
          [[<<"privatemucname@conference.mydomain.tld">>,
            <<"mydomain.tld">>,<<"1692743153147123">>,<<>>,<<>>,
            <<1,6,10,36,118,105,99,116,111,114,64,99,111,110,118,111,
              114,98,46,...>>,
            <<>>,<<"3547739">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:25:53.267192">>],
           [<<"privatemucname@conference.mydomain.tld">>,<<"mydomain.tld">>,
            <<"1692743349872870">>,<<>>,<<>>,
            <<1,6,10,46,101,108,101,110,97,99,111,110,115,116,97,110,
              116,105,...>>,
            <<>>,<<"3547787">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:29:09.874159">>],
           [<<"privatemucname@conference.mydomain.tld">>,<<"mydomain.tld">>,
            <<"1692743401763992">>,<<>>,<<>>,
            <<1,6,10,37,115,105,108,118,105,117,99,64,99,111,110,118,
              111,...>>,
            <<>>,<<"3547792">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:30:01.766409">>],
           [<<"privatemucname@conference.mydomain.tld">>,<<"mydomain.tld">>,
            <<"1692743701917254">>,<<>>,<<>>,
            <<1,6,10,40,97,108,101,120,97,110,100,114,97,98,64,99,...>>,
            <<>>,<<"3547797">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:35:01.918125">>],
           [<<"privatemucname@conference.mydomain.tld">>,<<"mydomain.tld">>,
            <<"1692745182998827">>,<<>>,<<>>,
            <<1,6,10,39,97,110,100,121,112,101,116,56,50,64,99,...>>,
            <<>>,<<"3547805">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:59:43.000249">>]]}

...if I set a higher limit I'll see other privatemucname2 and so on.... but over 26 it's the same output

prefiks commented 10 months ago

This is something recent right? You didn't seen this in older versions?

Could you also run rp(ejabberd_sql:sql_query(<<"my.domain">>, <<"select * from archive where peer = '' limit 3">>))., this will allow me to see what kind of message triggers this.

licaon-kter commented 10 months ago

Yup, I updated from 23.04.52 (not sure which commit is this) to 23.04.90 (https://github.com/processone/ejabberd/commit/00c76003cbd3ad81762bd581e878f15b6fc51036)

{selected,[<<"username">>,<<"server_host">>,<<"timestamp">>,
           <<"peer">>,<<"bare_peer">>,<<"xml">>,<<"txt">>,<<"id">>,
           <<"kind">>,<<"nick">>,<<"created_at">>],
          [[<<"privatemucname@conference.mydomain.tld">>,
            <<"mydomain.tld">>,<<"1692743153147123">>,<<>>,<<>>,
            <<1,6,10,36,118,105,99,116,111,114,64,99,111,110,118,111,
              114,98,46,105,109,47,67,111,110,118,101,114,115,97,116,
              105,111,110,115,46,121,54,84,117,3,12,2,31,2,1,5,99,97,
              108,100,63,4,4>>,
            <<>>,<<"3547739">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:25:53.267192">>],
           [<<"privatemucname@conference.mydomain.tld">>,<<"mydomain.tld">>,
            <<"1692743349872870">>,<<>>,<<>>,
            <<1,6,10,46,101,108,101,110,97,99,111,110,115,116,97,110,
              116,105,110,97,64,99,111,110,118,111,114,98,46,105,109,
              47,67,111,110,118,101,114,115,97,116,105,111,110,115,46,
              54,49,45,112,3,12,2,31,2,1,5,99,97,108,100,63,4,4>>,
            <<>>,<<"3547787">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:29:09.874159">>],
           [<<"privatemucname@conference.mydomain.tld">>,<<"mydomain.tld">>,
            <<"1692743401763992">>,<<>>,<<>>,
            <<1,6,10,37,115,105,108,118,105,117,99,64,99,111,110,118,
              111,114,98,46,105,109,47,67,111,110,118,101,114,115,97,
              116,105,111,110,115,46,73,67,110,107,3,12,2,31,2,1,5,99,
              97,108,100,63,4,4>>,
            <<>>,<<"3547792">>,<<"groupchat">>,<<"Nick">>,
            <<"2023-08-22 22:30:01.766409">>]]}
ok
prefiks commented 10 months ago

This is from ffa07c649ba185afbc79e00352b0f9bff9865a53.

prefiks commented 10 months ago

Could you try this patch:

From c004f7355baa5f8325811d88fc36481876549853 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pawe=C5=82=20Chmielowski?= <pawel@process-one.net>
Date: Thu, 24 Aug 2023 11:54:26 +0200
Subject: [PATCH] Don't always store messages passed through muc_filter_message

Recently we added new places where we call muc_filter_message to add
occupandid info to messages, but this also made them be stored in archive
as mod_mam uses that hook for getting sent messages - in case of those
messages we shouldn't be doing this.

This should fix issue #4083
---
 src/mod_mam.erl      |  2 ++
 src/mod_muc_room.erl | 10 +++++-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/mod_mam.erl b/src/mod_mam.erl
index 77dce5416..58e400ade 100644
--- a/src/mod_mam.erl
+++ b/src/mod_mam.erl
@@ -465,6 +465,8 @@ offline_message({_Action, #message{from = Peer, to = To} = Pkt} = Acc) ->

 -spec muc_filter_message(message(), mod_muc_room:state(),
             binary()) -> message().
+muc_filter_message(#message{meta = #{mam_ignore := true}} = Pkt, _MUCState, _FromNick) ->
+    Pkt;
 muc_filter_message(#message{from = From} = Pkt,
           #state{config = Config, jid = RoomJID} = MUCState,
           FromNick) ->
diff --git a/src/mod_muc_room.erl b/src/mod_muc_room.erl
index b6483ddab..e12b8aa66 100644
--- a/src/mod_muc_room.erl
+++ b/src/mod_muc_room.erl
@@ -625,7 +625,7 @@ normal_state({route, ToNick,
                                     Packet2 = xmpp:set_subtag(Packet, X),
                                     case ejabberd_hooks:run_fold(muc_filter_message,
                                                                  StateData#state.server_host,
-                                                                 Packet2,
+                                xmpp:put_meta(Packet2, mam_ignore, true),
                                                                  [StateData, FromNick]) of
                                         drop ->
                                             ok;
@@ -3016,7 +3016,7 @@ send_subject(JID, #state{subject_author = {Nick, AuthorJID}} = StateData) ->
              to = JID, type = groupchat, subject = Subject},
     case ejabberd_hooks:run_fold(muc_filter_message,
                                  StateData#state.server_host,
-                                 Packet,
+                                 xmpp:put_meta(Packet, mam_ignore, true),
                                  [StateData, Nick]) of
         drop ->
             ok;
@@ -5169,9 +5169,9 @@ process_iq_moderate(From, #iq{type = set, lang = Lang},
                      ]}]},
                {FromNick, _Role} = get_participant_data(From, StateData),
                     Packet = ejabberd_hooks:run_fold(muc_filter_message,
-                                                                 StateData#state.server_host,
-                                                                 Packet0,
-                                                                 [StateData, FromNick]),
+                            StateData#state.server_host,
+                            xmpp:put_meta(Packet0, mam_ignore, true),
+                            [StateData, FromNick]),
            send_wrapped_multiple(JID,
                      get_users_and_subscribers_with_node(?NS_MUCSUB_NODES_MESSAGES, StateData),
                      Packet, ?NS_MUCSUB_NODES_MESSAGES, StateData),
-- 
2.41.0

You can delete problematic messages with ejabberd_sql:sql_query(<<"my.domain">>, <<"delete from archive where peer = ''>>)., you shouldn't loose anything as those message do contain just subject sent to user or room join.

prefiks commented 10 months ago

And i guess you can trigger this by joining and leaving room couple times.

licaon-kter commented 10 months ago

So, what's the worflow?

prefiks commented 10 months ago

Yep

prefiks commented 10 months ago

Also you could try running rp(ejabberd_sql:sql_query(<<"my.domain">>, <<"select * from archive where peer = '' limit 3">>)). after some time and see if there is no new messages like that.

licaon-kter commented 10 months ago

ejabberd_sql:sql_query(<<"my.domain">>, <<"delete from archive where peer = ''>>). appears to not run, as if it is incomplete, pressing enter I get a new line with ..

prefiks commented 10 months ago

Make it ejabberd_sql:sql_query(<<"my.domain">>, <<"delete from archive where peer = ''">>).

licaon-kter commented 10 months ago

Ah, I was looking carefully but did not notice the missing ", anyway, result is: {updated,854} :+1:

Will update asap

licaon-kter commented 9 months ago

Not sure I saw this since.