esl / MongooseIM

MongooseIM is Erlang Solutions' robust, scalable and efficient XMPP server, aimed at large installations. Specifically designed for enterprise purposes, it is fault-tolerant and can utilise the resources of multiple clustered machines.

gen_server 'wpool_pool-mongoose_wpool$rdbms$global$default-1' terminated with reason: {{{integer_overflow,int8,957926528636579656073}, #4222

Open sahkaa opened 3 months ago

sahkaa commented 3 months ago

MongooseIM version: (put the version)
Installed from: source
Erlang/OTP version: 25.3.2.6
Last source commit: 1ada930cbe2333e3a5a7c43fe786443bef82bea2

Describe the issue.

Hello! I'm facing an issue. I think it could be related to MAM.

07:08:41.444 [error] gen_server 'wpool_pool-mongoose_wpool$rdbms$global$default-1' terminated with reason: {{{integer_overflow,int8,957926528636579656073},[{epgsql_codec_integer,overflow,2,[{file,"/tmp/mongooseim/_build/default/lib/epgsql/src/datatypes/epgsql_codec_integer.erl"},{line,44}]},{epgsql_codec_integer,encode,3,[{file,"/tmp/mongooseim/_build/default/lib/epgsql/src/datatypes/epgsql_codec_integer.erl"},{line,58}]},{epgsql_binary,encode_value,2,[{file,"/tmp/mongooseim/_build/default/lib/epgsql/src/epgsql_binary.erl"},{line,271}]},{epgsql_wire,encode_parameter,3,[{file,"/tmp/mongooseim/_b..."},...]},...]},...} in gen_server:call/3 line 385

Log removed by @arcusfelis here

arcusfelis commented 3 months ago

Interesting, it looks like the MAM RSM query has trouble being encoded: {rsm_in,50,aft,957926528636579656073,undefined}. That int is a bigint MAM message ID. But pgsql reported that it is an integer in the 0..255 range.

Please tell me the pgsql version (the exact version, ideally the one from Docker Hub), so we can run our tests with it. Also, do you use the default SQL schema?

Also, it could be that our code incorrectly detects the type of the count when making a prepared query.

Could you also provide the XML stanza you send with this MAM select query? It looks like we do not log the IQ stanza when it causes a MAM backend error.

when=2024-02-08T07:08:41.447449+00:00 level=error what=hook_failed reason="{{{{integer_overflow,int8,957926528636579656073},[{epgsql_codec_integer,overflow,2,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/datatypes/epgsql_codec_integer.erl\"},{line,44}]},{epgsql_codec_integer,encode,3,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/datatypes/epgsql_codec_integer.erl\"},{line,58}]},{epgsql_binary,encode_value,2,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/epgsql_binary.erl\"},{line,271}]},{epgsql_wire,encode_parameter,3,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/epgsql_wire.erl\"},{line,290}]},{epgsql_wire,encode_parameters,5,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/epgsql_wire.erl\"},{line,268}]},{epgsql_cmd_prepared_query,execute,2,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/commands/epgsql_cmd_prepared_query.erl\"},{line,38}]},{epgsql_sock,command_exec,4,[{file,\"/tmp/mongooseim/_build/default/lib/epgsql/src/epgsql_sock.erl\"},{line,383}]},{gen_server,try_handle_call,4,[{file,\"gen_server.erl\"},{line,1149}]}]},{gen_server,call,[<0.1838.0>,{command,epgsql_cmd_prepared_query,{{statement,[<<\"mam_message_count_u_leiequeqb_all\">>],[{column,<<\"count\">>,int8,20,8,-1,1,0,0}],[int8,int4,text],[{20,int8,false},{23,int4,false},{25,text,false}]},[{int8,957926528636579656073},{int4,922},{text,<<\"ab21-4238\">>}]}},infinity]}},{gen_server,call,['wpool_pool-mongoose_wpool$rdbms$global$default-1',{sql_cmd,{sql_execute,mam_message_count_u_leiequeqb_all,[957926528636579656073,922,<<\"ab21-4238\">>]},-576460040244},60000]}}" pid=<0.1988.0> at=gen_hook:error_running_hook/5:254 stacktrace="gen_server:call/3:385 mongoose_rdbms:execute_successfully/3:261 mam_lookup:calc_count/2:181 mam_lookup:lookup_messages_regular/4:132 mod_mam_rdbms_arch:lookup_messages/3:406 gen_hook:apply_hook_function/3:251 gen_hook:run_hook/4:237 mongoose_hooks:run_fold/4:1440" text="Error running hook" 
params_with_jid="{jid,<<\"ab21-4238\">>,<<\"msg.sashkas.com\">>,<<>>}" params_start_ts=undefined params_search_text=undefined params_rsm={rsm_in,50,aft,957926528636579656073,undefined} params_page_size=50 params_owner_jid="{jid,<<\"93f8-2852\">>,<<\"msg.sashkas.com\">>,<<>>}" params_ordering_direction=forward params_now=1707376121435724 params_max_result_limit=50 params_limit_passed=true params_is_simple=false params_end_ts=undefined params_caller_jid="{jid,<<\"93f8-2852\">>,<<\"msg.sashkas.com\">>,<<\"eh_iYT7T\">>}" params_borders=undefined params_archive_id=922 key="{mam_lookup_messages,<<\"msg.sashkas.com\">>}" handler="{hook_handler,50,fun mod_mam_rdbms_arch:lookup_messages/3,#{hook_name => mam_lookup_messages,hook_tag => <<\"msg.sashkas.com\">>,host_type => <<\"msg.sashkas.com\">>}}" when=2024-02-08T07:08:41.447425+00:00 level=error what=hook_failed reason="{{{{integer_overflow,int8,957926528636579656073},[{epgsql_codec_integer,overflow,2,[{file,\"/tmp/mongoo

arcusfelis commented 3 months ago

Also, change your DB password, it is inside the log. You should sanitize logs before posting them.

I removed the log link from your issue, but the file is still uploaded to GitHub.

arcusfelis commented 3 months ago

SQL query, works fine for me:

 {mam_message_count_u_leiequeqb_all,<<"mam_message">>,
                                    [<<"id">>,<<"user_id">>,<<"remote_bare_jid">>],
                                    <<"SELECT  COUNT(*) FROM mam_message  WHERE id <= ? AND user_id = ? AND remote_bare_jid = ? ">>},

Now waiting for your PgSQL version.

XML IQ:

<iq type='set' id='84fda6f59dea6a2d022721dad3e46cf0'>
    <query xmlns='urn:xmpp:mam:1'>
        <x xmlns='jabber:x:data' type='submit'>
            <field var='with'>
                <value>kate_querying_for_all_messages_with_jid_after_17@localhost</value>
            </field>
        </x>
        <set>
            <max>50</max>
            <after>C487F52688G1</after>
        </set>
    </query>
</iq>

Ah, wait:

 mod_mam_utils:mess_id_to_external_binary(957926528636579656073).
<<"PURPLEFE965CC9">>

OK, I see. You are trying to provide something too big to be represented as an 8-byte bigint after decoding from base32.
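To make the size problem concrete, here is a small Python sketch of the arithmetic only (it does not call any MongooseIM or epgsql code). It assumes that PostgreSQL's int8 is a signed 64-bit integer and takes the ID value straight from the error log:

```python
# Bounds of a signed 64-bit integer (PostgreSQL int8 / bigint).
INT8_MIN = -(2**63)
INT8_MAX = 2**63 - 1

# Value copied from the integer_overflow error in the log above.
mess_id = 957926528636579656073

# How many bits the value needs, and whether it fits into int8.
bits_needed = mess_id.bit_length()
fits_int8 = INT8_MIN <= mess_id <= INT8_MAX

print(bits_needed)  # 70
print(fits_int8)    # False
```

The value needs 70 bits, while int8 has only 63 bits for the magnitude, which is consistent with the encoder raising integer_overflow before the query reaches the database.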

Check your code, you are probably supplying invalid MAM IDs as input... We could add format validation to MongooseIM here, though.
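As an illustration of what such format validation could look like, here is a hedged Python sketch. It is not MongooseIM's implementation: the function name parse_external_mam_id is hypothetical, and it assumes the external ID is a non-negative integer written in base 32 (digits 0-9 and A-V) that must fit into a signed 64-bit column.

```python
import re

# Largest value a signed 64-bit column (PostgreSQL int8) can hold.
INT8_MAX = 2**63 - 1

# Assumption: external IDs use base-32 digits 0-9 and A-V only.
_BASE32_ID = re.compile(r"[0-9A-Va-v]+")


def parse_external_mam_id(text):
    """Return the integer ID, or None if text is malformed or too large.

    Hypothetical helper for illustration; not a MongooseIM API.
    """
    if not _BASE32_ID.fullmatch(text):
        return None  # empty string or characters outside base 32
    value = int(text, 32)
    if value > INT8_MAX:
        return None  # would overflow int8 when bound to the SQL query
    return value


# The <after> value from the sample IQ is accepted,
# the value behind the crash is rejected.
print(parse_external_mam_id("C487F52688G1") is not None)
print(parse_external_mam_id("PURPLEFE965CC9"))
```

Rejecting the ID before it is bound to the prepared statement would let the server answer with a proper error instead of crashing the worker.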

arcusfelis commented 3 months ago

It just returns an empty result set and crashes on the server.

Apparently, that PR should have been broader, to also cover the SQL query failing: https://github.com/esl/MongooseIM/pull/4191/files

Two tasks for MIM would be added into a backlog:

Also: