Ejabberd is not able to serve the appropriate cert

lokesh411 commented 1 year ago

Environment

ejabberd version: 22.05
Erlang version: 24
OS: Linux (Debian)
Installed from: official deb/rpm

Configuration:

certfiles:
 - abc.com # given by godaddy
 - xyz.com # given by lets encrypt for *.xyz.com

Bug description

ejabberd does not serve the appropriate certificate when multiple certificates are configured. If I make a request to ejabberd with r0.xyz.com as the SNI, it serves the certificate issued for abc.com.
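
One way to reproduce this from the outside is to send the SNI name explicitly and look at the subject of the certificate that comes back. The following is only a sketch: the function names are made up for illustration, port 5223 is an assumption (the report does not say which listener was tested), and it compares against the subject CN only, not the subjectAltName entries that real clients check.

```shell
# Print the subject line of the certificate that HOST:PORT presents
# when SNI is sent as the server name.
served_subject() {  # served_subject HOST PORT SNI
  openssl s_client -connect "$1:$2" -servername "$3" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject
}

# Read a "subject=..." line on stdin and print only the CN value.
cn_of() {
  sed -n 's/^subject=.*CN *= *\([^,/]*\).*$/\1/p'
}

# Succeed if certificate name CERT_NAME covers HOST.
# A leading "*." is treated as a single-label wildcard.
name_matches() {  # name_matches CERT_NAME HOST
  case $1 in
    \*.*)
      nm_suffix=${1#\*.}   # "*.xyz.com"  -> "xyz.com"
      nm_rest=${2#*.}      # "r0.xyz.com" -> "xyz.com"
      [ "$nm_rest" = "$nm_suffix" ]
      ;;
    *)
      [ "$1" = "$2" ]
      ;;
  esac
}

# Example (needs network access; 5223 is an assumed port):
#   cn=$(served_subject r0.xyz.com 5223 r0.xyz.com | cn_of)
#   name_matches "$cn" r0.xyz.com && echo "OK" || echo "wrong cert: $cn"
```

With the configuration above, a correct answer for r0.xyz.com would be the *.xyz.com certificate; getting abc.com back is the behaviour described in this report.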

licaon-kter commented 1 year ago

Does ejabberd.log have any info?

Have you made sure ejabberd can access the cert files?

lokesh411 commented 1 year ago

Yes, ejabberd is the owner of the certificates. Also, I don't find anything weird in the logs.

EOUpSL93Y commented 8 months ago

This may be related to this issue: I have the same problem. Debian (10 to 12), ejabberd (several versions from the Debian repo; the latest, 24.02, is the official deb, so at least all 2x versions are affected), 3 domains (JW is the first domain in hosts). Nothing in the logs. It usually happens after a certificate update (permissions are OK): right after ejabberdctl reload_config, ejabberd SOMETIMES sends the wrong certificate for SOME domain on SOME port. Usually I can fix it with one more reload_config. For example (same certs, nothing changed in the config file, just ran ejabberdctl reload_config ~10 times):

C2S TLS OK:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name

S2S TLS WRONG:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = jabberworld.info
jabber.name: subject=CN = jabberworld.info

First reload_config - same result:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = jabberworld.info
jabber.name: subject=CN = jabberworld.info

Second - now all ok:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null  2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name

3-10 reload_config - all ok.
Ok, let's change something. Added a space to /etc/ejabberd/certs/jabber.name.fullchain.pem in between lines:
```
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
```

Permissions are ok:

ls -l /etc/ejabberd/certs/jabber.name.fullchain.pem
-rw-r----- 1 root ejabberd 5852 Mar 24 20:16 /etc/ejabberd/certs/jabber.name.fullchain.pem

Tried reload_config 7 times - all ok.
* Ok, let's change something else. Added a "#" to config file (to already commented line at the beginning of file - so now it's just a '##').
1st reload_config - all ok (!)
2nd - got same problem again:

rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5223 </dev/null 2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = linuxoid.in
jabber.name: subject=CN = jabber.name
rain@walkbook:~$ for i in jabberworld.info linuxoid.in jabber.name ; do echo ${i}: $(openssl s_client -connect $i:5270 </dev/null 2>&1 | grep ^subject=CN) ; done
jabberworld.info: subject=CN = jabberworld.info
linuxoid.in: subject=CN = jabberworld.info
jabber.name: subject=CN = jabberworld.info

3..7 - all ok.
* Removed that comment.
1st reload_config - got a problem.
2nd - ok
3rd - ok
...
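
The manual checks above could be scripted so that each reload_config is followed by the same comparison on every domain and port. This is a rough sketch under a few assumptions: the helper names are invented for illustration, the domains and ports are the ones from the output above, and a certificate counts as correct only when its subject CN equals the requested domain (true for these three certificates, but not for wildcard or SAN-only ones).

```shell
#!/bin/sh
# Domains and ports taken from the report above; adjust as needed.
DOMAINS="jabberworld.info linuxoid.in jabber.name"
PORTS="5223 5270"

# Read a "subject=..." line on stdin and print only the CN value.
cn_of() {
  sed -n 's/^subject=.*CN *= *\([^,/]*\).*$/\1/p'
}

# Print the CN served by HOST:PORT when HOST is sent as the SNI name.
served_cn() {  # served_cn HOST PORT
  openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject 2>/dev/null | cn_of
}

# Print one result line; return non-zero if the served CN is not the domain.
verdict() {  # verdict DOMAIN PORT SERVED_CN
  if [ "$1" = "$3" ]; then
    echo "OK    $1:$2"
  else
    echo "WRONG $1:$2 served '$3'"
    return 1
  fi
}

# Check every domain/port pair once; return non-zero on any mismatch.
check_all() {
  rc=0
  for d in $DOMAINS; do
    for p in $PORTS; do
      verdict "$d" "$p" "$(served_cn "$d" "$p")" || rc=1
    done
  done
  return $rc
}

# Example (run on the server, needs ejabberdctl and network access):
#   for n in 1 2 3 4 5 6 7 8 9 10; do
#     ejabberdctl reload_config && check_all || echo "mismatch after reload $n"
#   done
```

Logging which reload produced a WRONG line, together with a timestamp, may make it easier to line the failures up with ejabberd.log.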

licaon-kter commented 8 months ago

@EOUpSL93Y are you the admin of linuxoid.in ?

EOUpSL93Y commented 8 months ago

Yes

weiss commented 3 months ago

Next time you run into the issue, could you check whether running the following call in an ejabberdctl debug shell fixes the issue:

fast_tls:clear_cache().

(Including the trailing "." character. Press Ctrl+g and then q to exit the shell.)

As an alternative, you could use a script such as the following (if escript is in the path, and assuming the node name is ejabberd@localhost):

#!/usr/bin/env escript
%%! -sname fix-cert@localhost

-define(NODE, 'ejabberd@localhost').

-spec main([string()]) -> any().
main(_Args) ->
  try ok = erpc:call(?NODE, fast_tls, clear_cache, [])
  catch error:{erpc, Reason} ->
      io:fwrite(standard_error, "Cannot query ejabberd: ~p~n", [Reason]),
      halt(1)
  end.

mremond commented 3 months ago

Maybe having a clear-cache ejabberdctl command would be handy? It could be a hook that modules could register to. It could take a specific parameter to only clear the cache of a single module.

@weiss Is this a good idea?

prefiks commented 3 months ago

We already do that in reload_config (it calls that clear cache function), so there is probably no need to add a separate command just for that. Now the question is why there are sometimes stale results...

weiss commented 3 months ago

The problem being related to fast_tls's caching was just a blind guess of mine. I suggested that test to make sure nothing else is reloaded, just to track things down.

EOUpSL93Y commented 3 months ago

Got this problem again just after upgrading to 24.07, then tried

fast_tls:clear_cache().

and it fixed the problem!
