Closed: regalialong closed this 6 months ago
Try adding contact. I also run this: iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 5280 && ejabberdctl request-certificate all && ejabberdctl list-certificates && iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 5280
acme:
  ## Staging environment
  contact: mailto:post@vivaldi.net
  # ca_url: https://acme-staging-v02.api.letsencrypt.org/directory
  # auto: false
  ## Production environment (the default):
  ca_url: https://acme-v02.api.letsencrypt.org/directory
  auto: false
  # ca_url: https://acme-v02.api.letsencrypt.org
I added contact but this doesn't fix it. You don't have this issue: assuming the configuration file you linked is what you are using, you are using Let's Encrypt's endpoints, which are reachable over IPv4. My internal ACME server does not have an IPv4 endpoint, so the connection fails. A missing contact would not cause nxdomain.
I hate doing this, but bump: I think the fix is practically just setting httpc:set_options([{ipfamily, inet6}]). This is really showstopping ejabberd for us.
httpc:set_options([{ipfamily, inet6}]).
But if this is used when the server is IPv4-only, the query will fail, right?
Looking at the documentation, I found that the ipfamily option supports inet6fb4 since at least Erlang 26:
With inet6fb4 option, IPv6 will be preferred but if connection fails, an IPv4 fallback connection attempt will be made.
In Erlang/OTP 26 and 27.0-rc2 it is documented:
IpFamily = inet | inet6 | local | inet6fb4
Using Erlang 27.0-rc2:
$ erl -s inets -s ssl
Erlang/OTP 27 [erts-14.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [jit:ns]
Eshell V14.3 (press Ctrl+G to abort, type help(). for help)
1> httpc:set_options([{ipfamily, inet6fb4}]).
ok
2> httpc:request("https://ipv6.google.com").
{error,{failed_connect,[{to_address,{"ipv6.google.com",443}},
{inet6,[inet6],enetunreach},
{inet,[inet],nxdomain}]}}
In Erlang/OTP 25.0 Release Candidate 3 that option was not yet documented:
IpFamily = inet | inet6 | local (unix socket)
However, looking at the Erlang source code, I suspect that this option has been supported for much longer (at least since Erlang 20.3), even though it was not documented:
Erlang/OTP 20 [erts-9.3.3.15] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V9.3.3.15 (abort with ^G)
1> ssl:start().
ok
2> inets:start().
ok
3> httpc:set_options([{ipfamily, inet6fb4}]).
ok
4> httpc:request("https://ipv6.google.com").
{error,{failed_connect,[{to_address,{"ipv6.google.com",443}},
{inet6,[inet6],enetunreach},
{inet,[inet],nxdomain}]}}
Assuming that inet6fb4 is available and works correctly since at least Erlang 20 (the oldest version supported by ejabberd and p1_acme), a patch like this could be safely added to p1_acme:
diff --git a/src/p1_acme.erl b/src/p1_acme.erl
index b80f539..8e53502 100644
--- a/src/p1_acme.erl
+++ b/src/p1_acme.erl
@@ -598,6 +598,7 @@ http_request(State, ReqFun, RetryTimeout) ->
Timeout ->
{Method, URL} = Request = ReqFun(State),
?DEBUG("HTTP request: ~p", [Request]),
+ httpc:set_options([{ipfamily, inet6fb4}]),
case httpc:request(Method, URL,
[{timeout, infinity},
{connect_timeout, infinity}],
A better way to test that inet6fb4 does not break IPv4-only uses is to try connecting to an IPv4-only host:
> httpc:set_options([{ipfamily, inet6}]).
ok
> httpc:request("https://ipv4.google.com").
{error,{failed_connect,[{to_address,{"ipv4.google.com",443}},
{inet6,[inet6],nxdomain}]}}
> httpc:set_options([{ipfamily, inet6fb4}]).
ok
> httpc:request("https://ipv4.google.com").
{ok,{{"HTTP/1.1",200,"OK"},
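The manual comparison above can be scripted. The following is a minimal sketch, not code from ejabberd or p1_acme: the module name ipfamily_probe and the per-family profile names are hypothetical, and each ipfamily value gets its own httpc profile so that a kept-alive connection from one attempt cannot be reused by the next.

```erlang
-module(ipfamily_probe).
-export([probe/1]).

%% Hypothetical helper: issue one GET per ipfamily value and collect
%% either the HTTP status code or the connection error for each.
probe(Url) ->
    {ok, _} = application:ensure_all_started(inets),
    {ok, _} = application:ensure_all_started(ssl),
    [{Family, try_family(Family, Url)} || Family <- [inet, inet6, inet6fb4]].

try_family(Family, Url) ->
    %% One throwaway profile per family; the default profile is untouched.
    Profile = list_to_atom("ipfamily_probe_" ++ atom_to_list(Family)),
    case inets:start(httpc, [{profile, Profile}]) of
        {ok, _Pid} -> ok;
        {error, {already_started, _}} -> ok
    end,
    ok = httpc:set_options([{ipfamily, Family}], Profile),
    HttpOpts = [{timeout, 10000}, {connect_timeout, 5000}],
    case httpc:request(get, {Url, []}, HttpOpts, [], Profile) of
        {ok, {{_Version, Status, _Reason}, _Headers, _Body}} -> {ok, Status};
        {error, Reason} -> {error, Reason}
    end.
```

Calling ipfamily_probe:probe("https://ipv4.google.com") and ipfamily_probe:probe("https://ipv6.google.com") from the shell returns one {Family, Result} pair per setting, which makes it easy to see which families can reach a given host from the current network.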
a patch like this could be safely added to p1_acme
I would suggest this instead (untested), which sets the options only once and doesn't interfere with the global/default httpc profile:
diff --git a/src/p1_acme.erl b/src/p1_acme.erl
index b80f539..a648bea 100644
--- a/src/p1_acme.erl
+++ b/src/p1_acme.erl
@@ -113,6 +113,9 @@
%%% API
%%%===================================================================
start() ->
+ application:start(inets),
+ inets:start(httpc, [{profile, p1_acme}]),
+ httpc:set_options([{ipfamily, inet6fb4}], p1_acme),
case application:ensure_all_started(?MODULE) of
{ok, _} -> ok;
Err -> Err
@@ -602,7 +605,7 @@ http_request(State, ReqFun, RetryTimeout) ->
[{timeout, infinity},
{connect_timeout, infinity}],
[{body_format, binary},
- {sync, false}]) of
+ {sync, false}], p1_acme) of
{ok, Ref} ->
ReqTimeout = min(timer:seconds(10), Timeout),
receive
@@ -612,7 +615,7 @@ http_request(State, ReqFun, RetryTimeout) ->
ReqFun, Response, State, RetryTimeout)
after ReqTimeout ->
?DEBUG("HTTP request timeout", []),
- httpc:cancel_request(Ref),
+ httpc:cancel_request(Ref, p1_acme),
http_request(State, ReqFun, RetryTimeout)
end;
{error, WTF} ->
Thanks! I've tested your patch and it seems to work, so I've applied it to p1_acme and updated ejabberd to use it.
It would be great if somebody else can test the feature and report the results, @regalialong ;)
There are binary installers and the container image.
I pulled the deb package from the actions run you have linked and ejabberd_acme appears to crash for me:
Everything beyond what I sent dumps what looks like the entire certificate store. The system does trust our CA certificate (curl and others work), and the same happens whether I specify ca_file or comment it out.
I also ran certbot on the same machine to verify that ACME works, and it renews successfully as well.
Aha! I can reproduce the problem, ejabberd crashes on start.
The problem is that the function p1_acme:start is not called, so the httpc_p1_acme process is not started, and that one is required since the recent patch to perform the HTTP queries, as it contains the specific HTTP options.
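One way to avoid depending on a particular start function being called is to make the profile setup idempotent and run it before the first request. The sketch below is an illustration only, not the additional patch that was applied to p1_acme; the module and function names are hypothetical, while the profile name p1_acme and the ipfamily value come from the patch discussed above.

```erlang
-module(acme_profile_sketch).
-export([ensure_profile/0]).

-define(PROFILE, p1_acme).

%% Hypothetical helper: make sure the dedicated httpc profile process
%% exists and carries the ipfamily option. Safe to call repeatedly.
ensure_profile() ->
    {ok, _} = application:ensure_all_started(inets),
    case inets:start(httpc, [{profile, ?PROFILE}]) of
        {ok, _Pid} -> ok;
        {error, {already_started, _Pid}} -> ok
    end,
    ok = httpc:set_options([{ipfamily, inet6fb4}], ?PROFILE),
    %% Read the option back so the caller can check what is in effect.
    httpc:get_options([ipfamily], ?PROFILE).
```

Because the option is set on the named profile only, requests made through the default httpc profile elsewhere in the node keep their own settings.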
I've applied an additional patch to p1_acme, and built installers in my ejabberd fork: https://github.com/badlop/ejabberd/actions/runs/8802053312
Now ejabberd starts for me. Can you try that? Once you confirm it works perfectly, I'll merge into ejabberd upstream.
Apr 23 15:35:07 xmpp ejabberdctl[65391]: 2024-04-23 15:35:07.370501+00:00 [info] Certificate for xmpp.demik, pubsub.xmpp.demik and 3 more hosts has been received, stored and loaded successfully
Yep, that works fully now. Thank you very much for the help :)
Ok, ejabberd already points to the latest p1_acme, the next version will include it. Thanks all for the report, patches and testing!
Environment
Configuration (only if needed): grep -Ev '^$|^\s*#' ejabberd.yml
Errors from error.log/crash.log
Bug description
I have an IPv6-only VPN with an internal TLD and I wanted to try out ejabberd. I have an ACME server at https://pki.demik from which I want to get the certificates for the XMPP server, but ejabberd fails to connect to it with "non-existing domain", which doesn't make sense because the rest of the system resolves and connects fine.
Checking logs on debug loglevel shows the connection to the ACME server fails:
Trying to diagnose this in the debug shell, inet_res:getbyname("pki.demik", aaaa). fails, which is fixed by adding {inet6, true}. to inetrc. ACME still fails with non-existing domain though.
I think this is happening because httpc is at the default of inet for ipfamily instead of inet6; connections to ipv6.google.com similarly fail with nxdomain.
Setting inet6 makes the same request work:
And also makes the connection to the ACME server work with ejabberdctl request_certificate all:
Is this an oversight or is this just an unsupported use case?
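For reference, the inetrc change mentioned in the description is a one-line entry in a file of Erlang terms. This is a minimal sketch; the file name erl_inetrc is an assumption, and where ejabberd reads its inetrc from depends on the installation.

```erlang
%% erl_inetrc (hypothetical file name): Erlang inet configuration terms.
%% Ask the resolver to look up IPv6 addresses for host names.
{inet6, true}.
```

A plain Erlang shell can be pointed at such a file with erl -kernel inetrc '"./erl_inetrc"'. As the description notes, this affects name resolution only; httpc still needs its own ipfamily option.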