Closed fluca1978 closed 3 months ago
We should def error out if the configuration is the same.
However, it should be possible to run multiple instances on the same server, like one for the primary instance and another for the standby.
> We should def error out if the configuration is the same.
The only "quick" way to find out whether the configuration is (almost) the same is a `bind` failure, or the management socket (and possibly the metrics one) already being in use. If any of these is already in use, we should abort.
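A minimal sketch of the "abort if the address is already in use" check, assuming a plain IPv4 TCP listener. `try_listen` and `bound_port` are hypothetical helpers written for illustration, not pgagroal functions; the real `pgagroal_bind` may be structured quite differently.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not pgagroal code: bind and listen on ip:port.
 * Returns 0 and stores the descriptor in *fd_out, or returns the errno
 * value of the failing call (EADDRINUSE when the address is taken). */
int
try_listen(const char* ip, unsigned short port, int* fd_out)
{
   struct sockaddr_in addr;
   int fd;
   int err;

   assert(ip != NULL && fd_out != NULL);

   memset(&addr, 0, sizeof(addr));
   addr.sin_family = AF_INET;
   addr.sin_port = htons(port);
   if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
   {
      return EINVAL;
   }

   fd = socket(AF_INET, SOCK_STREAM, 0);
   if (fd < 0)
   {
      return errno;
   }

   if (bind(fd, (struct sockaddr*)&addr, sizeof(addr)) < 0 || listen(fd, 16) < 0)
   {
      err = errno;
      close(fd);
      return err;
   }

   *fd_out = fd;
   return 0;
}

/* Port actually bound to fd (useful when port 0 was requested), 0 on error. */
unsigned short
bound_port(int fd)
{
   struct sockaddr_in addr;
   socklen_t len = sizeof(addr);

   if (getsockname(fd, (struct sockaddr*)&addr, &len) < 0)
   {
      return 0;
   }
   return ntohs(addr.sin_port);
}
```

A caller would treat any non-zero return as fatal: log the error and exit instead of continuing with a partially bound instance. The same pattern would apply to the management and metrics sockets.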
> However, it should be possible to run multiple instances on the same server, like one for the primary instance and another for the standby.
Good point, but while a "misrun" by the user with the same configuration for multiple instances is immediate to find out when `pid_file` is absolute, it becomes harder to detect when the PID file is relative (until we fix the socket problems above).
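To illustrate why a relative PID file hides the collision, here is a small sketch of how a relative path resolves against the working directory. `resolve_pid_path` is a hypothetical helper and the directory names used below are made up for the example; it only mimics what the OS does when a relative path is opened.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: resolve `path` against `cwd` the way the OS does
 * for a relative path. Returns 0 on success, -1 if `path` is empty or
 * `out` is too small. */
int
resolve_pid_path(const char* cwd, const char* path, char* out, size_t size)
{
   int n;

   assert(cwd != NULL && path != NULL && out != NULL);

   if (strlen(path) == 0)
   {
      return -1;
   }

   if (path[0] == '/')
   {
      n = snprintf(out, size, "%s", path);          /* already absolute */
   }
   else
   {
      n = snprintf(out, size, "%s/%s", cwd, path);  /* depends on cwd */
   }

   return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With `relative_file.pid`, an instance started in (say) `/srv/primary` and another started in `/srv/standby` end up with two different PID files, so neither sees the other. With an absolute path both resolve to the same file and the clash is visible.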
@fluca1978 I couldn't reproduce this, but I may have misunderstood the bug.
What I am trying to do is set `unix_socket_dir` to a relative path in config and then run two pgagroal instances from different directories. What I get is a bind error.
If you could give me more details I could work something out.
> @fluca1978 I couldn't reproduce this, but I may have misunderstood the bug.
> What I am trying to do is set `unix_socket_dir` to a relative path in config and then run two pgagroal instances from different directories. What I get is a bind error.
> If you could give me more details I could work something out.
When I launch the second instance, I get a bind error too, but the instance continues to run. Is your second instance aborting? That could be due to the presence or absence of other network cards?
> When I launch the second instance, I get a bind error too, but the instance continues to run. Is your second instance aborting?
Yes, mine aborts right after returning from the `pgagroal_bind` function.
$ ./pgagroal -c pgagroal.conf
2024-04-03 10:54:45 DEBUG configuration.c:2656 PID file automatically set to: [./pgagroal.2345.pid]
2024-04-03 10:54:45 DEBUG network.c:648 server: bind: localhost:2345 (Address already in use)
2024-04-03 10:54:45 FATAL main.c:924 pgagroal: Could not bind to localhost:2345
> That could be due to the presence or absence of other network cards?
I have looked into this, and it may come down to how the OS handles `SO_REUSEADDR`, but I need to research more. Do you have details on the address-port pairs that were bound by each process?
Apparently the problem is with the `host` configuration: if set to `localhost`, the second instance aborts as expected:
% pgagroal
-> DEBUG network.c:648 server: bind: localhost:54322 (Address already in use)
-> DEBUG network.c:648 server: bind: localhost:54322 (Address already in use)
-> FATAL main.c:924 pgagroal: Could not bind to localhost:54322
but when set to `*`, the second instance runs.
On 8a1d6416f02033a7a307439983ee6982513f5bc5, having `pid_file = relative_file.pid` makes `pgagroal` able to run multiple times from different directories.
Example: running the first instance:
Running the second instance from a different directory:
I think we should either force an absolute `pid_file`, therefore aborting execution if `pid_file` is not absolute, or abort the execution if the `bind` fails. In any case, the fact that a `bind` failure allows for continuation is suspicious.
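The first option could look roughly like the sketch below, which rejects a relative path and creates the PID file exclusively so that a second instance fails fast. `create_pid_file` is a hypothetical helper written for this thread, not the existing pgagroal code, and the error codes it returns are an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not pgagroal code. Returns 0 on success,
 * EINVAL for a relative path, EEXIST if the file already exists,
 * or another errno value on I/O failure. */
int
create_pid_file(const char* path)
{
   char buf[32];
   ssize_t written;
   int fd;
   int len;
   int err;

   assert(path != NULL);

   if (path[0] != '/')
   {
      return EINVAL;   /* reject relative paths outright */
   }

   /* O_EXCL makes creation fail if the file is already there */
   fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
   if (fd < 0)
   {
      return errno;
   }

   len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
   written = write(fd, buf, (size_t)len);
   if (written != (ssize_t)len)
   {
      err = (written < 0) ? errno : EIO;
      close(fd);
      unlink(path);    /* do not leave a half-written PID file behind */
      return err;
   }

   close(fd);
   return 0;
}
```

One caveat of exclusive creation is that a stale PID file left by a crash also blocks startup, so a real implementation would likely need to check whether the recorded process is still alive before giving up.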