sony / nmos-cpp

An NMOS (Networked Media Open Specifications) Registry and Node in C++ (IS-04, IS-05)
Apache License 2.0

Quick restart of the NMOS Registry - can produce "asio listen error: system:98 (Address already in use)" #348

Open macphersonjamie opened 9 months ago

macphersonjamie commented 9 months ago

Hi

I've noticed an issue when stopping the registry and then restarting it quickly afterwards.

I've seen a case where the port used for web subscriptions is not served up.

I have seen the following trace in the log at startup, which I believe is related to the problem and shows the registry failing to bind to the port:

2023-10-05 10:10:38.381: info: 139626509403904: asio listen error: system:98 (Address already in use)

The problem can be avoided by just allowing enough time between the stopping and the restarting of the registry.

However, I think this is probably a valid case: a user would want the registry to restart quickly if it had failed for some reason.

I wondered whether the SO_REUSEADDR option could be used on the socket in this case?
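
For readers unfamiliar with the option, here is a minimal sketch of what setting SO_REUSEADDR on a listening socket looks like. It uses Python's standard socket module for brevity and is not the registry's own code (the registry listens through cpprestsdk's asio listener); the function name `make_listener` and the backlog value are hypothetical. On Linux, where error 98 is EADDRINUSE, the option has to be set before bind() and allows the bind to succeed while old connections on the same port are still in TIME_WAIT.

```python
import socket


def make_listener(host: str, port: int, reuse_addr: bool = True) -> socket.socket:
    """Create a listening TCP socket, optionally setting SO_REUSEADDR before bind().

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse_addr:
            # Must be set before bind(); on Linux this lets bind() succeed
            # even if earlier connections on this port are in TIME_WAIT.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(16)  # backlog chosen arbitrarily for the sketch
    except OSError:
        # Do not leak the descriptor if bind() or listen() fails.
        sock.close()
        raise
    return sock
```

With `reuse_addr=False`, a restart shortly after closing a socket that had active connections can raise `OSError` with errno EADDRINUSE, which matches the symptom reported above.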

thanks

Jamie

Starting with output to console
2023-10-05 10:10:38.247: info: 139626824233600: Starting nmos-cpp registry
2023-10-05 10:10:38.252: info: 139626824233600: Process ID: 196
2023-10-05 10:10:38.252: info: 139626824233600: Build settings: cpprestsdk/2.10.18 (listener=asio; client=asio); WebSocket++/0.8.2; Boost 1.80.0; OpenSSL 1.1.1s 1 Nov 2022
2023-10-05 10:10:38.252: info: 139626824233600: Initial settings: {"admin_port":41000,"host_address":"10.1.215.10","host_addresses":["10.1.215.10","192.168.100.1","172.17.0.1","192.168.104.1"],"http_port":41000,"http_trace":false,"label":"homedev-registry","logging_level":-20,"logging_port":41000,"mdns_port":41000,"node_port":41000,"pri":99,"query_port":41000,"query_ws_port":41001,"registration_expiry_interval":12,"registration_port":41000,"schemas_port":41000,"seed_id":"20b4e60d-0c46-4ffc-95db-5267f4df814a","settings_port":41000,"system_port":41000}
2023-10-05 10:10:38.252: info: 139626824233600: Configuring nmos-cpp registry with its primary Node API at: 10.1.215.10:41000
2023-10-05 10:10:38.252: info: 139626824233600: Configuring nmos-cpp registry with its primary Registration API at: 10.1.215.10:41000
2023-10-05 10:10:38.252: info: 139626824233600: Configuring nmos-cpp registry with its primary Query API at: 10.1.215.10:41000
2023-10-05 10:10:38.378: info: 139626824233600: Preparing for connections
2023-10-05 10:10:38.381: info: 139626509403904: asio listen error: system:98 (Address already in use)
2023-10-05 10:10:38.381: info: 139626509403904: listening with IPv6 failed; retrying with IPv4 only
2023-10-05 10:10:38.382: info: 139626509403904: asio listen error: system:98 (Address already in use)
2023-10-05 10:10:38.382: info: 139626824233600: Ready for connections
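
Note that the log still ends with "Ready for connections" even though a listen call failed, so the final line alone does not show that every port is being served. One way to check after a restart is to try connecting to each configured port. The following is a minimal sketch under that assumption; the helper names `is_accepting` and `unserved_ports` are hypothetical and are not part of nmos-cpp.

```python
import socket
from typing import Iterable, List


def is_accepting(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


def unserved_ports(host: str, ports: Iterable[int], timeout: float = 1.0) -> List[int]:
    """Return the subset of ports that are not accepting connections."""
    return [p for p in ports if not is_accepting(host, p, timeout)]
```

For the settings shown in the log above, a call such as `unserved_ports("10.1.215.10", [41000, 41001])` would list whichever of the HTTP and query WebSocket ports is not being served.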

garethsb commented 9 months ago

Hi Jamie,

Hmm, I thought we'd fixed that, see https://github.com/sony/nmos-cpp/issues/320

macphersonjamie commented 9 months ago

Hi Gareth

That's good to know that it's been addressed.

Hmm, we did see this when running an easy-nmos container in host mode, so I wonder whether we had the latest version of the easy-nmos container. I thought we did, but I'll double-check. Or maybe the easy-nmos container doesn't have this fix yet.

rhastie commented 9 months ago

easy-nmos uses the latest rev of the nmos-cpp container from my repo. The latest rev was built on 8/3/23 so it should have this fix included.

garethsb commented 9 months ago

I don't think so, Rich.

The build-nmos-cpp master branch is using a commit from Dec 2022 corresponding to the most recent Conan release. The dev branch is more recent, but 17f1b8b is still older than the commit that fixed this.

https://github.com/rhastie/build-nmos-cpp/blob/1508a8c5610d3bd9bf15f3828474560c7dd40321/Dockerfile#L30

rhastie commented 9 months ago

Fair 👍 ... @macphersonjamie Could you try changing the image tag in the easy-nmos docker-compose.yml? There are two instances of the following line, one for the Registry and one for the Node. Please change both.

image: rhastie/nmos-cpp:latest to --> image: rhastie/nmos-cpp:dev-17f1b8b

Please retest to see whether this fixes the issue. Thanks

garethsb commented 9 months ago

My point is dev-17f1b8b is also not going to have the relevant fix. Let's sync on making a new container build.

rhastie commented 9 months ago

Ok... Sorry for my lack of understanding here.

@macphersonjamie I've manually built a new container using the tip of the sony/nmos-cpp master repo. Please try the following:

image: rhastie/nmos-cpp:testbuild

This is based on the "jammy" Ubuntu base but is working for me, and you shouldn't see any issues... Please let us know how you get on. If you confirm it's functioning OK and fixes the bug, I'll look to run a full update next week.
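
Since the compose file has one image line for the Registry and one for the Node, it is easy to change only one of them by hand. A minimal sketch of switching every matching line in one pass is shown here; the function name `set_image_tag` is hypothetical, and it assumes each image line has the plain form `image: repository:tag` on its own line.

```python
import re
from typing import Tuple


def set_image_tag(compose_text: str, repository: str, new_tag: str) -> Tuple[str, int]:
    """Replace the tag on every 'image: <repository>:<tag>' line.

    Returns the updated text and the number of lines changed.
    """
    pattern = re.compile(
        r"^([ \t]*image:[ \t]*)" + re.escape(repository) + r":\S+[ \t]*$",
        re.MULTILINE,
    )
    # A function replacement avoids any backslash handling in the new tag.
    return pattern.subn(lambda m: f"{m.group(1)}{repository}:{new_tag}", compose_text)
```

Calling `set_image_tag(text, "rhastie/nmos-cpp", "testbuild")` on the contents of docker-compose.yml should report a count of 2 if both the Registry and Node entries were found.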

macphersonjamie commented 9 months ago

@rhastie - Hi, I just gave the new container a try. Many thanks, it fixes the issue 👍

rhastie commented 9 months ago

Let me try and finish my "jammy" testing work and then hopefully I can merge to master later next week. "latest" will be updated at that point.

In the interim, I have pushed the work to my "dev" branch. The GitHub Action has completed and it's passing the tests. The GitHub Action on dev automatically produces a dev tag which is x86 only. Feel free to use either the tags "testbuild" (multiarch) or "dev-b5c8d2e" (x86 only). Please note, I will likely remove these tags once I've merged to master.

garethsb commented 9 months ago

Also note that sony/nmos-cpp uses a feature-branching strategy and does not have formal releases on master, therefore tip of master is not always a great release candidate. I've previously made Conan Center Index releases from particular commits, but haven't done one recently... I expect to roll up a new CCI release in early November.

JamieLuo commented 8 months ago

Hi,

I also hit this problem in my container: restarting the container produces the same log

2023-10-05 10:10:38.381: info: 139626509403904: asio listen error: system:98 (Address already in use).

My container was using the master commit 841212e29319352d95b886f09849268b6aac31b7, which already includes the fix for https://github.com/sony/nmos-cpp/issues/320. Do you know what's going on?

garethsb commented 8 months ago

Hmm, no, sorry. The only other reason I've noticed for "already in use" is that the port is really in use by a different service!
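
To tell that case apart from leftover TIME_WAIT connections, one option is to attempt a bind with SO_REUSEADDR set. On Linux such a bind still fails with EADDRINUSE when another socket is actively listening on the port. The sketch below relies on that Linux behaviour; the function name `port_in_use` is hypothetical.

```python
import errno
import socket


def port_in_use(host: str, port: int) -> bool:
    """Return True if another socket is already listening on (host, port).

    Assumes Linux semantics: with SO_REUSEADDR set on the probe, TIME_WAIT
    leftovers do not block bind(), but a live listener still does.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # Other errors (e.g. permission denied) are not "in use".
    return False
```

If this returns True just before the registry starts, a command such as `ss -ltnp` on the host can show which process owns the listening socket.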

macphersonjamie commented 6 months ago

Hi @rhastie

I just wanted to follow up and check which easy-nmos release is the latest and best to use that includes the above fix.

I can see from https://hub.docker.com/r/rhastie/nmos-cpp/tags

that dev-85d5443 appears to be the latest, from a few days ago?

I was looking for a master tag, but it looks like the last master is quite old now (master-0fb6b51?)

best regards

Jamie

rhastie commented 6 months ago

Please use tag "dev-85d5443". This should support what you need and is the latest build from the "dev" branch of build-nmos-cpp.

For build-nmos-cpp dev branch the GitHub Actions CI/CD creates a new dev image each time I make a commit. Principally, the dev branch images remain on Docker Hub until we merge to the master branch. At that point there will be a new master and latest image created. We will delete the historical dev image tags once that's completed.

Currently, I'm finalizing the move to Jammy in the container and also testing IS-10 support etc. This is why the "latest" tag has not been updated for some time.