[Closed] hrzlgnm closed this issue 4 months ago
Example output of the modified query example from #214
---first receive cycle---
At 722.976µs : SearchStarted("_test._tcp.local. on addrs [192.168.122.79, 192.168.100.140, 192.168.42.9, fe80::67ff:49ff:c2b5:d079, fe80::b6ec:dfe:2f30:750f, fe80::d50c:ccac:7d50:ccb2]")
At 39.257001ms : ServiceFound("_test._tcp.local.", "thunk-void-vm._test._tcp.local.")
At 39.332082ms: Resolved a new service: thunk-void-vm._test._tcp.local.
host: thunk-void-vm-4f5eea3e-318c-4c21-a648-13ffeee510e8.local.
port: 4223
Address: 192.168.100.140
At 49.996074ms: Resolved a new service: thunk-void-vm._test._tcp.local.
host: thunk-void-vm-9e6bc890-df28-444a-8515-bf785ecf4bf6.local.
port: 4223
Address: 192.168.122.79
---second receive cycle---
At 7.003371126s : SearchStarted("_test._tcp.local. on addrs [192.168.122.79, 192.168.100.140, 192.168.42.9, fe80::67ff:49ff:c2b5:d079, fe80::b6ec:dfe:2f30:750f, fe80::d50c:ccac:7d50:ccb2]")
At 7.0033837s : ServiceFound("_test._tcp.local.", "thunk-void-vm._test._tcp.local.")
At 7.003387887s: Resolved a new service: thunk-void-vm._test._tcp.local.
host: thunk-void-vm-9e6bc890-df28-444a-8515-bf785ecf4bf6.local.
port: 4223
Address: 192.168.122.79
---
The second resolved service seems to overwrite the first one after being resolved. I've also seen cases where the order was the other way around and whichever came second won.
But I'm also not sure whether the behavior of the publishing side is correct according to the relevant standards (RFC 6762 for mDNS, RFC 6763 for DNS-SD).
I played around with https://gitlab.com/hrzlgnm/m/-/blob/master/zerodings/resolve.py?ref_type=heads and python-zeroconf also seems to update the first entry when it sees the second one:
Service thunk-void-vm._test._tcp.local. added, service info: ServiceInfo(type='_test._tcp.local.', name='thunk-void-vm._test._tcp.local.', addresses=[b'\xc0\xa8zO'], port=4223, weight=0, priority=0, server='thunk-void-vm-9e6bc890-df28-444a-8515-bf785ecf4bf6.local.', properties={}, interface_index=None)
Service thunk-void-vm._test._tcp.local. updated ServiceInfo(type='_test._tcp.local.', name='thunk-void-vm._test._tcp.local.', addresses=[b'\xc0\xa8d\x8c'], port=4223, weight=0, priority=0, server='thunk-void-vm-4f5eea3e-318c-4c21-a648-13ffeee510e8.local.', properties={}, interface_index=None)
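The `addresses` field in the python-zeroconf output above is a list of raw network-order byte strings, one 4-byte chunk per IPv4 address; decoded, they match the addresses from the query example log. A small std-only Rust sketch of the decoding (a hypothetical helper, not part of either library):

```rust
use std::net::Ipv4Addr;

// python-zeroconf prints `addresses` as raw network-order byte strings,
// e.g. b'\xc0\xa8zO'; every 4-byte chunk is one IPv4 address.
fn decode_v4(bytes: &[u8]) -> Option<Ipv4Addr> {
    let arr: [u8; 4] = bytes.try_into().ok()?;
    Some(Ipv4Addr::from(arr))
}

fn main() {
    // b'\xc0\xa8zO'  ('z' = 0x7a, 'O' = 0x4f) -> 192.168.122.79
    assert_eq!(
        decode_v4(&[0xc0, 0xa8, 0x7a, 0x4f]),
        Some(Ipv4Addr::new(192, 168, 122, 79))
    );
    // b'\xc0\xa8d\x8c' ('d' = 0x64) -> 192.168.100.140
    assert_eq!(
        decode_v4(&[0xc0, 0xa8, 0x64, 0x8c]),
        Some(Ipv4Addr::new(192, 168, 100, 140))
    );
    println!("ok");
}
```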
Using avahi-browse -rp _test._tcp, one can also observe the same behavior: one of the records overwrites the other in the cache.
First resolve after fresh restart of avahi-daemon:
----------------------------------------------------------
+;virbr0;IPv4;thunk-void-vm;_test._tcp;local
+;virbr1;IPv4;thunk-void-vm;_test._tcp;local
=;virbr0;IPv4;thunk-void-vm;_test._tcp;local;thunk-void-vm-f1715f24-d718-44ec-871b-abe05104215f.local;192.168.122.79;4223;
=;virbr1;IPv4;thunk-void-vm;_test._tcp;local;thunk-void-vm-f1715f24-d718-44ec-871b-abe05104215f.local;192.168.122.79;4223;
----------------------------------------------------------
From then on, I could only see:
----------------------------------------------------------
+;virbr1;IPv4;thunk-void-vm;_test._tcp;local
+;virbr0;IPv4;thunk-void-vm;_test._tcp;local
=;virbr1;IPv4;thunk-void-vm;_test._tcp;local;thunk-void-vm-52143363-5599-49f3-a6ca-c53b62e0e6e2.local;192.168.100.140;4223;
=;virbr0;IPv4;thunk-void-vm;_test._tcp;local;thunk-void-vm-52143363-5599-49f3-a6ca-c53b62e0e6e2.local;192.168.100.140;4223;
----------------------------------------------------------
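With -p, avahi-browse emits machine-parseable, semicolon-separated records; a `=` (resolved) line carries event, interface, protocol, instance name, service type, domain, hostname, address, port, and TXT data. A std-only Rust sketch (hypothetical helper, not an avahi API) that picks the resolved host/address/port out of such a line:

```rust
// avahi-browse -rp prints one record per line, fields separated by ';'.
// A '=' line (resolved) carries: event, interface, protocol, instance name,
// service type, domain, hostname, address, port, TXT data.
fn parse_resolved(line: &str) -> Option<(String, String, u16)> {
    let fields: Vec<&str> = line.split(';').collect();
    if fields.first() != Some(&"=") || fields.len() < 9 {
        return None; // not a resolved record
    }
    let host = fields[6].to_string();
    let addr = fields[7].to_string();
    let port = fields[8].parse().ok()?;
    Some((host, addr, port))
}

fn main() {
    let line = "=;virbr0;IPv4;thunk-void-vm;_test._tcp;local;\
thunk-void-vm-f1715f24-d718-44ec-871b-abe05104215f.local;192.168.122.79;4223;";
    let (host, addr, port) = parse_resolved(line).unwrap();
    assert!(host.ends_with(".local"));
    assert_eq!(addr, "192.168.122.79");
    assert_eq!(port, 4223);
    // '+' (found, not yet resolved) lines are ignored.
    assert!(parse_resolved("+;virbr0;IPv4;thunk-void-vm;_test._tcp;local").is_none());
    println!("ok");
}
```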
I thought about the "publishing" and "redundancy" approach a bit more and came to the conclusion that the publishing approach is perhaps wrong. It makes no sense to publish the same instance name on both network interfaces without causing collisions or cache overwrites.
When resolving those using
cargo run --example=query _test._tcp
one gets both services resolved when running on the same machine.
Although it looks like both services are resolved, they are actually two events from the same cache. As the ServiceInfo struct only contains / supports one hostname, the 2nd resolve overwrites the first resolve. (In this sense, it is the same as / similar to python-zeroconf or avahi.)
But now comes the catch: if one restarts the browse on the same mdns-daemon instance with the same service_type
_test._tcp
one only gets one of the services resolved.
When searching again, the cache is still "hot" (not expired), so it will immediately resolve from the cache with the 2nd instance's info.
That said, internally we actually store both records in the vec, but only the first entry is used. And when a new record is received, it is inserted at the head of the vec, hence it effectively overwrites the hostname in the resolved ServiceInfo.
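A deliberately simplified, std-only model of the cache behavior described above (hypothetical types, not the actual mdns-sd internals): all received records are kept in a vec, a new record is inserted at the head, and resolution only reads the first entry, so the newest record "wins":

```rust
// Simplified model of the behavior described above (NOT the real mdns-sd
// internals): the cache stores every SRV record in a Vec, inserts newly
// received records at the head, and resolution reads only the first entry.
#[derive(Clone, Debug, PartialEq)]
struct SrvRecord {
    host: String,
    port: u16,
}

struct Cache {
    records: Vec<SrvRecord>,
}

impl Cache {
    fn new() -> Self {
        Cache { records: Vec::new() }
    }

    fn on_record_received(&mut self, rec: SrvRecord) {
        self.records.insert(0, rec); // new record goes to the head of the vec
    }

    fn resolve(&self) -> Option<&SrvRecord> {
        self.records.first() // only the first entry is ever used
    }
}

fn main() {
    let mut cache = Cache::new();
    cache.on_record_received(SrvRecord { host: "host-a.local.".into(), port: 4223 });
    cache.on_record_received(SrvRecord { host: "host-b.local.".into(), port: 4223 });

    // Both records are stored ...
    assert_eq!(cache.records.len(), 2);
    // ... but resolving only ever yields the most recently received one,
    // which is why the 2nd resolve appears to overwrite the first.
    assert_eq!(cache.resolve().unwrap().host, "host-b.local.");
    println!("ok");
}
```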
It makes no sense to publish the same instance name on both network interfaces without causing collisions or cache overwrites.
I tend to agree. From what I saw, one instance always maps to one host (not multiple hosts). Not sure if there are exceptions. (And when considering load balancing, I think most of the time multiple IP addrs are used for one host / instance.)
Thanks for your input @keepsimple1, I'm closing the issue as it isn't one. I think it makes more sense to publish the record with multiple IP addrs.
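A std-only sketch (hypothetical types, not the mdns-sd API) contrasting the two publishing strategies: with a resolver cache keyed by instance name, two records sharing one instance name collide and the last one wins, while a single record carrying both addresses keeps everything:

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Hypothetical resolved-service entry, keyed in a cache by instance name.
#[derive(Debug)]
struct Resolved {
    host: String,
    addresses: Vec<Ipv4Addr>,
}

fn main() {
    let mut cache: HashMap<String, Resolved> = HashMap::new();

    // Strategy A (problematic): publish the same instance name twice with
    // different hostnames -- the second insert replaces the first.
    cache.insert(
        "inst._test._tcp.local.".into(),
        Resolved { host: "host-a.local.".into(), addresses: vec![Ipv4Addr::new(192, 168, 122, 79)] },
    );
    cache.insert(
        "inst._test._tcp.local.".into(),
        Resolved { host: "host-b.local.".into(), addresses: vec![Ipv4Addr::new(192, 168, 100, 140)] },
    );
    assert_eq!(cache.len(), 1); // only one entry survives per instance name
    assert_eq!(cache["inst._test._tcp.local."].host, "host-b.local.");

    // Strategy B (suggested above): one instance, one hostname, and the
    // record carries both interface addresses -- nothing is overwritten.
    cache.insert(
        "inst._test._tcp.local.".into(),
        Resolved {
            host: "host.local.".into(),
            addresses: vec![
                Ipv4Addr::new(192, 168, 122, 79),
                Ipv4Addr::new(192, 168, 100, 140),
            ],
        },
    );
    assert_eq!(cache["inst._test._tcp.local."].addresses.len(), 2);
    println!("ok");
}
```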
Having the following setup: let's say we have some sort of redundancy setup where we use two network interfaces in different subnets. On both interfaces we also publish the same instance name, but a different hostname is used on each to avoid collisions.
When resolving those using
cargo run --example=query _test._tcp
one gets both services resolved when running on the same machine. But now comes the catch: if one restarts the browse on the same mdns-daemon instance with the same service_type _test._tcp, one only gets one of the services resolved. See #214, where I modified the query example to do that.