keepsimple1 / mdns-sd

Rust library for mDNS based Service Discovery
Apache License 2.0
96 stars 37 forks source link

Utility function to resolve mDNS hostnames #191

Closed oysteintveit-nordicsemi closed 5 months ago

oysteintveit-nordicsemi commented 6 months ago

Hello, this is a feature request, as well as a question about the project scope.

We're using this package for making a remoting cli tool, and we'd like to persist a mDNS hostname, and re-resolve it on invocation. Since this project has its own stack for all mDNS operations, the code to do this seems to already be there (Command::Resolve)? It would be nice if some of this functionality could be exposed to the user, either as utility functions on the ServiceDaemon or by making the Zeroconf impl into a public lower level interface.

I had some worries while making this issue about the scope of the project: the readme seems to indicate that the focus is only set on service discovery. Publicizing some of the "mDNS-only" functionality may be a slippery slope towards being an mDNS library with the service discovery parts on top. But maybe that's not the case. Or maybe that's even okay with you. What do you think?

I'd be happy to create a PR with ServiceDaemon::resolve_service_host(&self, service_instance_fullname: String, try_count: u16) if I get the green light.

keepsimple1 commented 6 months ago

This feature request is in the project's scope. And in fact it reminds me of one of my previous comments for another pending issue. I was trying to add a very similar API.

There are a couple of things I think we need to figure out:
1) the name of the new method (verify or resolve_service_host or something else ?) 2) How would try_count work? (and is it needed?)

So yes, please go ahead and create the PR. Thanks for working on this!

oysteintveit-nordicsemi commented 6 months ago

Thank you for the quick response. I had a shot at this today, and I think I've gotten a better understanding of how the library works, and what needs to be done now, clearing up some initial misconceptions.

Just to get it out of the way, I don't think try_count is needed - I picked it up from Command::resolve, but I think it can continue to be an internal detail, since ServiceDaemon handles retransmissions autonomically.

Optimally, I'd like to have a function similar to ServiceDaemon::browse, that returns a Receiver<HostnameResolutionEvent>, which continuously provides data from A and AAAA record responses. I initially thought I could just put this as a new enum variant of ServiceEvent, so it would be possible to reuse Zeroconf::add_querier, Zeroconf::send_query and Zeroconf::query_cache with some modifications to allow for both functionalities. Maybe index the hostname queriers by their hostname in Zeroconf.queriers and service queriers by their ty domains. But the name ServiceEvent is not an accurate description for this functionality. And because it is a public struct, it seems like this would be a breaking change, requiring a bunch of modifications downstream.

Do you approve of solving it by adding a new struct HostnameResolutionEvent, and have duplicates of the relevant functions, so that everything is handled separately? E.g. Zeroconf::add_querier now becomes Zeroconf::add_service_querier and Zeroconf::add_hostname_querier. Or do you have a better solution in mind?

keepsimple1 commented 6 months ago

Optimally, I'd like to have a function similar to ServiceDaemon::browse, that returns a Receiver<HostnameResolutionEvent>, which continuously provides data from A and AAAA record responses.

Is this your primary use case, i.e. monitoring the A and AAAA record updates for a particular hostname in mDNS?

First, just a background, currently this lib already has some limited functionality of updating address records: when a new A or AAAA record is received, the service instance will be updated and a new ServiceResolved event is sent. (To remove an old address record, setting TTL to 0). This is the case of server (i.e. responder) actively sending out new records for addresses.

Second, now my understanding is, this new feature is to let a client (querier) to actively check if any A or AAAA record updated. Suppose we use verify as the method name, could we have an API like this?

impl ServiceDaemon {
  pub verify(&self, record_name: &str, record_type: RecordType) -> Result<()>;
}

Note it's one-shot call without returning a channel receiver. Not continuous monitoring by the daemon. I think this a main difference from your suggestion.

And RecordType is a new enum:

pub enum RecordType {
  TYPE_A,
  TYPE_AAAA,
  <etc>,
}

The base rationale is this RFC section: https://datatracker.ietf.org/doc/html/rfc6762#section-10.4 .

 When the cache receives this hint that it should reconfirm some
   record, it MUST issue two or more queries for the resource record in
   dispute.  If no response is received within ten seconds, then, even
   though its TTL may indicate that it is not yet due to expire, that
   record SHOULD be promptly flushed from the cache.

in your use case, you can call: daemon.verify(hostname, TYPE_A | TYPE_AAAA)?;

The daemon will send out 2 queries for hostname TYPE_A and TYPE_AAAA records, with a timeout 10 seconds. If successful, new TYPE_A and TYPE_AAAA records are received and a new ServiceResolve event is triggered. If not successful, all addr records of hostname will be removed from the cache and ServiceRemoved event is triggered.

Honestly I don't know if it's a good idea to use existing ServiceResolve event or adding a new event. Like you said, if add a new event, then it will be a breaking change for ServiceEvent unless we define another enum. That said, if we found that the best place for the new event is still in ServiceEvent, I am okay to take a hit for the breaking change (and add #[non_exhaustive] to it from now on).

And all the above is brain-storming, let me know what you think?

EDIT: I realize that if we define another enum for the new event (i.e. not to reuse ServiceResolved), then verify() method should return Result<Receiver<NewEventEnum>>. If we go that route, I think maybe we could choose a name broader than HostnameResolutionEvent, for example ResolutionEvent or VerifyEvent, where HostnameResolution is one variant. And I haven't got chance to check changes to Zeroconf.queriers part, will do.

EDIT2: I guess your use case is similar to what avahi-resolve-host-name command does in Avahi library? If so, I think avahi-resolve-host-name is also an one-shot command with a timeout, but not doing continuous monitoring for the caller.

TL;DR: my main question is: can this new API be a one-shot call (with a timeout of resolving), but no added continuous monitoring on the daemon side? (regardless returns a new Event receiver or not)

keepsimple1 commented 6 months ago

Do you approve of solving it by adding a new struct HostnameResolutionEvent, and have duplicates of the relevant functions, so that everything is handled separately? E.g. Zeroconf::add_querier now becomes Zeroconf::add_service_querier and Zeroconf::add_hostname_querier. Or do you have a better solution in mind?

Regarding this part, as touched upon in comment above, I would suggest to have a more generic enum, say ResolutionEvent (also ResolveEvent is ok for me). This enum has HostnameResolved (or AddressResolved ? ) as a variant. And Timeout can be another variant in this enum.

Then the new Zeroconf::add_hostname_querier will also be more generic, say Zeroconf::add_resolve_querier. (I am okay to have such new methods internally). And to be consistent, the new API could just be named resolve (instead of verify I proposed earlier).

Also note, when a new address is resolved for a hostname, not only this new ResolutionEvent will be sent to the caller, a ServiceEvent::ServiceResolved will also be sent to existing receivers of browse() call (if they are not dropped yet).

oysteintveit-nordicsemi commented 5 months ago

Is this your primary use case, i.e. monitoring the A and AAAA record updates for a particular hostname in mDNS? [...] Second, now my understanding is, this new feature is to let a client (querier) to actively check if any A or AAAA record updated. Suppose we use verify as the method name, could we have an API like this?

impl ServiceDaemon {
  pub verify(&self, record_name: &str, record_type: RecordType) -> Result<()>;
}

[...] my main question is: can this new API be a one-shot call (with a timeout of resolving), but no added continuous monitoring on the daemon side? (regardless returns a new Event receiver or not)

Sorry for the confusion. For my own use case, I only need to be able to do a one-shot lookup of an address like mymachine.local. The proposed verify API would work just fine.

However, the solution I had in mind was to create a fully compliant Multicast DNS querier (RFC 6762, sec. 5), in case anyone needs that functionality as well. I feel like it would be a shame not to do so, considering there's a whole architecture here built around continuous queries with the worker thread and mpsc channels and everything. I'll just shape my usage of it to be oneshot, using something like this:

let mdns = ServiceDaemon::new()?
let receiver = mdns.resolve("mymachine.local")?;
tokio::select! {
  _ = tokio::time::sleep(Duration::from_millis(100)) => mdns.shutdown()?,
  _ = async move {
    loop {
      match receiver.recv_async().await {
        Ok(ResolutionEvent::HostnameResolved(hostname)) => {
          println!("{:#?}", hostname);
        }
        _ => { }
      }
    }
  } => { },
};

Honestly I don't know if it's a good idea to use existing ServiceResolve event or adding a new event. Like you said, if add a new event, then it will be a breaking change for ServiceEvent unless we define another enum. That said, if we found that the best place for the new event is still in ServiceEvent, I am okay to take a hit for the breaking change (and add #[non_exhaustive] to it from now on).

I've been thinking a bit more about this. The original idea was to create a more general Event type that encompasses both browsing services and resolving hostnames. Some enum variants will only be replied by browse, some only by resolve, some by both. Then it would be possible to unify parts like query_cache (uses Sender<ServiceEvent>), queriers (currently of type HashMap<String, Sender<ServiceEvent>>), and not need to have copies of all functions and data structures for the two types of queriers.

But I've changed my mind - adding a new event ResolutionEvent is probably better, both for the user and for the library. The user won't have to add boilerplate to handle events that will never occur, and we won't have to introduce utility functions to find out which queriers are querying what kind of information.

Adding #[non_exhaustive] to both kinds of events is probably still a good idea though.

Also note, when a new address is resolved for a hostname, not only this new ResolutionEvent will be sent to the caller, a ServiceEvent::ServiceResolved will also be sent to existing receivers of browse() call (if they are not dropped yet).

Noted :+1:. I'll see if we can do both this and reuse of dns cache get/set, so service queriers and hostname queriers can benefit from each others queries.


Sidenote: Should we let the ServiceDaemon bet able to register a hostname entry (i.e. have the response handler answer non-service related A/AAAA requests), just like it's possible to register a service? It's related in some sense, but it is a different feature.

keepsimple1 commented 5 months ago

However, the solution I had in mind was to create a fully compliant Multicast DNS querier (RFC 6762, sec. 5), in case anyone needs that functionality as well. I feel like it would be a shame not to do so, considering there's a whole architecture here built around continuous queries with the worker thread and mpsc channels and everything.

I agree. So far we really only support "browsing" a service type, not "resolving" a name (e.g. hostname or other types). All resolving are done implicitly & internally.

I'll just shape my usage of it to be oneshot, using something like this:

let mdns = ServiceDaemon::new()?
let receiver = mdns.resolve("mymachine.local")?;

I like this approach. Just a couple of things:

1) shall we have a timeout parameter? where timeout=0 means no timeout. Sometime a built-in timeout mechanism can help users (for example, no need to use tokio::select! in your example, and even if the user doesn't have timers).

2) shall we have a query_type (or record_type) parameter? It will make it clear what the string param refers to.

But I've changed my mind - adding a new event ResolutionEvent is probably better, both for the user and for the library. The user won't have to add boilerplate to handle events that will never occur, and we won't have to introduce utility functions to find out which queriers are querying what kind of information.

👍

Sidenote: Should we let the ServiceDaemon bet able to register a hostname entry (i.e. have the response handler answer non-service related A/AAAA requests), just like it's possible to register a service?

Although currently we don't have a public API to register hostname only, but any registered service will respond to hostname (or other types) query: https://github.com/keepsimple1/mdns-sd/blob/c141cb370f373e21de0bb672753996beba85de4b/src/service_daemon.rs#L1768

At the moment, I felt it meets the need of a responder for resolving hostname. If it turns out we need more, we can look into it again.

oysteintveit-nordicsemi commented 5 months ago

shall we have a timeout parameter? where timeout=0 means no timeout. Sometime a built-in timeout mechanism can help users (for example, no need to use tokio::select! in your example, and even if the user doesn't have timers).

I was a bit unsure of how this fits into continuous queries, so I have ignored it for now. Feel free to point it out in the PR if you'd still like to add it.

shall we have a query_type (or record_type) parameter? It will make it clear what the string param refers to.

I think it might be better if we just query for both A and AAAA records, and rather make the usage clear through function and argument names, as well as comments and examples. I don't see a usecase for asking only for one (and the mDNS protocol even encourages answering with both types of addresses even if you only query for one (RFC6762, section 6.2, par. 2))