apple / swift-async-dns-resolver

A Swift library for asynchronous DNS requests, wrapping c-ares with Swift-friendly APIs and data structures.
Apache License 2.0
82 stars 12 forks source link

Make DNSSD queries cancellable #32

Open victorherrerod opened 3 months ago

victorherrerod commented 3 months ago

Hi! First of all, thanks for this great lib!

I'd like to be able to cancel queries made with the DNSSD Resolver, because the timeout is pretty long and it would be nice to be able to cancel the Task after a custom timeout. I've seen that the cancellation behavior is implemented for the Ares Resolver and I've tried to make it work for the DNSSD Resolver. I implemented the onTermination method of the stream continuation, however this produces a warning when trying to call DNSServiceRefDeallocate on the pointer, since it isn't a Sendable type and it could be modified by several threads at the same time.

I am not very experienced in the inner workings of Swift concurrency, how could the DNSSD queries cancellation be implemented?

Thanks and have a nice day!

yim-lee commented 3 months ago

I don't see any warnings locally with https://github.com/apple/swift-async-dns-resolver/pull/33. Which Swift version are you using? Would you be able to give changes in that PR a try to see if it works for you?

victorherrerod commented 3 months ago

I'm using Swift 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4) and I get the warning Capture of 'serviceRefPtr' with non-sendable type 'UnsafeMutablePointer<DNSServiceRef?>' (aka 'UnsafeMutablePointer<Optional<OpaquePointer>>') in a ``@Sendable`` closure.

The changes in the PR don't seem to work. onTermination is called, but only after the long DNSSD timeout, and the termination is always .finished(nil), never .cancelled. Here's a minimal example of code where my 3s timeout is not respected and the SRV query still runs for 10-15s:

import Foundation
import AsyncDNSResolver

struct TimeoutError: Error {}

func getSRVRecords(withTimeout: TimeInterval) async throws -> [SRVRecord] {
    let resolver = try AsyncDNSResolver()
    let records = try await withThrowingTaskGroup(of: [SRVRecord].self) { group in
        group.addTask {
            return try await resolver.querySRV(name: "")
        }
        group.addTask {
            try await Task.sleep(nanoseconds: UInt64(withTimeout) * NSEC_PER_SEC)
            try Task.checkCancellation()
            throw TimeoutError()
        }
        do {
            let records = try await group.next()!
            group.cancelAll()
            return records
        } catch {
            group.cancelAll()
            throw error
        }
    }
    return records
}

do {
    let records = try await getSRVRecords(withTimeout: 3)
    print(records)
} catch {
    print("Error: \(error)")
}

In my example I put "" as the service name, but I believe it happens with any SRV query that has no answer. group.cancelAll() is called when the query times out, in the catch, but onTermination is not called at that time.

yim-lee commented 3 months ago

DNSServiceProcessResult is blocking. We might have to do something like this SO suggested using select:

// Source: https://stackoverflow.com/questions/22502729/resolving-srv-records-with-ios-sdk

- (void)updateDnsRecords
{
    if (self.dnsUpdatePending == YES)
    {
        return;
    }
    else
    {
        self.dnsUpdatePending = YES;
    }

    NSLog(@"DNS update");
    DNSServiceRef       sdRef;
    DNSServiceErrorType err;

    const char* host = [self.dnsHost UTF8String];
    if (host != NULL)
    {
        NSTimeInterval remainingTime = self.dnsUpdateTimeout;
        NSDate*        startTime = [NSDate date];

        err = DNSServiceQueryRecord(&sdRef, 0, 0,
                                    host,
                                    kDNSServiceType_SRV,
                                    kDNSServiceClass_IN,
                                    processDnsReply,
                                    &remainingTime);

        // This is necessary so we don't hang forever if there are no results
        int            dns_sd_fd = DNSServiceRefSockFD(sdRef);
        int            nfds      = dns_sd_fd + 1;
        fd_set         readfds;
        int            result;

        while (remainingTime > 0)
        {
            FD_ZERO(&readfds);
            FD_SET(dns_sd_fd, &readfds);

            struct timeval tv;
            tv.tv_sec  = (time_t)remainingTime;
            tv.tv_usec = (remainingTime - tv.tv_sec) * 1000000;

            result = select(nfds, &readfds, (fd_set*)NULL, (fd_set*)NULL, &tv);
            if (result == 1)
            {
                if (FD_ISSET(dns_sd_fd, &readfds))
                {
                    err = DNSServiceProcessResult(sdRef);
                    if (err != kDNSServiceErr_NoError)
                    {
                        NSLog(@"There was an error reading the DNS SRV records.");
                        break;
                    }
                }
            }
            else if (result == 0)
            {
                NBLog(@"DNS SRV select() timed out");
                break;
            }
            else
            {
                if (errno == EINTR)
                {
                    NBLog(@"DNS SRV select() interrupted, retry.");
                }
                else
                {
                    NBLog(@"DNS SRV select() returned %d errno %d %s.", result, errno, strerror(errno));
                    break;
                }
            }

            NSTimeInterval elapsed = [[NSDate date] timeIntervalSinceDate:startTime];
            remainingTime -= elapsed;
        }

        DNSServiceRefDeallocate(sdRef);
    }
}

But instead of having a timeout/deadline we check for Task.isCancelled periodically? (thinking out loud)

The existing reply handling code should do what processDnsReply does already, so we shouldn't need to worry too much about it I think.

Hopefully the code involved (e.g., DNSServiceRefSockFD) works for all the query types this library supports and not just SRV. Looks like it is, but we will definitely need to test it.

Note that the referenced SO response is from 10 years ago. Maybe there are newer/better ways to do this.

@victorherrerod Is this something you want to try looking into?

Lukasa commented 3 months ago

If you want to wake a select on cancellation the usual recommendation would be to also add an eventFD or a pipe to the fdset and write into it on cancellation, then check for that.

victorherrerod commented 3 months ago

Instead of using select I used poll to avoid being blocked, as I've read it's more efficient. I thought about using an eventFD as @Lukasa mentioned, but I didn't know which library to import for get eventFD honestly (the Linux <sys/eventfd.h>) was not found in Xcode.

I've tested this a bit and it seems to work correctly, my code snippet above behaves as expected and the task cancels after 3 seconds.

The main doubt I have is removing DNSServiceRefDeallocate(serviceRefPtr.pointee) after calling DNSServiceProcessResult(serviceRefPtr.pointee). I did this because I have noticed the onTermination callback is always executed after calling continuation.finish(), even when there is no error, so I didn't want to call it twice when the request is successful.

However, I still have the Sendable warning I mentioned previously, I don't exactly know how to fix that.

PD: I amended the commit to have the author linked to my account