kazu-yamamoto / dns

DNS library in Haskell
BSD 3-Clause "New" or "Revised" License
64 stars 36 forks

Releasing v4.0.0 #122

Closed kazu-yamamoto closed 5 years ago

kazu-yamamoto commented 5 years ago

I would like to release v4.0.0 in the next week.

@vdukhovni Any show-stoppers?

vdukhovni commented 5 years ago

No show-stoppers on my end. I do think that the FlagOp feature worked out nicely in the end, despite the circuitous route it took to get there. Thanks for all the assists! This is IMHO now cleaner than the ad hoc lookupRawAD, and the flags are also configurable at the resolver level, so that if one chooses an authoritative server (for bulk lookups against a zone), the RD flag can be cleared just once.

I'll have to remember to add associativity tests for any other semigroups I might define in the future.
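An associativity test of the kind mentioned above can be exhaustive when the type is a small enumeration. The following is a minimal sketch, assuming a four-constructor flag operation type with a left-biased semigroup; the constructor names and the instance are illustrative assumptions, not copied from the library.

```haskell
-- Hypothetical stand-in for a flag operation type. The constructors and
-- the instance below are assumptions made for this sketch.
data FlagOp = FlagSet | FlagClear | FlagReset | FlagKeep
    deriving (Eq, Show, Enum, Bounded)

-- Left-biased: the first operation that is not 'FlagKeep' wins.
instance Semigroup FlagOp where
    FlagKeep <> op = op
    op       <> _  = op

-- 'FlagKeep' is the identity element.
instance Monoid FlagOp where
    mempty = FlagKeep

-- Check (a <> b) <> c == a <> (b <> c) over every triple of constructors.
associative :: Bool
associative = and
    [ (a <> b) <> c == a <> (b <> c)
    | a <- ops, b <- ops, c <- ops ]
  where
    ops = [minBound .. maxBound] :: [FlagOp]
```

With only four constructors there are 64 triples, so the check is cheap enough to run in every test suite rather than relying on random sampling.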

Full steam ahead.

vdukhovni commented 5 years ago

Should we wait a bit more time "for the dust to settle", to see whether there's anything else we want for 4.0.0, or is it time to consider releasing?

No further architectural changes up my sleeve at this time. We could add a cache to track nameservers that are down (if many are defined) or don't do EDNS (to avoid repeatedly making EDNS queries and falling back), but this does not change the API, just adds some internal caching.

So I'm back to release when you feel ready...

vdukhovni commented 5 years ago

Actually, I do have another question. Do we still support (in the upcoming DNS 4.0.0) GHC versions that don't have fully-featured PatternSynonyms? If not, we could remove some code duplication for the various types holding names of defined constants.
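For readers unfamiliar with the technique, here is a minimal sketch of named constants built with PatternSynonyms over a newtype. The type layout and the Show instance are assumptions for illustration; the numeric values are the IANA-assigned EDNS option codes.

```haskell
{-# LANGUAGE PatternSynonyms #-}

import Data.Word (Word16)

-- Hypothetical sketch: an open-ended code type. Unknown codes stay
-- representable because the newtype wraps the raw number.
newtype OptCode = OptCode { fromOptCode :: Word16 }
    deriving (Eq, Ord)

-- Named constants as bidirectional pattern synonyms, usable both in
-- expressions and in pattern matches.
pattern NSID :: OptCode
pattern NSID = OptCode 3

pattern DAU :: OptCode
pattern DAU = OptCode 5

pattern DHU :: OptCode
pattern DHU = OptCode 6

pattern N3U :: OptCode
pattern N3U = OptCode 7

pattern ClientSubnet :: OptCode
pattern ClientSubnet = OptCode 8

-- Known codes print by name, everything else by number.
instance Show OptCode where
    show NSID         = "NSID"
    show DAU          = "DAU"
    show DHU          = "DHU"
    show N3U          = "N3U"
    show ClientSubnet = "ClientSubnet"
    show (OptCode n)  = "OptCode " ++ show n
```

The pattern type signatures used here need a reasonably recent GHC, which is the compatibility question raised above.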

Another question is whether there's a clean way to associate numbers with the constructors of sum types, other than a large block of pattern matches. Nothing especially better comes to mind: We're likely stuck with code like the below:

odataToOptCode :: OData -> OptCode
odataToOptCode OD_NSID {}            = NSID
odataToOptCode OD_DAU {}             = DAU
odataToOptCode OD_DHU {}             = DHU
odataToOptCode OD_N3U {}             = N3U
odataToOptCode OD_ClientSubnet {}    = ClientSubnet
odataToOptCode OD_ECSgeneric {}      = ClientSubnet
odataToOptCode (UnknownOData code _) = toOptCode code

kazu-yamamoto commented 5 years ago

> Actually, I do have another question. Do we still support (in the upcoming DNS 4.0.0) GHC versions that don't have fully-featured PatternSynonyms? If not, we could remove some code duplication for the various types holding names of defined constants.

I think that we should register an issue, keep it for one more year and then remove it.

kazu-yamamoto commented 5 years ago

> Another question is whether there's a clean way to associate numbers with the constructors of sum types, other than a large block of pattern matches. Nothing especially better comes to mind: We're likely stuck with code like the below:

I have no idea.

kazu-yamamoto commented 5 years ago

> Should we wait a bit more time "for the dust to settle", to see whether there's anything else we want for 4.0.0, or is it time to consider releasing?

Let's wait for one more week.

vdukhovni commented 5 years ago

While we were waiting, I thought I'd try to see how close our API was to being feature complete, by implementing support for the RRSIG RData type (not validation, just encoding, decoding, and a Show instance).

This leads to an obstacle. The RRSIG data-type contains DNS 'circle-time' timestamps, and can only be converted to a string (Show instance) if we know the time at which the RRSIG was obtained. But show is a pure function, and so is decode, only resolve runs in the IO Monad.

Now for RRSIG, the current time is only used to establish the intervals of ~136 years (2^32 seconds) in which each of the 32-bit inception and expiration timestamps are to be found. The current time can change by decades without affecting the result unless the timestamps are decades in the past or future. But we still need a rough idea of the current time, otherwise the code misbehaves eventually.

We could make the current pure decode choose a base time of 2073-03-15T00:00:00+0000 and cover all times from slightly before the publication of RFC4034 until 2141-04-03. Perhaps DNSSEC will be replaced by then?
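The window arithmetic can be sketched as follows: interpret the difference between the 32-bit wire timestamp and the reference time as a signed 32-bit number, then add it back to the reference. The names `extendTime` and `baseTime` are hypothetical and introduced only for this illustration.

```haskell
import Data.Int (Int32, Int64)
import Data.Word (Word32)

-- | Hypothetical helper: extend a 32-bit wire timestamp to 64-bit epoch
-- seconds by picking the value that lies within 2^31 seconds of the
-- reference time.
extendTime :: Int64   -- ^ reference time, epoch seconds
           -> Word32  -- ^ timestamp as found on the wire
           -> Int64
extendTime ref t =
    let -- Difference modulo 2^32, reinterpreted as a signed offset.
        delta = fromIntegral (t - fromIntegral ref) :: Int32
     in ref + fromIntegral delta

-- | 2073-03-15T00:00:00Z as epoch seconds, the fixed base time
-- suggested above for a pure decoder.
baseTime :: Int64
baseTime = 3256761600
```

With `baseTime` as the reference, a wire value of 1000 resolves to 2^32 + 1000 (a time in 2106) rather than to 1970, because the later time is the one that falls inside the window. When decoding live traffic, the reference would be the current time instead.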

Otherwise, we need to figure out how to get some relevant initial state into decode, so that at least when decode is decoding messages from the network it knows the current time.

A decodeAtTime variant that injects another bit of state into SGet? And have resolve use that? Or post-process the output of decode in resolve to update the RRSIG resource records in the message (a bit ugly), or just assume that the code will no longer be relevant some time before April 2141...

Any thoughts?

kazu-yamamoto commented 5 years ago

> A decodeAtTime variant that injects another bit of state into SGet?

I think that this approach should be the first choice.

kazu-yamamoto commented 5 years ago

BTW, it's time to switch the time library from time to hourglass, if possible. hourglass is much more efficient than time.

vdukhovni commented 5 years ago

Looking forward to using Hourglass. It does look saner. In the meantime, I got RRSIG encoding, decoding and Show basically working, by adding a Word64 time to PState and using a new decodeAt interface in receive and receiveVC. This could still use some polish, but it works well enough to output:

dukhovni.org. IN RRSIG MX 13 2 3600 20181028225542 20181014221858 11497 dukhovni.org. /JHj5xRl18wy7QiJ+Sysaba0Kbrf1HfTYvBHYaVkwkuodRuYi8nD5iWFRFjB+gYrpjz91pzASTE+jUzbeJ61/Q==

That is what I get when I set the DO bit on a query for my own MX record.

Because "time" looked so cumbersome, I bypassed it entirely for converting epoch seconds to a formatted YYYYmmddHHMMSS string. Instead I used:

import Data.Int (Int64)
import Text.Printf (printf)

-- | Convert epoch time to a YYYYmmddHHMMSS string:
-- <http://howardhinnant.github.io/date_algorithms.html>
--
-- This avoids all the pain of converting epoch time to NominalDiffTime ->
-- UTCTime -> LocalTime then using formatTime with defaultTimeLocale!
--
showTime :: Int64 -> String
showTime t =
    let (z0, sec) = t `divMod` 86400
        z = z0 + 719468
        (era, doe) = z `divMod` 146097
        yoe = (doe - doe`quot`1460 + doe`quot`36524 - doe`quot`146096)`quot`365
        y = yoe + era * 400
        doy = doe - (365*yoe + yoe`quot`4 - yoe`quot`100)
        mp = (5*doy + 2)`quot`153
        d = doy - (153*mp+2)`quot`5 + 1
        m = 1 + (mp + 2)`mod`12
        y' = y + (12-m)`div`10
        (hh, (mm, ss)) = flip divMod 60 <$> sec `divMod` 3600
     in printf "%04d%02d%02d%02d%02d%02d" y' m d hh mm ss

With hourglass, I can let the library do the work.

vdukhovni commented 5 years ago

So back to the question of making a release. Perhaps we're closer now, but the last major issue on my list is encoding and decoding of exotic ASCII characters in labels (dots, NULs, escape sequences, ...), and perhaps (separately) IDNA canonicalization. IIRC IDNA U-labels are not allowed to have non-LDH ASCII characters, but if these can mix with UTF-8, we'll handle that too.

On the wire, domains are always A-labels and the special non-printable ASCII characters are verbatim binary content in each length-encoded label. So both IDNA and \ddd decimal escapes are features of decoded "presentation-form" domains. The idea is then to generate appropriate presentation forms for the more exotic labels, and to be able to encode them back to their original form.

The IDNA issue is about canonicalization (U-label -> A-label mapping), while the non-printables, dots, spaces, ... are about encoding and decoding of stored A-labels.

So I think that IDNA2008 goes into the Util module as another normalization feature, while \ddd escapes go into the decode and encode modules to support non-LDH labels.
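A minimal sketch of the \ddd escaping idea for a single label, with a matching decoder so the presentation form maps back to the original bytes. The function names are hypothetical, and the choice of which bytes to escape (dot, backslash, space, and anything outside printable ASCII) is an assumption for this example, not the library's policy.

```haskell
import Data.Char (chr, isDigit, ord)
import Data.Word (Word8)
import Text.Printf (printf)

-- | Hypothetical: render the raw bytes of one label in presentation form.
escapeLabel :: [Word8] -> String
escapeLabel = concatMap esc
  where
    esc w
        | w == 0x2e || w == 0x5c = ['\\', chr (fromIntegral w)] -- '.' or '\'
        | w > 0x20 && w < 0x7f   = [chr (fromIntegral w)]       -- printable
        | otherwise              = printf "\\%03d" w            -- \ddd

-- | Hypothetical inverse: parse one presentation-form label back to
-- bytes. Returns Nothing on a dangling backslash, a \ddd value above
-- 255, or a non-ASCII character.
unescapeLabel :: String -> Maybe [Word8]
unescapeLabel [] = Just []
unescapeLabel ('\\' : a : b : c : rest)
    | all isDigit [a, b, c]
    = let n = read [a, b, c] :: Int
       in if n <= 255
            then (fromIntegral n :) <$> unescapeLabel rest
            else Nothing
unescapeLabel ('\\' : x : rest) = (:) <$> byte x <*> unescapeLabel rest
unescapeLabel "\\"              = Nothing
unescapeLabel (x : rest)        = (:) <$> byte x <*> unescapeLabel rest

-- | Accept only ASCII characters as literal label bytes.
byte :: Char -> Maybe Word8
byte c
    | ord c < 128 = Just (fromIntegral (ord c))
    | otherwise   = Nothing
```

Working per label rather than per domain keeps the dot unambiguous: an unescaped dot only ever appears as a separator between labels, which also covers the SOA mailbox case where the first label may contain literal dots.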

The special case to keep in mind is the SOA mailbox, where dots are not special in the first label.

This could be a separate post-4.0.0 issue. Or it could be done now, while we're still making major changes. Do you have any cycles to look at this? I should probably get back to some neglected TLS bitrot in Postfix, to add SNI support and handle changes in OpenSSL for TLS 1.3, ...

vdukhovni commented 5 years ago

One more thing. I now see that "text-format" is not in Stackage LTS after 11.x; the 12.x series doesn't have it yet, though 0.3.2 on Hackage is compatible. For my stack project I had to add an "extra-deps" entry for "text-format-0.3.2". If there's a better type-safe way to format 6 numbers into the equivalent of strftime "%Y%m%d%H%M%S", then that'd be fine. Or perhaps this is a reason to drop my magic showTime and go with Hourglass.

vdukhovni commented 5 years ago

I went ahead and switched to Hourglass. One less thing to worry about.

kazu-yamamoto commented 5 years ago

No rush. I can wait. And it's up to you when we release the next major version.

kazu-yamamoto commented 5 years ago

Did you check unix-time for formatting?

kazu-yamamoto commented 5 years ago

In my opinion, IDNA canonicalization is string conversion and should be implemented in another package which does not require the network package.

It is ok that dns depends on IDNA package.

vdukhovni commented 5 years ago

Yes, in fact an IDNA package already exists for Haskell, but it has been abandoned by its author, and I've been using a private fork. If we take this route, I could ask to adopt the package. It can still be separate, but what I'm thinking for Network.DNS is to make use of such a package to do U-label to A-label canonicalization.

kazu-yamamoto commented 5 years ago

If you want to control IDNA-related packages fully, I can contribute code for punycode. I implemented it many years ago: https://github.com/kazu-yamamoto/Mew/blob/master/mew-bq.el#L984

vdukhovni commented 5 years ago

The punycode part is fine, the trickier bit is the Unicode property tables to support validation and normalization. So presently I depend ultimately on libicu via:

- punycode-2.0
- stringprep-1.0.0

And stringprep uses http://hackage.haskell.org/package/text-icu

dpwiz commented 5 years ago

Is there anything big blocking the 4.0?

vdukhovni commented 5 years ago

@wiz Is there anything big blocking the 4.0?

Not really. Just a wishlist of things I'd like to do, but haven't found time for.

It could also be useful in error cases to retain the wire form of messages, but the lookup functions presently can only return the fully decoded form or else an error. The DNSMessage has no slot for the raw data.

The reason I mention all of this, is that when changing the interface, one should try to anticipate future changes that might change the interface again, and do as many as are needed at once, so that the interface does not keep changing frequently...

kazu-yamamoto commented 5 years ago

v4.0.0 has been released!