mirage / ocaml-dns

OCaml implementation of the DNS protocol
BSD 2-Clause "Simplified" License
105 stars 43 forks source link

Add the LOC record #310

Closed RyanGibb closed 2 years ago

RyanGibb commented 2 years ago

See rfc1876

hannesm commented 2 years ago

Thanks for your PR, would you mind to add some tests into tests/test.mlcovering both successful packet decoding and unsuccessful packet decoding? Also for encoding the packet... Take a brief look at that test module, it should be pretty straightforward to add that.

About the zone file parsing, there aren't many tests yet -- maybe add some to server.ml.

RyanGibb commented 2 years ago

Yes, sure!

RyanGibb commented 2 years ago

Hi @hannesm, that's tests covering successful packet decoding, unsuccessful packet decoding, and packet encoding added.

RyanGibb commented 2 years ago

And some parsing tests.

RyanGibb commented 2 years ago

Currently the altitude maximum tests are failing. This is due to Int32.of_float converting anything > 2**31 to -2147483648 (i.e. 2^31 as a signed number). CStruct.BE.set_uint32 uses int32 so this is related to https://github.com/mirage/ocaml-cstruct/issues/86.

There's also some funny business happening on debian-11-4.14 arm32, arm64, ppc64, s390x, and x86_32.

RyanGibb commented 2 years ago

I've solved the encoded altitude > 2**31 with a horrible hack that uses an int64 instead and writes to the buffer at an offset of 4 bytes with

Cstruct.BE.set_uint64 buf (off + 8) loc.alt;

As opposed to

Cstruct.BE.set_uint32 buf (off + 12) loc.alt;

Which relies on the later statement

Cstruct.BE.set_uint32 buf (off + 8) loc.long;

To correctly set the 4 bytes at offset 8 (which will always be 0 from loc.alt as it's at most 32 bits).

This requires some clearing and resetting these bits when reading the alt value:

    (* Horrible hack to deal with encoded altitude > 2**31,
       as Int32.to_float converts anything > 2**31 to -2**31 *)
    let alt = Cstruct.BE.set_uint32 buf (off + 8) 0l;
      Cstruct.BE.get_uint64 buf (off + 8) ;
    in
    Cstruct.BE.set_uint32 buf (off + 8) long;

I'm very open to hearing any better ways of doing this.

RyanGibb commented 2 years ago

Finally, fixed arm32 and x86_32, by removing intermediary cast to int between int64 and float in alt_print.

hannesm commented 2 years ago

I've solved the encoded altitude > 2**31 with a horrible hack that uses an int64 instead and writes to the buffer at an offset of 4 bytes with

Cstruct.BE.set_uint64 buf (off + 8) loc.alt;

I'm not sure I understand why this is necessary. Doesn't an int32 work in terms of encoding and decoing -- but the interpretation should be adjusted (e.g. in the altitude)?

Why should all this deal with float -- AFAICT the RFC uses various integers (i.e. only represents a few coordinates with a given precision). From my perspective, an API that takes only integer values (int32 etc.) would be sufficient -- certainly some "geo-coordinate library" could deal with float conversions from geocoordinates to DNS RRs.

Another comment is the "Loc set" -- the RFC is not very specific here, but does it make sense for any domain name to have multiple LOC records? If the answer is no, let's (similar to SOA and CNAME) use only a single Loc.t (not a set). This makes the handling and API also much simpler.

hannesm commented 2 years ago

I just re-read Appendix A of RFC 1876 - and this is all about string to binary conversion, but avoids any floating point values. So I suggest to have a "decode/parse" function taking a string and returning the int32 value (and the precision, or what is needed). Does this make sense?

The whole DNS library is so far free of floating point values, and I appreciate to keep it that way.

RyanGibb commented 2 years ago

I've solved the encoded altitude > 2**31 with a horrible hack that uses an int64 instead and writes to the buffer at an offset of 4 bytes with Cstruct.BE.set_uint64 buf (off + 8) loc.alt;

I'm not sure I understand why this is necessary. Doesn't an int32 work in terms of encoding and decoing -- but the interpretation should be adjusted (e.g. in the altitude)?

Indeed, you are correct. The parsing and printing is where the issue arises. I've added a change that uses ocaml-integers to parse and print the variable. This is also a bit hacky, as there is no Unsigned.UInt32.to_float. Instead, I've added a chain of casting from float to Int64 to Unsigned.UInt32 to int32 for parsing, and vice versa for printing. This is in order to avoid the problematic Int32.of_float and Int32.to_float. Do let me know if there's a better way to do this.

Why should all this deal with float -- AFAICT the RFC uses various integers (i.e. only represents a few coordinates with a given precision). From my perspective, an API that takes only integer values (int32 etc.) would be sufficient -- certainly some "geo-coordinate library" could deal with float conversions from geocoordinates to DNS RRs. ... I just re-read Appendix A of RFC 1876 - and this is all about string to binary conversion, but avoids any floating point values. So I suggest to have a "decode/parse" function taking a string and returning the int32 value (and the precision, or what is needed). Does this make sense?

The whole DNS library is so far free of floating point values, and I appreciate to keep it that way.

The floats arise from the parsing and printing. It would certainly be possible to adapt these to read fixed precision integers, but I'm not sure what the benefit of this additional complexity would be. Can I ask what your reason for wanting to avoid floats is?

I'm not sure I completely understand the idea behind the geocoordinate library; the representation here is quite closely coupled to the LOC record.

Another comment is the "Loc set" -- the RFC is not very specific here, but does it make sense for any domain name to have multiple LOC records? If the answer is no, let's (similar to SOA and CNAME) use only a single Loc.t (not a set). This makes the handling and API also much simpler.

There is not a clear use case for this record. The RFC discusses a number of possibilities, but it's quite open ended. For this reason I think providing the greatest flexibility would be best. And I can certainly imagine a scenario where someone would want to associate multiple LOC records with one domain. Also, other DNS server implementations (notably bind) support multiple LOC records for one domain:

$ dig loc gibbr.org +short
51 53 23.990 N 1 28 34.190 E 0.00m 10m 1m 1m
52 12 40.000 N 0 5 31.000 W 22.00m 10m 10m 10m
hannesm commented 2 years ago

Indeed, you are correct. The parsing and printing is where the issue arises. I've added a change that uses ocaml-integers to parse and print the variable.

Great, but sorry to tell you: I do not like to add dependencies to dns . Could you retrieve the required bits from the integers library and embed them into the Loc module?

The floats arise from the parsing and printing. It would certainly be possible to adapt these to read fixed precision integers, but I'm not sure what the benefit of this additional complexity would be. Can I ask what your reason for wanting to avoid floats is?

Floats are at some points problematic: being imprecise, not being used on the wire, various issues if you're using them in kernel mode (as running as MirageOS unikernel eludes to). But I see that your parse function is the only one using floats, so for LOC records transmitted on the wire or in a zone file that code wouldn't be exercised - please correct me if I'm wrong.

I'm not sure I completely understand the idea behind the geocoordinate library; the representation here is quite closely coupled to the LOC record.

I was imaging that a potential use would be that there's a GPS receiver somewhere used to fill in the geocoordinates -- or someone looking up on a map the geocoordinate and wanting to insert them as floating point number.

And I can certainly imagine a scenario where someone would want to associate multiple LOC records with one domain. Also, other DNS server implementations (notably bind) support multiple LOC records for one domain:

Ok, fine with me. I was only wondering what the interpretation may be.

RyanGibb commented 2 years ago

Great, but sorry to tell you: I do not like to add dependencies to dns . Could you retrieve the required bits from the integers library and embed them into the Loc module?

No problem, can do! This required pulling a fair bit of C in, so instead I just manually converted the unsigned int32 to an int64. The Int64.to_int32 statement is fine, but the Int64.of_int32 is where the trouble arises.

Thanks for the help with this, and apologies for the back and forth. This is my first time writing OCaml so I'm still getting to grips with the conventions.

Floats are at some points problematic: being imprecise, not being used on the wire, various issues if you're using them in kernel mode (as running as MirageOS unikernel eludes to). But I see that your parse function is the only one using floats, so for LOC records transmitted on the wire or in a zone file that code wouldn't be exercised - please correct me if I'm wrong.

The floats are used for parsing and printing. For LOC records transmitted on the wire the code wouldn't be exercised. It would be exercised when reading the record from the file, or printing the record. Is this ok?

The printing would be relatively easy to change, the parsing a bit trickier.

And I can certainly imagine a scenario where someone would want to associate multiple LOC records with one domain. Also, other DNS server implementations (notably bind) support multiple LOC records for one domain:

Ok, fine with me. I was only wondering what the interpretation may be.

Cool :-) I guess one interpretation could be a multihomed network. At a higher level, maybe a home and work address.

RyanGibb commented 2 years ago

Okay, that's all floats removed.

RyanGibb commented 2 years ago

By the way, is there a code formatter for the project?

avsm commented 2 years ago

@hannesm:

Great, but sorry to tell you: I do not like to add dependencies to dns

Usually a good rule of thumb, but ocaml-integers is a pretty foundational dependency (e.g. ctypes also uses it). It's quite fiddly getting that handling correct, and I'm not in favour of duplicating that code. There must be plenty of other places outside of Loc where we could potentially use signed/unsigned parsing to OCaml integers.

(I've wanted a ctypes<->cstruct bridge for years now, since ctypes interprets different integer/floats very well, and cstruct parses them from bytes very efficiently)

hannesm commented 2 years ago

@avsm

Usually a good rule of thumb, but ocaml-integers is a pretty foundational dependency (e.g. ctypes also uses it). It's quite fiddly getting that handling correct, and I'm not in favour of duplicating that code. There must be plenty of other places outside of Loc where we could potentially use signed/unsigned parsing to OCaml integers.

And does ocaml-integers work in a MirageOS 3 and MirageOS 4 environment? I figured it uses C code, and am not able to find any (MirageOS 3 required) cross-compilation runes.

avsm commented 2 years ago

It does use a small C binding. But at some point we have to stop supporting Mirage3 and just move our library base forward (the older releases of DNS will continue to work with the older Mirage3 of course). Perhaps this is a general point of discussion at the next Mirage community meeting to see how long we need to support Mirage3 moving forwards.

There is also the question of jsoo as well — it looks like integers is supported on that via https://ocaml.org/p/integers_stubs_js/1.0

On 16 May 2022, at 11:37, Hannes Mehnert @.***> wrote:

 @avsm

Usually a good rule of thumb, but ocaml-integers is a pretty foundational dependency (e.g. ctypes also uses it). It's quite fiddly getting that handling correct, and I'm not in favour of duplicating that code. There must be plenty of other places outside of Loc where we could potentially use signed/unsigned parsing to OCaml integers.

And does ocaml-integers work in a MirageOS 3 and MirageOS 4 environment? I figured it uses C code, and am not able to find any (MirageOS 3 required) cross-compilation runes.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

RyanGibb commented 2 years ago

and I'm not in favour of duplicating that code

In this specific case, the duplication is relatively light:

    (* convert a uint32 alt to an int64 *)
    let alt = if alt < 0l then
        Int64.of_int32 alt + Int64.shift_left 1L 32
      else Int64.of_int32 alt
    in

Is there anything else I can improve to get an approval @hannesm ?

hannesm commented 2 years ago

Dear @RyanGibb, I guess all you need is a bit patience. I spent some hours in the review of this PR, and since haven't found time to review your most current changes.

RyanGibb commented 2 years ago

No problem, and no rush. I'll wait to hear back from you.

RyanGibb commented 2 years ago

Hi @hannesm, apologies to bother you, but I thought I'd give this a gentle bump. It would be great if you would have time to review the most recent changes. However, if you don't have the time, I completely understand.

Thanks for all your help this far.

hannesm commented 2 years ago

Thanks @RyanGibb, I'll take a look later this week. I am just returning from vacation. Thanks for your additional work on it :)

RyanGibb commented 2 years ago

Thanks for the comments @hannesm! Apologies for the erroneous files. The .envrc and shell.nix were for making opam work with nix nicely in my local environment as outlined here.

hannesm commented 2 years ago

thank you for your contribution. I added minor comments and suggestions above. IMHO this is mostly ready for being merged.

RyanGibb commented 2 years ago

Thank you for the suggestions, and for all the help getting this here!

hannesm commented 2 years ago

sorry, in #314 I decided it doesn't make much sense to define int_compare / int32_compare in dns.ml, and got rid of it.

hannesm commented 2 years ago

CI is green, I squash-merged.