Make to/from IPv4/IPv6 functions stricter.

vdukhovni commented 4 years ago

These are basically structure getters/setters and should be strict in their input and produce fully-evaluated output.

Also new {to,from}{IPv4,IPv6}w functions provide efficient inlined non-copying getters/setters for the (current) internal representations of the addresses, speeding up bulk processing. They're similar to the {to,from}HostAddress{,6} functions, but don't do byte order conversions and are strict.

Less terse documentation and a fix for a cut/paste error.

NOTE: The "to" functions now truncate each element to the expected number of bits. Previously, they'd just "+" un-truncated values leading to more surprising results when the input lists have elements with excess bits set. I don't expect this to be a compatibility issue, but given malformed input the behaviour is now different.

kazu-yamamoto commented 4 years ago

I would like to use the Strict and StrictData pragmas instead of bang patterns.

vdukhovni commented 4 years ago

I would like to use the Strict and StrictData pragmas instead of bang patterns.

I am less familiar with their uses, especially in the context of tuples. But ultimately I don't really mind how this is done, so long as we can ensure that IP6 (a,b,c,d) is strict in a, b, c and d. As for IPv4, being a newtype directly around the underlying Word32 it is already strict, since it is just a Word32 at runtime.

vdukhovni commented 4 years ago

I wrote a fast Data.ByteString.Builder for IPv6 that even outperforms (by a factor of 10) the C-library inet_ntop() on FreeBSD and Linux, despite doing memory allocation, while the C code runs with fixed buffers, and it benefits from strict(er) IPv6 constructors. It always direct access to the Word32 form of IPv4 without conversion to network byte order.

So the new stricter accessors are to help with faster serialization, producing fewer intermediate lazy thunks and less GC work.

It is ~16 times faster than using show (or 11 times faster than a more optimized implementation of show).

kazu-yamamoto commented 4 years ago

3985e6066e69d98f21d33e2427cd68c9e62cbc44 should bring strictness. Please check it out.

vdukhovni commented 4 years ago

3985e60 should bring strictness. Please check it out.

I am not sure that the entire library should be strict, for example, the ShowS instances probably need to be lazy. I also don't think that StrictData covers tuple elements, which is the main thing I wanted to see strict. I'll take a closer look, but I expect this may not turn out to be either sufficiently fine-grained, or sufficiently comprehensive where needed.

vdukhovni commented 4 years ago

3985e60 should bring strictness. Please check it out.

I am not sure that the entire library should be strict, for example, the ShowS instances probably need to be lazy. I also don't think that StrictData covers tuple elements, which is the main thing I wanted to see strict. I'll take a closer look, but I expect this may not turn out to be either sufficiently fine-grained, or sufficiently comprehensive where needed.

OK, looking more closely, it does look like much of the module could be strict, almost all the structures and functions work with small objects with no obvious opportunities for lazy evaluation. But it would take a bit of care to make sure that strictness will never be slower if any of the code.

That said, for IPv6, strictness in the tuple does not ensure strictness in its elements, so we'd still some BangPatterns, because the type is not a "data" with four fields that could be made strict, but is rather a simple type alias for a four-tuple, and StrictData does not as far as I can tell make 4-tuples strict.

So I'd probably still want to add some BangPatterns...

kazu-yamamoto commented 4 years ago

Now I think that most part of this PR should be merged. But I would like to discuss something before merging.

vdukhovni commented 4 years ago

All green now from 7.8 through 8.8.2. All that remains is a decision on whether the entire library is Strict by default (your commit), or just my selective tweaks. Perhaps you're right that Strict by default is best, I am less sure, but don't see any obvious objections. Your call.

vdukhovni commented 4 years ago

FWIW, as you know, Strict isn't always better, my IPv6 Builder code does 100million IPv6 -> Builder conversions (streaming them all to stdout) in 6.144 seconds with laziness on by default and selected strictness annotations, ... but takes 7.462 seconds to do the same thing when strict by default.

But that is perhaps because ByteString Builders rely on laziness for much of their efficiency, the functions in "iproute" other than the Show instances, may work as well or slightly better strict

I don't have a good feel for which is likely better as yet, just know that StrictData is typically a win, and so wanted to make the IPv6 4-tuple more strict, while at the same using somewhat more direct expressions for the accessors that generate fewer intermediate lists, so less GC and also less CPU.

kazu-yamamoto commented 4 years ago

But that is perhaps because ByteString Builders rely on laziness for much of their efficiency, the functions in "iproute" other than the Show instances, may work as well or slightly better strict

If we need lazyness in Strict or StrictData, we can put ~ (like !).

kazu-yamamoto commented 4 years ago

@vdukhovni Do you want to keep the four tupple or don't mind if I will change it to data IPv6Addr with StrictData?

kazu-yamamoto commented 4 years ago

I wrote some tips for Strict in Japanese. I hope that google translation will works well for you: https://kazu-yamamoto.hatenablog.jp/entry/20151117/1447726679

Also, my position for lazyness is here: https://kazu-yamamoto.hatenablog.jp/entry/2019/02/15/115630

vdukhovni commented 4 years ago

@vdukhovni Do you want to keep the four tupple or don't mind if I will change it to data IPv6Addr with StrictData?

Keep the 4-tuple.

vdukhovni commented 4 years ago

I wrote some tips for Strict in Japanese. I hope that google translation will works well for you: https://kazu-yamamoto.hatenablog.jp/entry/20151117/1447726679

Also, my position for lazyness is here: https://kazu-yamamoto.hatenablog.jp/entry/2019/02/15/115630

I've come to increasingly appreciate laziness as time goes by. It can in fact yield significant performance advantages, but yes it takes some experience to understand when to disable it.

I do like StrictData, that's a good default. As for Strict I am more hesitant to use it...

Anyway, my part of the PR is done, and it is now green from 7.8 through 8.8.2! So the ball is in your court...

vdukhovni commented 4 years ago

What is the next step with this PR? Did I neglect to do something?

kazu-yamamoto commented 4 years ago

No. I'm just busy for preparing 17th QUIC interop. I will come back this issue in the next week.

kazu-yamamoto commented 4 years ago

@vdukhovni Sorry for the delay. I have merged this PR. Do you want to get a new version released?

vdukhovni commented 4 years ago

@vdukhovni Sorry for the delay. I have merged this PR. Do you want to get a new version released?

Thanks! As for a new version, I'm not in a hurry. You could wait for any other worthwhile changes to show up, or just release after a while at a time that's convenient for you.

FWIW, I have a performance improved Show instance for IPv6 that runs 1.55 times faster, but I don't think that justifies the increased code complexity. The real payoff with that algorithm is that it supports a ByteString Builder for IPv6 and IPv4, and the ByteString builder is 13.5 times faster than using Show.

If you wanted the Builder contributed to this module, I could do that...

kazu-yamamoto commented 4 years ago

iproute does not depends on bytestring. However, since bytestring is installed everywhere, it's OK for me to add bytestring as dependency. Let's release a new version after we merge your builder.

vdukhovni commented 4 years ago

So, I take it that you are interested in providing a Builder for IPv6/IPv4 addresses? If so, I'll submit a PR.

Just a "warning", in order to make the code fast, I'm using some low-level GHC-specific primitives, in particular, some of the computations are using primitive unlifted Word32# and Int# values, and associated primitive bit operators. The primops are pretty much confined to just the code that deals with computing the best gap...

It presently outperforms the C-library inet_ntoa() on Linux and FreeBSD by a factor of 10.

kazu-yamamoto commented 4 years ago

So, I take it that you are interested in providing a Builder for IPv6/IPv4 addresses?

Yup!

kazu-yamamoto / iproute

Make to/from IPv4/IPv6 functions stricter. #45