update benchmarks for Document encoding

Nuhvi commented 1 year ago

I suspected JSON wasn't a good choice, as it requires an extra dependency for clients and adds a lot of overhead (brackets).

Evidently, csv + brotli is hard to beat as I suspected.
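To illustrate the bracket and quote overhead, here is a minimal sketch that compares the raw (uncompressed) size of the same records as compact JSON and as CSV. The sample records are made up for illustration and are not the benchmark corpus; brotli is left out so the snippet only needs the standard library.

```python
import json

# Hypothetical sample records (name, type, data); not the corpus used in this PR.
records = [
    ("@", "A", "104.16.132.229"),
    ("www", "CNAME", "example.com"),
    ("_text", "TXT", "hello world"),
]

# Compact JSON: still pays for quotes around every field and brackets around every row.
as_json = json.dumps(records, separators=(",", ":")).encode()

# CSV: one row per line, fields separated by commas, nothing else.
as_csv = "\n".join(",".join(row) for row in records).encode()

print("json bytes:", len(as_json))
print("csv bytes: ", len(as_csv))
```

The gap after compression will differ from the raw gap, so this only shows where the overhead comes from, not the benchmark result.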

We are very early, so I will probably break backward compatibility (and not support the brotli + json encoded documents).

I am also considering if I should remove the version prefix byte, to really discourage any changes to this format in the future!

More versions == more complexity on decoder side.

If you are watching this, and think you have a better corpus of dns records to benchmark against, please let me know.

Also, as you can see from the code, I am disregarding all fields except name, type, and data, which is just whatever is left.

The logic here is that TTL doesn't make sense for these individual records, as they are all updated together, and Pkarr really targets very long TTLs to reduce the frequency of contacting the DHT. It seems that people generally use 60 minutes as the sensible default TTL, and that is generally what Pkarr targets as well.

If you have an argument against omitting TTLs, let me know, maybe in Discussions. Regardless, since this is meant to be a csv, we can add TTLs later at the end of each row!

As for the Class field, I think it is poorly defined and rarely used, so it is also ignored.
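A rough sketch of what a row with only name, type, and data could look like, assuming "whatever is left" means the rest of the row after the first two commas. The function names are hypothetical and this is not the codec from the PR.

```python
def encode_row(name: str, rtype: str, data: str) -> str:
    # TTL and Class are omitted on purpose; only three fields are written.
    return f"{name},{rtype},{data}"


def decode_row(row: str) -> tuple[str, str, str]:
    # Split on the first two commas only, so data keeps any commas it contains.
    name, rtype, data = row.split(",", 2)
    return name, rtype, data


row = encode_row("_text", "TXT", "a=1,b=2")
print(row)
print(decode_row(row))
```

Under this assumption, a TTL added later would have to go in a position that does not collide with commas inside data, which is something the real format would need to settle.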

Nuhvi commented 1 year ago

Hmm, just realized that I can do more optimization with custom encoding, for example encoding ipv4 and ipv6 in a more compact manner. Maybe that makes it more worthwhile!
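For example, an IPv4 address takes 4 bytes in binary and an IPv6 address takes 16, versus up to 15 and 39 characters as text. A minimal sketch using Python's standard `ipaddress` module; the helper names are hypothetical and the actual encoding in the benchmark may differ.

```python
import ipaddress


def pack_ip(text: str) -> bytes:
    # 4 bytes for IPv4, 16 bytes for IPv6.
    return ipaddress.ip_address(text).packed


def unpack_ip(raw: bytes) -> str:
    # ip_address() accepts packed bytes of length 4 or 16.
    return str(ipaddress.ip_address(raw))


for text in ("104.16.132.229", "2606:4700::6810:84e5"):
    packed = pack_ip(text)
    print(text, "text bytes:", len(text), "packed bytes:", len(packed))
```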

Nuhvi commented 1 year ago

Updated the benchmark, and encoding A and AAAA records makes a difference, especially on small documents.

Will be working on updating the document codec, and intend to keep records limited to A, AAAA, CNAME and TXT.

The rationale is that A and AAAA serve any server, CNAME serves mapping to domains, and TXT covers the rest, like in ENS.

pubky / pkarr

update benchmarks for Document encoding #8