bk / Data-ULID

Perl implementation of ULID (Universally Unique Lexicographically Sortable Identifier)
7 stars 6 forks source link

Open to a re-implementation? #2

Closed melo closed 2 years ago

melo commented 6 years ago

Hi,

I re-wrote parts of the core of Data::ULID to suit my needs. I have two options:

This benchmark:

use Benchmark 'cmpthese';
use lib '.';
use ULID       ();
use Data::ULID ();

cmpthese(
  -5,
  { 'ULID text'       => sub { ULID::ulid() },
    'Data::ULID text' => sub { Data::ULID::ulid() },
  }
);

cmpthese(
  -5,
  { 'ULID binary'       => sub { ULID::ulid_bin() },
    'Data::ULID binary' => sub { Data::ULID::binary_ulid() },
  }
);

Gives this results:

                    Rate Data::ULID text       ULID text
Data::ULID text   9456/s              --            -93%
ULID text       145084/s           1434%              --
                      Rate Data::ULID binary       ULID binary
Data::ULID binary   6216/s                --              -99%
ULID binary       805650/s            12860%                --

Do you have any preference? I can do either of the options above.

Thank you,

bk commented 6 years ago

Hello Pedro,

This is a very impressive speedup indeed (and I'm sure you have made other improvements aside from speed).

Of course, I haven't seen your code yet, but given that ULID is not widely used by Perl programmers I think a separate module would be counter-productive. Out of the two options you present, I would therefore prefer the first, i.e. a pull request from you. However, a third possibility is that I transfer the module on CPAN to you. I do not have an especially strong sense of ownership towards it, so I would not mind that at all.

Whatever you decide, I started by adding you as collaborator, so that you can push directly to the repository here on Github if you wish.

Regards, Baldur

melo commented 6 years ago

Hi Baldur,

Sorry for not posting the code earlier.

This the module I used for the benchmarks: https://gist.github.com/melo/fca85773979822f1107dab15efbad70e

I also included the benchmark script.

The speedups are simple to explain: I dropped the use of BigInt's, and the encode/decode is custom-made and unrolled for maximum efficiency. The only way to make this faster would be to make an XS version of the Crockford Base32 encode/decode.

As you can see I focused on the ULID generation, both text and binary, and not on the helper methods you also have. I expect those to be kept simple.

I'll start working on the PR over this week, and hopefully, have something to push in the weekend. As I said, the ULID.pm I'm using is good enough to unblock our internal developments.

I also accepted your invitation to this repo, so I'll merge it when you approve the code.

Thank you,

bk commented 6 years ago

Hi again Pedro,

It's some time since I wrote this, but if I remember correctly I used BigInts so as to be able to support 32-bit systems. Is this something you have considered?

melo commented 6 years ago

Hi,

yes, I saw that in your code. The code I wrote is mostly shifts and other logical operations, and I would make it 64 and 32-bit compliant. Right now it is only 64-bit compliant because that is what I needed.

My biggest problem is how to test the code for 32 bit, I have no idea on how to do that. I don't even have access to a 32bit CPU anymore I think...

Thanks,

bk commented 6 years ago

Hi Pedro, At the time I actually had a 32-bit virtual machine on Digital Ocean, but another simple solution is to use VirtualBox, which supports 32-bit guest systems on a 64-bit host.

bk commented 2 years ago

Closing as obsolete.