oklog / ulid

Universally Unique Lexicographically Sortable Identifier (ULID) in Go
Apache License 2.0
4.44k stars 161 forks source link

Duplicate ID's #70

Closed paulm17 closed 3 years ago

paulm17 commented 3 years ago

I'm adding to the DB in very quick succession around 1000 rows.

I decided to output the ULIDs so I could catch if there were dupes:

Here are the relevant parts:

01F9VG0NT0ACNSHJM2J4G6WH3E
01F9VG0NT0ACNSHJM2J8WRTTJE
01F9VG0NT0ACNSHJM2JAC9W6YT
01F9VG0NT0ACNSHJM2JDWKQS65
01F9VG0NT0ACNSHJM2J8KV2SB2
01F9VG0NT0ACNSHJM2J4G6WH3E
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM2J4G6WH3E' for key 'PRIMARY'
01F9VG0NT0ACNSHJM2N53XTXJH
01F9VG0NT0ACNSHJM2N8BWJXPF
01F9VG0NT0ACNSHJM2N8QXRBDM
01F9VG0NT0ACNSHJM2N53XTXJH
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM2N53XTXJH' for key 'PRIMARY'
01F9VG0NT0ACNSHJM2SMR13HZ2
01F9VG0NT0ACNSHJM2SQVZW987
01F9VG0NT0ACNSHJM2SVQJ87QQ
01F9VG0NT0ACNSHJM2SMR13HZ2
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM2SMR13HZ2' for key 'PRIMARY'
01F9VG0NT0ACNSHJM30062WA6F
01F9VG0NT0ACNSHJM30297Z6DR
01F9VG0NT0ACNSHJM30062WA6F
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM30062WA6F' for key 'PRIMARY'
01F9VG0NT0ACNSHJM39T0D5B1B
01F9VG0NT0ACNSHJM39ZVD159R
01F9VG0NT0ACNSHJM3A59EMVZC
01F9VG0NT0ACNSHJM39XSN0A84
01F9VG0NT0ACNSHJM3A34FHCAR
01F9VG0NT0ACNSHJM3A5CKZKJT
01F9VG0NT0ACNSHJM39T0D5B1B
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM39T0D5B1B' for key 'PRIMARY'
01F9VG0NT0ACNSHJM3APRCBXA7
01F9VG0NT0ACNSHJM3ATGM0XHQ
01F9VG0NT0ACNSHJM3APRCBXA7
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM3APRCBXA7' for key 'PRIMARY'
01F9VG0NT0ACNSHJM3CTT1B09C
01F9VG0NT0ACNSHJM3CZHHK5QT
01F9VG0NT0ACNSHJM3D12FM3C4
01F9VG0NT0ACNSHJM3CXQGM6GM
01F9VG0NT0ACNSHJM3D3H1M9EE
01F9VG0NT0ACNSHJM3D4REME36
01F9VG0NT0ACNSHJM3CTT1B09C
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM3CTT1B09C' for key 'PRIMARY'
01F9VG0NT0ACNSHJM3DPF00J7M
01F9VG0NT0ACNSHJM3DG1C0KAC
01F9VG0NT0ACNSHJM3DQR5G89S
01F9VG0NT0ACNSHJM3DV3SHQ6M
01F9VG0NT0ACNSHJM3DY6049MW
01F9VG0NT0ACNSHJM3DPF00J7M
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM3DPF00J7M' for key 'PRIMARY'
01F9VG0NT0ACNSHJM3VYDT1FKJ
01F9VG0NT0ACNSHJM3VYDT1FKJ
Error 1062: Duplicate entry '01F9VG0NT0ACNSHJM3VYDT1FKJ' for key 'PRIMARY'

My code is broken into three parts, as it's a complex long running process:

stored in a struct:

now := time.Now().UTC()
entropy := ulid.Monotonic(rand.New(rand.NewSource(now.UnixNano())), 0)

I generate the ID (m is a struct):

job.ID = common.GenUlid(m.now, m.entropy)

Func:

func GenUlid(now time.Time, entropy io.Reader) string {
    return ulid.MustNew(ulid.Timestamp(now), entropy).String()
}

Thanks for your time.

powerman commented 3 years ago

Why don't you use crypto/rand?

paulm17 commented 3 years ago

Didn't know I had to. Would it solve the issue?

Can you give an example of usage. Would be happy to test and report back.

powerman commented 3 years ago

No, you don't have to. So far I'm unable to reproduce this issue, running this in a shell loop:

package main

import (
    "fmt"
    "log"
    "math/rand"
    "time"

    "github.com/oklog/ulid"
)

func main() {
    t := time.Now()
    // t := time.Unix(1000000, 0)
    entropy := ulid.Monotonic(rand.New(rand.NewSource(t.UnixNano())), 0)
    // entropy := rand.Reader
    // entropy := ulid.Monotonic(rand.Reader, 0)
    ids := make([]ulid.ULID, 10000)
    for i := range ids {
        ids[i] = ulid.MustNew(ulid.Timestamp(t), entropy)
    }
    seen := make(map[ulid.ULID]bool)
    for _, id := range ids {
        fmt.Println(id)
        if seen[id] {
            log.Fatal("dup")
        }
        seen[id] = true
    }
}
paulm17 commented 3 years ago

Hmm, not sure what to say. I've tested satori.go/uuid and rs/xid. Both had no duplications.

I'm going to close this for now.

powerman commented 3 years ago

@paulm17 Just in case you didn't notice this - I'm not the repo owner, so don't take my "I can't reproduce this" as owner's "I wont fix this". So, I'm not sure it makes sense to close this if you are seeing this weird behaviour. But short code example which reproduce the issue at least on your system would be really nice to have.

peterbourgon commented 3 years ago

What is m.now? Hopefully not the same for each ulid.MustNew?

peterbourgon commented 3 years ago

This is eerily similar to #46, where I am fairly confident the issue was the DB or the DB driver truncating the ULID.

To check, compare the generated ULID in your code with the ULID actually stored in the DB.

peterbourgon commented 3 years ago

To be explicit, each time you generate a ULID, you should use the same entropy source, and the current (i.e. a new) timestamp. The entropy source should be created once, at the start of your process, and provided to whatever component(s) need to generate ULIDs; you shouldn't create a new entropy source per job, or whatever. And the timestamp should be captured each time you call ulid.New, rather than captured once per job and re-used.

now := time.Now().UTC()
entropy := ulid.Monotonic(rand.New(rand.NewSource(now.UnixNano())), 0)

It is probably correct to store entropy as a struct field and re-use it. Hopefully you just do this once in your entire program? You should not store or re-use the now timestamp.

job.ID = common.GenUlid(m.now, m.entropy)

Oops — passing m.now is almost certainly a bug.

job.ID = common.GenUlid(time.Now(), m.entropy)
paulm17 commented 3 years ago

Even when I'm using a new instance of time.Now() for the GenUlid function, I still get duplications.

When using xid for this particular use case and I get no dupes.

| c3i1og23q5669c1odvgg       | 2021-07-06 09:02:56.919 |
| c3i1og23q5669c1odvh0       | 2021-07-06 09:02:56.919 |
| c3i1og23q5669c1odvhg       | 2021-07-06 09:02:56.919 |

Feel free to close.

paulm17 commented 3 years ago

@giautm

c3i1og23q5669c1odvgg is xid: https://github.com/rs/xid

giautm commented 3 years ago

Can you share your code that causes duplicate? I also don't find any update for 10k generate.

https://play.golang.org/p/GZotlS37bJf

paulm17 commented 3 years ago

@giautm It's exactly the same as your play example.

peterbourgon commented 3 years ago

Don't insert them to the database, put them in a map in your code. Are the duplicates in the map? I bet there aren't.

paulm17 commented 3 years ago

In the post above. I'm printing out the ID way before the actual insert.

Anyway, closing due to moving to another lib.

peterbourgon commented 3 years ago

I am confident the problem is not in this package.

peterbourgon commented 3 years ago

And should be using ulid/v2, for the record.

rokiyama commented 3 years ago

I had the same problem when using goroutines, and using crypto/rand solved it.

example:

package main

import (
    crand "crypto/rand"
    "fmt"
    "math/rand"
    "sync"
    "testing"
    "time"

    "github.com/oklog/ulid/v2"
)

func Test1(_ *testing.T) {
    var wg sync.WaitGroup
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            t := time.Now()
            entropy := ulid.Monotonic(rand.New(rand.NewSource(t.UnixNano())), 0)
            v, err := ulid.New(ulid.Timestamp(t), entropy)
            if err != nil {
                fmt.Printf("err: %s\n", err)
            } else {
                fmt.Printf("%s\n", v.String())
            }
            wg.Done()
        }()
    }
    wg.Wait()
}

func Test2(_ *testing.T) {
    var wg sync.WaitGroup
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            t := time.Now()
            v, err := ulid.New(ulid.Timestamp(t), crand.Reader)
            if err != nil {
                fmt.Printf("err: %s\n", err)
            } else {
                fmt.Printf("%s\n", v.String())
            }
            wg.Done()
        }()
    }
    wg.Wait()
}

Running Test1 generates duplicate IDs, I found this by running sort -u, Test2 does not.

$ go test -v -run Test1 > out
$ wc -l out; cat out | sort -u | wc -l
    1004 out
     849

$ go test -v -run Test2 > out
$ wc -l out; cat out | sort -u | wc -l
    1004 out
    1004
peterbourgon commented 3 years ago

This is probably less to do with crypto/rand itself, more to do with the fact that generating a cryptographically secure random number takes significantly longer, and therefore Test2 is generating ULIDs that span more milliseconds than Test1. (ULIDs only have millisecond time precision.)

peterbourgon commented 3 years ago

e.g.

package main

import (
    "encoding/binary"
    "fmt"
    "sync"
    "sync/atomic"
    "testing"
    "time"

    "github.com/oklog/ulid/v2"
)

type notReallyEntropy uint64

func (u *notReallyEntropy) Read(p []byte) (int, error) {
    v := atomic.AddUint64((*uint64)(u), 1)
    binary.BigEndian.PutUint64(p, v)
    return len(p), nil
}

func Test3(_ *testing.T) {
    var entropy notReallyEntropy
    var wg sync.WaitGroup
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            t := time.Now()
            v, err := ulid.New(ulid.Timestamp(t), &entropy)
            if err != nil {
                fmt.Printf("err: %s\n", err)
            } else {
                fmt.Printf("%s\n", v.String())
            }
            wg.Done()
        }()
    }
    wg.Wait()
}
$ go test -v -run Test3 > out
$ wc -l out ; cat out | sort -u | wc -l
    1004 out
    1004
powerman commented 3 years ago

This is probably less to do with crypto/rand itself, more to do with the fact that generating a cryptographically secure random number takes significantly longer, and therefore Test2 is generating ULIDs that span more milliseconds than Test1. (ULIDs only have millisecond time precision.)

I suppose math/rand just can generate duplicate random values with much higher probability than crypto/rand. E.g. if we change your Test3 to use binary.BigEndian.PutUint64(p, v%256) the about half of generated ULIDs will became duplicates on my system.

If we had a reproducible example we can try to pre-generate random values from both libraries and thus eliminate difference in generation time.

peterbourgon commented 3 years ago

I mean, ULID generation is actually perfectly deterministic. It consists of two parts: 48 bits of time, and 80 bits of "entropy" which is just 10 bytes that you claim are random. If you provide the same time and entropy components you get the same ULID.

The package gives you some helpers to generate these parts.

The ulid.Timestamp helper lets you pass a time.Time order to get a uint64 that represents that time in Unix milliseconds. If you run ULID generation in a loop, and pass time.Now in each loop iteration, and the whole thing runs entirely in the same millisecond, then the time part of every ULID will be the same. If it splits over 2 different milliseconds, then you will have some ULIDs with one time component, and some with another. If each loop iteration is expensive — say because it has to generate cryptographically secure random data each time, too — then it's likely to take longer, and your final set of ULIDs will have many unique time components.

The package models entropy as an io.Reader for convenience but every time you make a ULID it just reads 10 bytes and uses them. If all you care about is not getting duplicates, then you can just write an io.Reader that returns [0 0 0 0 0 0 0 0 0 1] on the first read, [0 0 0 0 0 0 0 0 0 2] on the second, all the way up to [0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff], and then looping back. This will maximize the uniqueness. But it's of course totally predictable. The package's Monotonic helper does basically this, but adds in a configurable amount of entropy to avoid the predictability.

So I guess I'm saying there shouldn't really be any surprises here, and — as far as I know — there's no bugs in the package or anything. It's just a question of understanding what's going on and using the tools we give you correctly.

powerman commented 3 years ago

It's just a question of understanding what's going on and using the tools we give you correctly.

I totally agree, but this issue probably means there are some (unclear) case when it's easier to misuse ULID than satori.go/uuid or rs/xid. It would be nice if @paulm17 find out what exactly was done wrong, as it may help us improve ULID docs or helper methods.

rokiyama commented 3 years ago

This is probably less to do with crypto/rand itself, more to do with the fact that generating a cryptographically secure random number takes significantly longer, and therefore Test2 is generating ULIDs that span more milliseconds than Test1. (ULIDs only have millisecond time precision.)

I tried benchmarking the tests, and got the results:

$ go test -bench .
goos: darwin
goarch: amd64
pkg: github.com/rokiyama/example-ulid
cpu: Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz
Benchmark1-12                             202995              6276 ns/op
Benchmark2-12                            2417226               549.5 ns/op
Benchmark3-12                            2826193               441.7 ns/op
PASS
ok      github.com/rokiyama/example-ulid        10.751s

(I renamed Test1 to Benchmark1 and made some changes, the test code is: https://github.com/rokiyama/example-ulid/blob/main/main_test.go )

Since crypto/rand is much faster than math/rand, it seems unlikely that ULIDs are being generated in milliseconds. (I don't know why crypto/rand is so fast. My platform is macos) Anyway, crypto/rand generates unique IDs as expected, and your notReallyEntropy works as expected and even faster.

So, now I understand:

peterbourgon commented 3 years ago

Since crypto/rand is much faster than math/rand

Your benchmark is unsound, and as a consequence this conclusion is backwards.

I think we've captured everything we're gonna get here.