The func of New() cost more than 9 minute

Anteoy commented 5 years ago

My online app is use this lib by uuid.New(), About 3 million concurrent I found per call of uuid.New() cost more than 9 minutes. I found many mutex in it's call chain. Is there any problems?

Anteoy commented 5 years ago

pprof this is the pic of pprof

pborman commented 5 years ago

What operating system are you on?

In any event, the mutex is being used by the underlying crypto/rand package. Some sort of io.Reader that returns random data that can be called concurrently is required and not use a sync.Mutex on every read would be required.

I did think of a way to replace most mutex calls with an atomic.AddUint64 call. I just uploaded a version in https://github.com/pborman/cachedrander. Using this package I was able to speed up calls to uuid.New by about a factor of 6.7. Depending on what you are using these UUID's for I can imagine a faster way, but it would not be producing truly random values. Also, I just quickly threw this package together so perhaps there is a better solution.

There are two options for how to use the cachedrander package:

// One time initialization.
nr, err := cachedrander.NewUUIDReader(1000)
if err != nil {
    panic("rand.Reader failed")
}

// To get a new UUID.
id := uuid.Must(uuid.NewRandomFromReader(nr))

You can also use SetRand to make all requests use the new reader:

// One time initialization.
nr, err := cachedrander.NewUUIDReader(1000)
if err != nil {
    panic("rand.Reader failed")
}
uuid.SetRand(nr)

// Now call uuid.New as usual:
id := uuid.New()

The 1000 is how many uuid's worth of random data it should generate as needed. Setting it to 1000 means every 1000 uuid's it will block uuid creation until it is able to read up random data for 1000 more.

There is a theoretical race condition here, but I don't think you will hit it. If a call to uuid.New gets preempted at the right time and before it resumes there are somewhere between 1000 and 2000 additional calls to uuid.New prior to the first one resuming then it is possible for the same random data to be used for two different UUIDs. Those intermediate calls would also need to read 1000 * 16 bytes of random data twice.

Anteoy commented 5 years ago

My system is: Linux ll-025048236-FWWG.AppPZFW.prod.bj1 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 17:27:01 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux thanks for your reply. I use the timestamp and rand string to get my uuid. I will look at your cachedrander package later.

Anteoy commented 5 years ago

In theory, Is it better to use a timestamp to add a random string than the performance of this library?

pborman commented 5 years ago

You can just use 16 random bytes as long as you have a good cryptographic random number generator, this is what the UUID package does. The UUID package also has to make a few tweaks to set the version and variant. Below is from the packages godoc which in turn came from the UUID entry in Wikipedia:

Randomly generated UUIDs have 122 random bits. One's annual risk of being hit by a meteorite is estimated to be one chance in 17 billion, that means the probability is about 0.00000000006 (6 × 10−11), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate.

However, the main cost is fetching the random number. If you just want a unique value each time you call, then why not just use a uint64 as your key with the following, it is like 150x to 200x faster:

var index uint64

func NextIndex() uint64 {
    return atomic.AddUint64(&index, 1)
}

If you really need a unique UUID like object, but don't care about randomness, you could do:

func NextUUID() (id uuid.UUID) {
        *((*uint64)(unsafe.Pointer(&id))) = atomic.AddUint64(&index, 1)
        // The next two lines are optional.  They just make it look like a version 4 UUID.
        id[6] = (id[6] & 0x0f) | 0x40 // Version 4
        id[8] = (id[8] & 0x3f) | 0x80 // Variant is 10
        return id
}

If you do need a random component and since reading the random value is a majority of the time, you could combine these two, use the atomic.AddUint64 to generate the first 8 bytes and then read in 4 or 8 bytes from your pool of random data (it is almost a linear speedup, 8 bytes is nearly twice as fast as 16 and 4 bytes is nearly twice as fast as 8 bytes). The AddUint64 gives you 2^64 unique values in a row while the random part gives some randomness to it.

You could also fetch the timestamp counter from the CPU but I think there is a remote chance of getting the same value from two different cores. You would probably need to combine this with some unique value as well.

Anteoy commented 5 years ago

Thx

pborman commented 5 years ago

I finally wrote a test of using the TSC using github.com/dterei/gotsc. On a 4 core machine (MacBook Pro) I can typically generate a duplicate TSC within a few million calls running 4 concurrent goroutines that repeatedly sample the time stamp counter.

google / uuid

The func of New() cost more than 9 minute #47