uuidjs / uuid

Generate RFC-compliant UUIDs in JavaScript
MIT License
14.66k stars 903 forks source link

v7: passing `options.msecs` results in zero-value timestamp #764

Closed tai-kun closed 4 months ago

tai-kun commented 5 months ago

In uuid.v7, when the msecs option is specified without the seq option, the timestamp becomes nil despite being a value greater than 0. Is this intentional behavior? From what I can see in the implementation, it appears that uuid.v7 tries to manage the monotonic sequence counter unless it is specified as an option.

Example:

import { v7 as uuidv7 } from "uuid";

const msecs = new Date("2024-06-12").getTime();

console.log(msecs);
// Print:
// 1718150400000

console.log(uuidv7({ msecs }));
// Print:
// 00000000-0000-7dd6-bd36-038a8af1a5f1
// ^^^^^^^^ ^^^^ ^                      <- In this script, the is fixed.
//                ^^^ ^^^^ ^^^^^^^^^^^^ <- Changes with each execution.

console.log(uuidv7({ msecs, seq: 0 }));
// Print:
// 019009be-8800-7000-8000-061f2f366242
// ^^^^^^^^ ^^^^ ^^^^ ^^^^ ^            <- In this script, the is fixed.
//                          ^^^^^^^^^^^ <- Changes with each execution.

console.log(parseInt("019009be" + "8800", 16));
// Print:
// 1718150400000 <- This is the same as the value of msecs.

Addition 2 days after issue creation:

Since the bug label was added, I am updating the information based on the ISSUE_TEMPLATE.

Description of the problem

Please refer to the above.

Recipe for reproducing

Please refer to the "Example:" above.

Additional information

N/A

Environment

  System:
    OS: Linux 6.5 Ubuntu 22.04.4 LTS 22.04.4 LTS (Jammy Jellyfish)
    CPU: (24) x64 12th Gen Intel(R) Core(TM) i9-12900K
    Memory: 25.67 GB / 31.11 GB
    Container: Yes
    Shell: 5.1.16 - /bin/bash
  Binaries:
    Node: 22.2.0 - ~/.volta/tools/image/node/22.2.0/bin/node
    npm: 10.7.0 - ~/.volta/tools/image/node/22.2.0/bin/npm
  Browsers:
    Chromium: 125.0.6422.141
  npmPackages:
    @tsconfig/strictest: ^2.0.5 => 2.0.5 
    @types/node: ^20.14.2 => 20.14.2 
    @types/unzipper: ^0.10.9 => 0.10.9 
    dprint: ^0.46.2 => 0.46.2 
    esbuild: ^0.21.4 => 0.21.4 
    is-plain-obj: ^4.1.0 => 4.1.0 
    tar: ^7.2.0 => 7.2.0 
    type-fest: ^4.20.0 => 4.20.0 
    typescript: ^5.4.5 => 5.4.5 
    unzipper: ^0.12.1 => 0.12.1 
    uuid: ^10.0.0 => 10.0.0
broofa commented 5 months ago

@pmccarren

broofa commented 5 months ago

@pmccarren I've confirmed this is an issue. I think this occurs because the timestamp field is set from the internal _msecs value, which is initialized to zero. And passing options.msecs triggers this code path which doesn't update _msecs to the user-provided value.

Can you take a look?

robinpokorny commented 5 months ago

TL;DR: Providing any custom msecs results in a wrong UUID.

Looking at the code, this is much deeper issue with any user-provided msecs. It is essentially treated as a replacement of the system clock and so its value is compared to previous runs and it is stored for the future runs.

So if you run it first without custom msecs, it will remember the current time and reject any later msecs that is sufficently in the past or in the future. It will also not reset the random part properly.

See this sequence:

> v7({msecs: Date.UTC(2022)})
'00000000-0000-755b-b297-8ed274801af9'
// Wrong
> v7()
'019016b9-1098-755b-b297-9502eecb44e3'
// Correct time, seq_high unchaged
> v7({msecs: Date.UTC(2022)})
'017e12ef-9c00-755b-b297-94f2d59d9a23'
// Correct time, seq_high unchaged
> v7({msecs: Date.UTC(2028)})
'017e12ef-9c00-755b-b297-98a0301b8a02'
// Wrong time! Uses previous timestamp. seq_high unchaged
> v7()
'019016b9-4fe0-755b-b297-a148d28e3621'
// Correct time, seq_high unchaged

Problems

  1. Until the first call without a custom timestamp, no custom msecs will work at all.
  2. After a call with no timestamp, it's possible provide custom msecs in the past. However, it's impossible to move to a future from the last run. It's only possible to reset to now with a no custom timestamp call.
  3. The seq_high (755b-b297 in the calls above) is not changed among calls.

I believe this has a high severity and we should take an immediate action, @broofa and @pmccarren

I'm writing a test for this in my fork, see here: https://github.com/robinpokorny/uuid/commit/9eb076a0a36bd86958f220328aaf2bc54bf9f1c4

broofa commented 5 months ago

@robinpokorny All good points, and thanks for putting that test together.


This does raise a couple questions that I don't have good answers to at the moment:

Should IDs generated with user-supplied options read / mutate internal state at all?

It's been a good while since I gave any thought to this aspect of the uuid API. At the moment, I'm of the mind that allowing calls with user-specified options to read or write internal state is probably a bad idea in general. Certainly modifying internal state based on user-supplied options risks destabilizing code that may not be expecting that. And relying on internal state means we can't insure idempotent behavior, which makes writing test code difficult.

Where user-provided options overlap with internal state, do they represent the state prior to UUID generation or the state reflected in the generated UUID? E.g. If a user passes options.seq: 0, is that the initial or final value of seq after the uuid has been generated? Basically uuid generation (at least where v1 and v7) is concerned has two parts: updating the internal state, and then generating the uuid from that state. If the user supplies options, are they specifying internal state values before or after the update step?

robinpokorny commented 5 months ago

I created a PR that fixes the main issue.


Should IDs generated with user-supplied options read / mutate internal state at all?

We should support generation of multiple UUIDs within the same millisecond that was user-provided. Ideally with monotonicity. One example is backfilling. Current time is then treated as a default and does not have a special treating.

One unsolved (and probably unsolvable) case is when the the function is called by alternating timestamps. Then each time it's treated as the first and monotonicity is lost. I can imagine this can happen as this library is quite popular so multiple sources can call it simultaneously. I wrote about this in the original PR (point 5): https://github.com/uuidjs/uuid/pull/681#issuecomment-1523106489