py-bson / bson

Independent BSON codec for Python that doesn't depend on MongoDB.

objectid: After 2038, object id generation will fail. #92

Open yaoqibin opened 5 years ago

yaoqibin commented 5 years ago

The code is as follows:

    def __generate(self):
        """Generate a new value for this ObjectId.
        """

        # 4 bytes current time
        oid = struct.pack(">i", int(time.time()))

The error message is as follows:

    packages/bson/objectid.py", line 170, in __generate
        oid = struct.pack(">i", int(time.time()))
    error: 'i' format requires -2147483648 <= number <= 2147483647

Parkayun commented 5 years ago

can you give me input data?

yaoqibin commented 5 years ago

> can you give me input data?

I just changed the system time to 2100, and then this bug occurred.

    $ date
    Wed Jan 6 09:46:16 JST 2100
    $ python
    Python 2.7.5 (default, Aug 29 2016, 10:12:21)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import struct, time
    >>> struct.pack(">i", int(time.time()))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    struct.error: 'i' format requires -2147483648 <= number <= 2147483647

yaoqibin commented 5 years ago

The maximum value of a signed 32-bit int is 2147483647 (2^31 - 1), and the Unix timestamp at 2038-01-19 03:14:07 UTC is exactly 2147483647. After that moment, the timestamp is out of range for the ">i" format.
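The boundary described above can be checked without changing the system clock. This is a minimal sketch in Python 3 (the session in this thread used Python 2.7); `MAX_INT32`, `EPOCH`, and `fits_signed_32` are names made up for this illustration and are not part of the bson package.

```python
import struct
from datetime import datetime, timedelta, timezone

MAX_INT32 = 2**31 - 1  # largest value the ">i" format accepts
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The last second that still fits in a signed 32-bit timestamp.
last_ok = EPOCH + timedelta(seconds=MAX_INT32)


def fits_signed_32(ts):
    """Return True if ts can be packed with the ">i" format."""
    try:
        struct.pack(">i", ts)
        return True
    except struct.error:
        return False


print(last_ok.isoformat())
print(fits_signed_32(MAX_INT32), fits_signed_32(MAX_INT32 + 1))
```

The first call to `fits_signed_32` succeeds and the second fails, which is the same `struct.error` shown in the traceback.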

yaoqibin commented 5 years ago

We could use an unsigned long int (struct.pack(">L", int(time.time()))) to represent the time, but the object id may then use an extra 4 bytes.
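Whether the field grows can be checked with `struct.calcsize`. With an explicit byte-order prefix such as ">", the `struct` module uses standard sizes, where both "i" and "L" are 4 bytes; only native mode (no prefix) can give an 8-byte "L" on some platforms. A small sketch in Python 3, where `ts_2100` is an assumed sample value for 2100-01-01 00:00:00 UTC:

```python
import struct

# Assumed sample timestamp: 2100-01-01 00:00:00 UTC, past the signed 32-bit limit.
ts_2100 = 4102444800

# Standard sizes apply because of the ">" prefix.
size_signed = struct.calcsize(">i")
size_unsigned = struct.calcsize(">L")

# Packing as unsigned succeeds and still yields a 4-byte field.
oid_time = struct.pack(">L", ts_2100)

print(size_signed, size_unsigned, len(oid_time))
```

Under these assumptions the timestamp field keeps its width, so the unsigned format would only change how the existing 4 bytes are interpreted.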

amcgregor commented 5 years ago

The only "correct" way to fix this while preserving the lexicographical sort order of the IDs is to expand the size of that field as we more closely approach the limit. This would likely require re-encoding every ID present in the dataset, as variable-width fields will not sort correctly. If you're going to expand, go whole-hog. Bump from 32- to 64-bit and have about 1.8 × 10^19 seconds (~585 billion years) before you need to worry about the next precision hike.
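The sorting concern can be illustrated with plain byte strings. This is a sketch in Python 3 that looks only at the timestamp prefix, not at full ObjectIds; the variable names are hypothetical.

```python
import struct

small = 2**31 - 1  # fits in 32 bits
large = 2**32 + 5  # needs more than 32 bits

# Mixed widths: the earlier ID keeps a 4-byte timestamp, the later one gets 8 bytes.
mixed_sorted = sorted([struct.pack(">I", small), struct.pack(">Q", large)])
# The 8-byte value starts with zero bytes, so it sorts first despite being later.
mixed_in_time_order = mixed_sorted[0] == struct.pack(">I", small)

# Uniform 64-bit width: byte order matches time order.
fixed_sorted = sorted([struct.pack(">Q", large), struct.pack(">Q", small)])
fixed_in_time_order = fixed_sorted[0] == struct.pack(">Q", small)

# Rough range of an unsigned 64-bit seconds counter, in years of 365.25 days.
years_64bit = 2**64 / (365.25 * 24 * 3600)

print(mixed_in_time_order, fixed_in_time_order, years_64bit)
```

The mixed-width list sorts out of time order while the uniform 64-bit list does not, which is why widening the field implies re-encoding existing IDs.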

wegylexy commented 2 years ago

ObjectId("ffffffffffffffff00000000").getTimestamp() is ISODate("2106-02-07T06:28:15.000Z"), so having it unsigned is the "correct" way for now.
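The date quoted above can be reproduced by reading the first 4 bytes of the hex string as a big-endian integer. A minimal sketch in Python 3; `timestamp_of` is a hypothetical helper written for this comment, not an API of bson or MongoDB.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_of(oid_hex, signed=False):
    """Decode the leading 4-byte timestamp of a 24-character hex ObjectId."""
    raw = bytes.fromhex(oid_hex)[:4]
    fmt = ">i" if signed else ">I"
    (seconds,) = struct.unpack(fmt, raw)
    # Use timedelta arithmetic to avoid platform limits in fromtimestamp().
    return EPOCH + timedelta(seconds=seconds)


oid = "ffffffffffffffff00000000"
print(timestamp_of(oid).isoformat())
print(timestamp_of(oid, signed=True).isoformat())
```

Read as unsigned, the same bytes reach into 2106; read as signed, they land one second before the epoch.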