coleifer / peewee

a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
http://docs.peewee-orm.com/
MIT License
11.19k stars 1.37k forks source link

About peewee.IntegrityError error when using peewee get_or_create #2948

Closed ponponon closed 1 week ago

ponponon commented 1 week ago

About peewee.IntegrityError error when using peewee get_or_create

Here's the full code log

error handling worker <WorkerContext [taisan_task_ocr_service.start_ocr] at 0x7efd9bd31410>: (1062, "Duplicate entry '2147483647' for key 'found_video_cover_image_ocr_count.foundedvideocoverimageocrcounttable_video_id'")
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 7285, in get
    return clone.execute(database)[0]
           ~~~~~~~~~~~~~~~~~~~~~~~^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 4581, in __getitem__
    return self.row_cache[item]
           ~~~~~~~~~~~~~~^^^^^^
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6875, in get_or_create
    return query.get(), False
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 7288, in get
    raise self.model.DoesNotExist('%s instance matching query does '
core.mysql.models.FoundedVideoCoverImageOcrCountTableDoesNotExist: <Model: FoundedVideoCoverImageOcrCountTable> instance matching query does not exist:
SQL: SELECT `t1`.`id`, `t1`.`video_id`, `t1`.`search_count`, `t1`.`created_at`, `t1`.`updated_at` FROM `found_video_cover_image_ocr_count` AS `t1` WHERE (`t1`.`video_id` = %s) LIMIT %s OFFSET %s
Params: [2397088498, 1, 0]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/nameko/containers.py", line 392, in _run_worker
    result = method(*worker_ctx.args, **worker_ctx.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/services/taisan_task_ocr_service.py", line 25, in start_ocr
    fvcisc, fvcisc_created = FoundedVideoCoverImageOcrCountTable.get_or_create(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6877, in get_or_create
    raise exc
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6872, in get_or_create
    return cls.create(**kwargs), True
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6741, in create
    inst.save(force_insert=True)
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 6951, in save
    pk = self.insert(**field_dict).execute()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2036, in inner
    return method(self, database, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2107, in execute
    return self._execute(database)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2912, in _execute
    return super(Insert, self)._execute(database)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 2625, in _execute
    cursor = database.execute(self)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3330, in execute
    return self.execute_sql(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/playhouse/shortcuts.py", line 256, in execute_sql
    return self._reconnect(super(ReconnectMixin, self).execute_sql, sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/playhouse/shortcuts.py", line 268, in _reconnect
    raise exc
  File "/usr/local/lib/python3.11/site-packages/playhouse/shortcuts.py", line 263, in _reconnect
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3320, in execute_sql
    with __exception_wrapper__:
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3088, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 196, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.11/site-packages/peewee.py", line 3322, in execute_sql
    cursor.execute(sql, params or ())
  File "/usr/local/lib/python3.11/site-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
             ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 563, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 825, in _read_query_result
    result.read()
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 1199, in read
    first_packet = self.connection._read_packet()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pymysql/connections.py", line 775, in _read_packet
    packet.raise_for_error()
  File "/usr/local/lib/python3.11/site-packages/pymysql/protocol.py", line 219, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/usr/local/lib/python3.11/site-packages/pymysql/err.py", line 150, in raise_mysql_exception
    raise errorclass(errno, errval)
peewee.IntegrityError: (1062, "Duplicate entry '2147483647' for key 'found_video_cover_image_ocr_count.foundedvideocoverimageocrcounttable_video_id'")

Here is peewee's model

class FoundedVideoCoverImageOcrCountTable(Model):
    id = BigAutoField(primary_key=True)
    video_id = IntegerField(unique=True)
    search_count = IntegerField(default=0)

    created_at = DateTimeField(
        null=False,
        constraints=[SQL('DEFAULT CURRENT_TIMESTAMP')],
        help_text='使用数据库时间'
    )
    updated_at = DateTimeField(
        null=False,
        constraints=[
            SQL('DEFAULT CURRENT_TIMESTAMP'),
            SQL('ON UPDATE CURRENT_TIMESTAMP'),
        ]
    )

    class Meta:
        database = super_taisan_db
        table_name = 'found_video_cover_image_ocr_count'

The ddl in the database.

-- super_taisan.found_video_cover_image_ocr_count definition

CREATE TABLE `found_video_cover_image_ocr_count` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `video_id` int NOT NULL,
  `search_count` int NOT NULL,
  `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `foundedvideocoverimageocrcounttable_video_id` (`video_id`)
) ENGINE=InnoDB AUTO_INCREMENT=76871889 DEFAULT CHARSET=utf8mb3;

Part of the code implemented

class TaiSanTaskSubmitService:
    name = 'taisan_task_ocr_service'

    def start_ocr(self, parse_result_json: str):
        # logger.debug(parse_result_json)

        parse_result = ParseResult.model_validate_json(parse_result_json)
        video_id: int = parse_result.extra['video_id']

        fvcisc, fvcisc_created = FoundedVideoCoverImageOcrCountTable.get_or_create(video_id=video_id)

        fvcisc: FoundedVideoCoverImageOcrCountTable
        if fvcisc.search_count >= 1 and timedelta_seconds(get_utc_now_timestamp(), fvcisc.updated_at) < 4*HOUR:
            logger.warning(f'当前图片重试次数过多,抛弃 video_id: {video_id}')

            fvcisc.search_count += 1
            fvcisc.save()

            return

Apparently the problem is in the FoundedVideoCoverImageOcrCountTable.get_or_create(video_id=video_id) stage

What confuses me is the secondary read, why does WHERE ( t1 .video_id= %s) LIMIT %s OFFSET %s Params: [2397088498, 1, 0] appear? This is the same as peewee.IntegrityError: (1062, “Duplicate entry ‘2147483647’ for key 'found_video_cover_image_ocr_count. foundedvideocoverimageocrcounttable_video_id'") of 2147483647 does not match. I don't know why the value 2397088498 appears?


When I try to minimally reproduce this reported error, the lack of reproduction doesn't work

video_id = 2147483647

FoundedVideoCoverImageOcrCountTable.get_or_create(
video_id=video_id)

Because the execution doesn't report an error

Out[4]: (<FoundedVideoCoverImageOcrCountTable: 3>, False)

python --version
Python 3.11.9

pip show peewee
Name: peewee
Version: 3.17.7
Summary: a little orm
Home-page: https://github.com/coleifer/peewee/
Author: Charles Leifer
Author-email: coleifer@gmail.com
License: MIT License
Location: /usr/local/lib/python3.11/site-packages
Requires: 
Required-by: fastapi-peewee-filter
ponponon commented 1 week ago

After my investigation, I know the reason, because the video id needs int64, but the table I created is int32, there is an overflow, which leads to inexplicable problems

coleifer commented 1 week ago

Yes this is just an overflow. The client library (pymysql or whatever you're using) is converting your 64-bit int into a 32-bit one and hence the problem.