Cisco-Talos / clamav-safebrowsing

GNU General Public License v2.0

Duplicate entry 'xxxxxxx' for key 'sbclient_v4_prefixes.PRIMARY' #9

Open tuaris opened 2 years ago

tuaris commented 2 years ago

I'm running into a problem where after a few successful builds, clamsb errors out with a duplicate entry error. I am using MySQL server 8.0 on Ubuntu Linux 20.04.

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1771, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/default.py", line 717, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/dist-packages/MySQLdb/cursors.py", line 206, in execute
    res = self._query(query)
  File "/usr/local/lib/python3.8/dist-packages/MySQLdb/cursors.py", line 319, in _query
    db.query(q)
  File "/usr/local/lib/python3.8/dist-packages/MySQLdb/connections.py", line 259, in query
    _mysql.connection.query(self, query)
MySQLdb._exceptions.IntegrityError: (1062, "Duplicate entry '000002ca-2' for key 'sbclient_v4_prefixes.PRIMARY'")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/clamsbsync.py", line 4, in <module>
    __import__('pkg_resources').run_script('clamsb==4.0', 'clamsbsync.py')
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 667, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 1470, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python3.8/dist-packages/clamsb-4.0-py3.8.egg/EGG-INFO/scripts/clamsbsync.py", line 599, in <module>
  File "/usr/local/lib/python3.8/dist-packages/clamsb-4.0-py3.8.egg/EGG-INFO/scripts/clamsbsync.py", line 518, in Sync
  File "/usr/local/lib/python3.8/dist-packages/clamsb-4.0-py3.8.egg/EGG-INFO/scripts/clamsbsync.py", line 415, in _handle_additions
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/session.py", line 2909, in merge
    self._autoflush()
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/session.py", line 2204, in _autoflush
    util.raise_(e, with_traceback=sys.exc_info()[2])
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/session.py", line 2193, in _autoflush
    self.flush()
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/session.py", line 3298, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/session.py", line 3438, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/session.py", line 3398, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/persistence.py", line 242, in save_obj
    _emit_insert_statements(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/orm/persistence.py", line 1094, in _emit_insert_statements
    c = connection._execute_20(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1583, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1452, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1814, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1995, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1771, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/dist-packages/SQLAlchemy-1.4.22-py3.8-linux-x86_64.egg/sqlalchemy/engine/default.py", line 717, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/local/lib/python3.8/dist-packages/MySQLdb/cursors.py", line 206, in execute
    res = self._query(query)
  File "/usr/local/lib/python3.8/dist-packages/MySQLdb/cursors.py", line 319, in _query
    db.query(q)
  File "/usr/local/lib/python3.8/dist-packages/MySQLdb/connections.py", line 259, in query
    _mysql.connection.query(self, query)
sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(MySQLdb._exceptions.IntegrityError) (1062, "Duplicate entry '000002ca-2' for key 'sbclient_v4_prefixes.PRIMARY'")
[SQL: INSERT INTO sbclient_v4_prefixes (prefix, reflist_id) VALUES (%s, %s)]
[parameters: (b'000002ca', 2)]

If I delete the duplicate entry by hand or switch to a new database, it works for the next few builds, until the same error appears again with a different key.

kevlin2 commented 2 years ago

I would not recommend manually deleting entries.

Google's list updates, which clamav-safebrowsing pulls from, delete entries by their index in a sorted list. Arbitrarily removing entries shifts those indices, so it can have unintended side effects, such as later updates removing the wrong entries.

In this case, it'd be better to start with a fresh list.
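
To illustrate why a manual delete is risky, here is a minimal sketch of index-based removal on a sorted list. It is not the clamsbsync.py code: `apply_removals` and the sample prefixes (other than `000002ca` from the report) are hypothetical, and the local database is modelled as a plain Python list.

```python
def apply_removals(prefixes, removal_indices):
    """Drop the entries at removal_indices from the sorted prefix list.

    Hypothetical helper: models a removal that is expressed as positions
    in the lexicographically sorted list rather than as prefix values.
    """
    ordered = sorted(prefixes)
    doomed = set(removal_indices)
    return [p for i, p in enumerate(ordered) if i not in doomed]


# What the server believes the client holds (already sorted).
server_view = [b"000002ca", b"0a1b2c3d", b"5f5f5f5f", b"9e8d7c6b"]

# The server wants b"5f5f5f5f" gone, so it sends its index: 2.
removal = [2]

# An untouched local copy removes the intended entry.
intact = apply_removals(server_view, removal)

# A local copy where b"0a1b2c3d" was deleted by hand is one entry short,
# so index 2 now points at a different prefix.
tampered_local = [p for p in server_view if p != b"0a1b2c3d"]
misaligned = apply_removals(tampered_local, removal)

print(intact)      # b"5f5f5f5f" is gone
print(misaligned)  # b"5f5f5f5f" is still there, b"9e8d7c6b" was dropped
```

In the tampered copy the entry the server meant to remove survives and an unrelated one is lost, which is why starting from a fresh list is the safer option.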

kevlin2 commented 2 years ago

The duplicate keys might come from list misalignment: the original key is never removed because the removal hits the wrong index, and then Google adds the key back into the list.
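
A rough sketch of that sequence, under the assumption that the local list has drifted from the server's view (here by one manually deleted entry). The names `apply_update` and `DuplicateKey` are hypothetical, a Python set stands in for the `sbclient_v4_prefixes` table, and all prefixes except `000002ca` are made up.

```python
class DuplicateKey(Exception):
    """Stand-in for the database's duplicate primary key error."""


def apply_update(table, removal_indices, additions):
    """Apply index-based removals, then insert additions with no upsert."""
    ordered = sorted(table)
    doomed = set(removal_indices)
    result = {p for i, p in enumerate(ordered) if i not in doomed}
    for prefix in additions:
        if prefix in result:
            raise DuplicateKey(prefix)
        result.add(prefix)
    return result


server = {b"00000001", b"000002ca", b"0a1b2c3d"}
# Local copy is missing b"00000001", e.g. after a manual delete.
local = {b"000002ca", b"0a1b2c3d"}

# Update 1: the server removes b"000002ca", which is index 1 on its side.
server = apply_update(server, [1], [])
local = apply_update(local, [1], [])  # index 1 is b"0a1b2c3d" locally

# Update 2: the server adds b"000002ca" back.
server = apply_update(server, [], [b"000002ca"])
try:
    local = apply_update(local, [], [b"000002ca"])
    duplicate_hit = None
except DuplicateKey as exc:
    duplicate_hit = exc.args[0]

print(duplicate_hit)  # the key that was never removed locally
```

The aligned copy applies both updates cleanly, while the drifted copy still holds the key when the re-add arrives and fails on the plain insert.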

micahsnyder commented 2 years ago

As @kevlin2 has described, this is a difficult problem to reproduce, and because you can purge and re-pull to work around it, I don't want to sink additional time into solving this bug.

If someone from the community wishes to fix it, you're welcome to submit a PR. But we won't spend any more time on it.

I will leave the issue open since it is a real bug, and in case someone else wishes to fix it.