peeringdb / peeringdb-py

PeeringDB python client
BSD 2-Clause "Simplified" License

DB sync fails due to duplicate entries #85

Closed fdomain closed 4 months ago

fdomain commented 9 months ago

Hello,

I tried to bootstrap a local PeeringDB copy using peeringdb-py, but I can't complete a full database initialization: the sync crashes due to (I suppose) duplicate entries.

Local environment

pip list  | grep -E "peeringdb|Django"
Django             4.2
django-peeringdb   3.2.0
peeringdb          2.0.0

Description

The db is not fully synced: the sync fails while processing the poc resources, and the program exits with the following error: Network with this Name already exists.

By looking deeper into this issue, the sync fails when trying to add a dangling relationship for the Network 34901 (https://www.peeringdb.com/net/34901). Indeed, its network name Rapid-Fire-y is already used by another entry (Network 33473).

From the python interpreter:

>>> from peeringdb.client import Client
>>> from peeringdb import resource

>>> pdb = Client()
>>> asn = pdb.all(resource.Network).get(name="Rapid-Fire-y")
>>> asn.id
33473
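
To check whether other names collide in the same way, the rows could be grouped by name. This is only a minimal sketch: it assumes the rows are already available as plain dicts with `id` and `name` keys, and `find_name_collisions` is a hypothetical helper, not part of peeringdb-py. The third sample row is made up for illustration.

```python
from collections import defaultdict


def find_name_collisions(rows):
    """Return {name: [ids...]} for every name used by more than one row."""
    ids_by_name = defaultdict(list)
    for row in rows:
        ids_by_name[row["name"]].append(row["id"])
    # Keep only the names that are shared by at least two rows.
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


# Sample data: the two colliding IDs and the name come from this report,
# the last row is a made-up non-colliding entry.
rows = [
    {"id": 33473, "name": "Rapid-Fire-y"},
    {"id": 34901, "name": "Rapid-Fire-y"},
    {"id": 1, "name": "Example Net"},
]
collisions = find_name_collisions(rows)
print(collisions)
```

Any name listed in the result would presumably trip the same uniqueness validation during the sync.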

Steps to reproduce

  1. Run peeringdb sync --fetch-private from a fresh install

Stacktrace


Syncing to https://www.peeringdb.com/api
[org] Fetching from remote cache
[org] Processing 27080 objects
[campus] Fetching from remote cache
[campus] Processing 37 objects
[fac] Fetching from remote cache
[fac] Processing 5277 objects
Fetching dangling relationship Campus 14
Fetching dangling relationship Campus 60
Fetching dangling relationship Campus 67
Fetching dangling relationship Campus 68
Fetching dangling relationship Campus 64
[net] Fetching from remote cache
[net] Processing 29049 objects
Fetching dangling relationship Organization 30220
Fetching dangling relationship Organization 30281
[ix] Fetching from remote cache
[ix] Processing 1145 objects
[carrier] Fetching from remote cache
[carrier] Processing 135 objects
[carrierfac] Fetching from remote cache
[carrierfac] Processing 1839 objects
Fetching dangling relationship Facility 921
Fetching dangling relationship Facility 471
Fetching dangling relationship Facility 4490
Fetching dangling relationship Facility 10524
Fetching dangling relationship Facility 13740
[ixfac] Fetching from remote cache
[ixfac] Processing 3523 objects
[ixlan] Fetching from API (private)
[ixlan] Processing 1155 objects
Fetching dangling relationship InternetExchange 4338
Fetching dangling relationship InternetExchange 4341
Fetching dangling relationship InternetExchange 4343
Fetching dangling relationship Organization 37008
Fetching dangling relationship InternetExchange 4345
Fetching dangling relationship Organization 37047
Fetching dangling relationship InternetExchange 4346
Fetching dangling relationship InternetExchange 4347
Fetching dangling relationship InternetExchange 4348
Fetching dangling relationship InternetExchange 4349
Fetching dangling relationship Organization 37074
Fetching dangling relationship InternetExchange 4350
Fetching dangling relationship InternetExchange 4352
Fetching dangling relationship InternetExchange 4353
Fetching dangling relationship Organization 37119
[ixpfx] Fetching from remote cache
[ixpfx] Processing 2212 objects
Fetching dangling relationship InternetExchangeLan 3323
Fetching dangling relationship InternetExchange 3323
[netfac] Fetching from remote cache
[netfac] Processing 47393 objects
[netixlan] Fetching from remote cache
[netixlan] Processing 52217 objects
Fetching dangling relationship InternetExchangeLan 1991
[poc] Fetching from API (private)
[poc] Processing 48085 objects
Fetching dangling relationship Network 34873
Fetching dangling relationship Network 34882
Fetching dangling relationship Network 34887
Fetching dangling relationship Organization 36990
Fetching dangling relationship Network 34884
Fetching dangling relationship Organization 36989
Fetching dangling relationship Network 34889
Fetching dangling relationship Organization 36992
Fetching dangling relationship Network 34901
Traceback (most recent call last):
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 83, in create_obj
    self.backend.get_object(self.backend.get_concrete(resource), pk)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/backend.py", line 27, in wrapped
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django_peeringdb/client_adaptor/backend.py", line 94, in get_object
    return concrete.objects.get(pk=id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 637, in get
    raise self.model.DoesNotExist(
django_peeringdb.models.concrete.Network.DoesNotExist: Network matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 142, in create_obj
    self.clean_obj(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 69, in clean_obj
    raise e
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 59, in clean_obj
    self.backend.clean(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django_peeringdb/client_adaptor/backend.py", line 142, in clean
    obj.full_clean()
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/base.py", line 1502, in full_clean
    raise ValidationError(errors)
django.core.exceptions.ValidationError: {'name': ['Network with this Name already exists.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/f.domain/dev/peeringdb-py/.venv/bin/peeringdb", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/cli.py", line 68, in main
    return handler(config=cfg, **vars(options))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/commands.py", line 20, in _wrapped
    r = func(*a, **k)
        ^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/commands.py", line 262, in handle
    client.updater.update_all(rs, since, fetch_private=kwargs["fetch_private"])
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 235, in update_all
    self._handle_initial_sync(entries, res)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 164, in _handle_initial_sync
    obj, ret = self.create_obj(row, res)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 95, in create_obj
    rel_obj, _ = self.create_obj(related_row, resource)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 147, in create_obj
    self.clean_obj(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 69, in clean_obj
    raise e
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 59, in clean_obj
    self.backend.clean(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django_peeringdb/client_adaptor/backend.py", line 142, in clean
    obj.full_clean()
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/base.py", line 1502, in full_clean
    raise ValidationError(errors)
django.core.exceptions.ValidationError: {'name': ['Network with this Name already exists.']}

fdomain commented 7 months ago

I just gave it another try today, and I still get a similar error (on a different object, though):

[poc] Fetching from API (private)
[poc] Processing 49028 objects
Fetching dangling relationship Network 35061
Fetching dangling relationship Network 35090
Fetching dangling relationship Organization 37182
Fetching dangling relationship Network 35094
Fetching dangling relationship Organization 37185
Fetching dangling relationship Network 35098
Fetching dangling relationship Organization 37187
Fetching dangling relationship Network 35110
Fetching dangling relationship Organization 37196
Fetching dangling relationship Network 35102
Fetching dangling relationship Organization 37192
Fetching dangling relationship Network 35113
Fetching dangling relationship Organization 37198
Fetching dangling relationship Network 35083
Fetching dangling relationship Network 35106
Traceback (most recent call last):
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 83, in create_obj
    self.backend.get_object(self.backend.get_concrete(resource), pk)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/backend.py", line 27, in wrapped
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django_peeringdb/client_adaptor/backend.py", line 94, in get_object
    return concrete.objects.get(pk=id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 637, in get
    raise self.model.DoesNotExist(
django_peeringdb.models.concrete.Network.DoesNotExist: Network matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 142, in create_obj
    self.clean_obj(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 69, in clean_obj
    raise e
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 59, in clean_obj
    self.backend.clean(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django_peeringdb/client_adaptor/backend.py", line 142, in clean
    obj.full_clean()
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/base.py", line 1502, in full_clean
    raise ValidationError(errors)
django.core.exceptions.ValidationError: {'name': ['Network with this Name already exists.']}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/f.domain/dev/peeringdb-py/.venv/bin/peeringdb", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/cli.py", line 68, in main
    return handler(config=cfg, **vars(options))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/commands.py", line 20, in _wrapped
    r = func(*a, **k)
        ^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/commands.py", line 262, in handle
    client.updater.update_all(rs, since, fetch_private=kwargs["fetch_private"])
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 235, in update_all
    self._handle_initial_sync(entries, res)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 164, in _handle_initial_sync
    obj, ret = self.create_obj(row, res)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 95, in create_obj
    rel_obj, _ = self.create_obj(related_row, resource)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 147, in create_obj
    self.clean_obj(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 69, in clean_obj
    raise e
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/peeringdb/_update.py", line 59, in clean_obj
    self.backend.clean(obj)
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django_peeringdb/client_adaptor/backend.py", line 142, in clean
    obj.full_clean()
  File "/home/f.domain/dev/peeringdb-py/.venv/lib/python3.11/site-packages/django/db/models/base.py", line 1502, in full_clean
    raise ValidationError(errors)
django.core.exceptions.ValidationError: {'name': ['Network with this Name already exists.']}

This is still from the command peeringdb sync --fetch-private
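
Since the failing object changes between runs, it may help to pull it out of the sync output automatically. The sketch below assumes the log format shown above ("Fetching dangling relationship <Resource> <id>" followed by a traceback); `last_dangling_before_traceback` is a hypothetical helper written for this report, not something peeringdb-py provides.

```python
import re

# Matches lines such as "Fetching dangling relationship Network 35106".
DANGLING_RE = re.compile(r"^Fetching dangling relationship (\w+) (\d+)$")


def last_dangling_before_traceback(log_text):
    """Return (resource, id) of the last dangling fetch logged before the
    first traceback (or in the whole log if there is no traceback).

    Returns None when no dangling fetch line is found.
    """
    last = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Traceback (most recent call last):"):
            break
        match = DANGLING_RE.match(line)
        if match:
            last = (match.group(1), int(match.group(2)))
    return last


# Excerpt copied from the output above.
sample_log = """\
Fetching dangling relationship Organization 37198
Fetching dangling relationship Network 35083
Fetching dangling relationship Network 35106
Traceback (most recent call last):
"""
failing = last_dangling_before_traceback(sample_log)
print(failing)
```

For the excerpt above this points at Network 35106, the object being created when the validation error was raised.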

vegu commented 6 months ago

am able to reproduce, think this needs a code fix

edit: to be clear, seems to only happen with --fetch-private argument passed