peeringdb / peeringdb-py

PeeringDB python client
BSD 2-Clause "Simplified" License

Full initial sync failing #21

bcavns01 closed this issue 6 years ago

bcavns01 commented 6 years ago

During a full sync of PeeringDB, the sync crashes. We followed similar tickets about failures during subsequent syncs; the proposed solution there was to delete the database and perform a full sync, which we now do.

Prior to all syncs, we delete all peeringdb tables/databases and allow them to be rebuilt; however, we're now starting to see the same sync errors even when it's a fresh sync with no pre-existing data in the database.

Starting new HTTPS connection (1): www.peeringdb.com
https://www.peeringdb.com:443 "GET /api/net?since=0&limit=0 HTTP/1.1" 200 9598440
net last update 0 14083 changed
data to be processed 14083
{'org': [u'organization instance with id 21434 does not exist.']} : errors: {'org': [u'organization instance with id 21434 does not exist.']}
org: Missing Object, dict: {'_validators': [], 'auto_created': False, 'serialize': True, '_unique': False, 'unique_for_year': None, 'blank': False, 'help_text': u'', 'null': False, 'to_fields': ['id'], 'db_index': True, 'is_relation': True, 'unique_for_month': None, 'unique_for_date': None, 'primary_key': False, 'concrete': True, 'swappable': True, 'remote_field': <ManyToOneRel: django_peeringdb.network>, 'max_length': None, 'db_tablespace': u'', 'from_fields': [u'self'], 'verbose_name': u'org', '_get_default': <function return_None at 0x7f05c0ea4410>, 'creation_counter': 88, 'validators': [], 'editable': True, 'related_model': <class 'django_peeringdb.models.concrete.Organization'>, 'error_messages': {u'unique': u'%(model_name)s with this %(field_label)s already exists.', u'invalid': u'%(model)s instance with %(field)s %(value)r does not exist.', u'invalid_choice': u'Value %(value)r is not a valid choice.', u'blank': u'This field cannot be blank.', u'null': u'This field cannot be null.', u'unique_for_date': u'%(field_label)s must be unique for %(date_field_label)s %(lookup_type)s.'}, '_related_fields': [(<django.db.models.fields.related.ForeignKey: org>, <django.db.models.fields.AutoField: id>)], '_error_messages': None, 'db_constraint': True, '_verbose_name': None, 'name': 'org', 'db_column': None, 'default': <class django.db.models.fields.NOT_PROVIDED at 0x7f05c0ee7ae0>, 'choices': [], 'column': u'org_id', 'model': <class 'django_peeringdb.models.concrete.Network'>, 'attname': u'org_id', 'opts': <Options for Network>}
org.21434 not found locally, trying to fetch object... 
Starting new HTTPS connection (1): www.peeringdb.com
https://www.peeringdb.com:443 "GET /api/org/21434?depth=0 HTTP/1.1" 404 33
Traceback (most recent call last):
  File "/usr/local/bin/peeringdb", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/peeringdb/cli.py", line 164, in sync
    db.sync()
  File "/usr/local/lib/python2.7/dist-packages/peeringdb/localdb.py", line 124, in sync
    call_command('pdb_sync', interactive=False)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/__init__.py", line 131, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python2.7/dist-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python2.7/dist-packages/django_peeringdb/management/commands/pdb_sync.py", line 91, in handle
    self.sync(tables, pk, limit=limit)
  File "/usr/local/lib/python2.7/dist-packages/django_peeringdb/management/commands/pdb_sync.py", line 98, in sync
    self.update_db(cls, self.get_objs(cls, pk=pk, **kwargs))
  File "/usr/local/lib/python2.7/dist-packages/django_peeringdb/management/commands/pdb_sync.py", line 180, in update_db
    self._sync(cls, row)
  File "/usr/local/lib/python2.7/dist-packages/django_peeringdb/management/commands/pdb_sync.py", line 164, in _sync
    r = self.rpc.get(field, int(m.group(1)), depth=0)
  File "/usr/local/lib/python2.7/dist-packages/twentyc/rpc/client.py", line 117, in get
    return self._load(self._request(typ, id=id, params=kwargs))
  File "/usr/local/lib/python2.7/dist-packages/twentyc/rpc/client.py", line 88, in _load
    self._throw(res, data)
  File "/usr/local/lib/python2.7/dist-packages/twentyc/rpc/client.py", line 68, in _throw
    raise NotFoundException("%d %s" % (res.status_code, err))
twentyc.rpc.client.NotFoundException: 404 Not found.

vegu commented 6 years ago

This was a server-side data issue and should be fixed now. Can you rerun a fresh sync to confirm, please?
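
For anyone hitting the same traceback: the failure in the log above is a `net` row whose `org_id` points at an organization the API answers with a 404. Below is a minimal sketch of spotting such dangling references in data you have already fetched. It assumes plain lists of dicts with `id` and `org_id` keys, and the helper name `find_dangling_orgs` is hypothetical, not part of peeringdb-py.

```python
def find_dangling_orgs(nets, orgs):
    """Return (net_id, org_id) pairs whose org_id has no matching org row."""
    known_org_ids = {org["id"] for org in orgs}
    return [
        (net["id"], net["org_id"])
        for net in nets
        if net["org_id"] not in known_org_ids
    ]


# Made-up sample rows; only the org id 21434 echoes the log above.
nets = [{"id": 1, "org_id": 10}, {"id": 2, "org_id": 21434}]
orgs = [{"id": 10}]
print(find_dangling_orgs(nets, orgs))
```

A non-empty result would point to a server-side data problem rather than a problem with the local database.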

bcavns01 commented 6 years ago

Thank you. The rerun made it to the end without error. Is there a UTC time window in which we should avoid syncing, or an API endpoint for checking PeeringDB status so we know when it's in the process of updating?

vegu commented 6 years ago

Generally you should be fine to sync whenever. Personally, I'd avoid 00:00 UTC simply because it's a very common time for everyone to run their sync, so it might be slower than at other times.
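
If you want a scheduled job to steer clear of that busy time, a rough sketch could look like the following. The 30-minute margin and the function name `near_utc_midnight` are assumptions for illustration, not a documented PeeringDB recommendation.

```python
from datetime import datetime, timedelta, timezone


def near_utc_midnight(now, margin=timedelta(minutes=30)):
    """True if `now` falls within `margin` of 00:00 UTC, on either side.

    `now` should be a timezone-aware datetime; a naive one would be
    interpreted as local time by astimezone().
    """
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    since_midnight = now - midnight
    until_midnight = timedelta(days=1) - since_midnight
    return min(since_midnight, until_midnight) <= margin


if near_utc_midnight(datetime.now(timezone.utc)):
    print("close to 00:00 UTC, consider delaying the sync")
else:
    print("ok to sync now")
```

A wrapper script could sleep or exit when the check returns true, or you could simply schedule the job at an off-peak minute.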

The sync on a fresh database uses cached responses that are refreshed roughly every 15 minutes.

Incremental syncs after the initial one get real-time responses.
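
To illustrate the difference, here is a sketch of how the two kinds of request might be built by hand. The `since=0&limit=0` form is taken from the request in the log at the top of this issue. Treating a non-zero `since` as a Unix timestamp of the last sync is an assumption here, and `sync_url` is a hypothetical helper, not something the client exposes. It is written for Python 3.

```python
from urllib.parse import urlencode

API_BASE = "https://www.peeringdb.com/api"


def sync_url(resource, since=0, limit=0):
    """Build a list URL; since=0 corresponds to a full initial sync."""
    query = urlencode({"since": since, "limit": limit})
    return f"{API_BASE}/{resource}?{query}"


# Full sync, matching the request seen in the log.
print(sync_url("net"))
# Incremental sync; the timestamp is an arbitrary example value.
print(sync_url("net", since=1546300800))
```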