nautobot / nautobot-app-device-lifecycle-mgmt

Device Lifecycle Management App for Nautobot
https://docs.nautobot.com/projects/device-lifecycle/en/latest/

Multiple EoS Inventory Items at a Location Breaks Metrics Functions #309

Closed sdoiron0330 closed 7 months ago

sdoiron0330 commented 8 months ago

Environment

Expected Behavior

When the post_upgrade command is run before server startup, no errors occur. When you load the localhost/metrics endpoint, Prometheus metric data is populated.

Observed Behavior

The following error occurs (line 217 of the DLM app's metrics.py, in metrics_lcm_hw_end_of_support):

Traceback (most recent call last):
  File "/usr/local/bin/nautobot-server", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nautobot/core/cli/__init__.py", line 52, in main
    run_app(
  File "/usr/local/lib/python3.11/site-packages/nautobot/core/runner/runner.py", line 297, in run_app
    management.execute_from_command_line([runner_name, command, *command_args])
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nautobot/core/management/commands/post_upgrade.py", line 91, in handle
    call_command(
  File "/usr/local/lib/python3.11/site-packages/django/core/management/__init__.py", line 181, in call_command
    return command.execute(*args, **defaults)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/core/management/base.py", line 89, in wrapped
    res = handle_func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/core/management/commands/migrate.py", line 268, in handle
    emit_post_migrate_signal(
  File "/usr/local/lib/python3.11/site-packages/django/core/management/sql.py", line 42, in emit_post_migrate_signal
    models.signals.post_migrate.send(
  File "/usr/local/lib/python3.11/site-packages/django/dispatch/dispatcher.py", line 180, in send
    return [
           ^
  File "/usr/local/lib/python3.11/site-packages/django/dispatch/dispatcher.py", line 181, in <listcomp>
    (receiver, receiver(signal=self, sender=sender, **named))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nautobot/core/apps/__init__.py", line 795, in post_migrate_send_nautobot_database_ready
    nautobot_database_ready.send(sender=app_conf, app_config=app_conf, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/django/dispatch/dispatcher.py", line 180, in send
    return [
           ^
  File "/usr/local/lib/python3.11/site-packages/django/dispatch/dispatcher.py", line 181, in <listcomp>
    (receiver, receiver(signal=self, sender=sender, **named))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nautobot/extras/plugins/__init__.py", line 627, in discover_metrics
    for metric_instance in metric():
  File "/source/nautobot_device_lifecycle_mgmt/metrics.py", line 217, in metrics_lcm_hw_end_of_support
    for location_name, total_count in init_location_counts.annotate(
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 280, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 1324, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 140, in __iter__
    return compiler.results_iter(tuple_expected=True, chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1130, in results_iter
    results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch, chunk_size=chunk_size)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 79, in _execute
    with self.db.wrap_database_errors:
  File "/usr/local/lib/python3.11/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django_prometheus/db/common.py", line 69, in execute
    return super().execute(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.ProgrammingError: more than one row returned by a subquery used as an expression

Steps to Reproduce

  1. Create a location and then create two devices at the location.
  2. For each device, create a unique inventory item with a distinct part ID.
  3. Under the DLM app, create a corresponding Hardware Notice for each part ID with an end-of-support date in the past.
  4. Run nautobot-server post_upgrade or load localhost/metrics.
sdoiron0330 commented 8 months ago

The only fix I can think of right now is to replace the subquery with something in the form of the query below. I couldn't figure out the right way to express it in the Django ORM, so I thought I'd ask here before diving in further:

select location.name, count(location.name) 
from dcim_inventoryitem parts 
join dcim_device device on parts.device_id = device.id 
join dcim_location location on device.location_id = location.id 
group by location.name
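
For reference, the query above can be tried outside Nautobot. This is a minimal sketch, assuming simplified stand-in tables that contain only the columns the query touches; the sample rows and the helper names `build_demo_db` and `location_part_counts` are hypothetical, and an `ORDER BY` is added only to make the output deterministic. The ORM form in the comment is an unverified guess at the lookup path, not the app's actual code.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with stand-in tables (not the real Nautobot schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dcim_location (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dcim_device (id INTEGER PRIMARY KEY, name TEXT, location_id INTEGER);
        CREATE TABLE dcim_inventoryitem (id INTEGER PRIMARY KEY, part_id TEXT, device_id INTEGER);
        INSERT INTO dcim_location VALUES (1, 'site-a'), (2, 'site-b');
        INSERT INTO dcim_device VALUES (1, 'dev-1', 1), (2, 'dev-2', 1), (3, 'dev-3', 2);
        INSERT INTO dcim_inventoryitem VALUES (1, 'part-1', 1), (2, 'part-2', 2), (3, 'part-3', 3);
        """
    )
    return conn


def location_part_counts(conn):
    """Return (location name, inventory item count) pairs, one row per location."""
    sql = """
        SELECT location.name, COUNT(location.name)
        FROM dcim_inventoryitem parts
        JOIN dcim_device device ON parts.device_id = device.id
        JOIN dcim_location location ON device.location_id = location.id
        GROUP BY location.name
        ORDER BY location.name
    """
    return conn.execute(sql).fetchall()


# Assumption: in the Django ORM a grouped count of this shape is usually written as
#   InventoryItem.objects.order_by()
#       .values("device__location__name")
#       .annotate(total=Count("pk"))
# The lookup path "device__location__name" is a guess and has not been checked
# against the Nautobot models.

if __name__ == "__main__":
    print(location_part_counts(build_demo_db()))
```

With the sample rows, two items sit at site-a and one at site-b, so the query yields a single row per location, which is the shape a per-location metric needs.
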
progala commented 8 months ago

Thanks @sdoiron0330. The issue appears to be connected to Nautobot 2.0 migrating some of the models to TreeQuerySet. This breaks the removal of default ordering with order_by() and consequently breaks the expected behavior of annotate() when used with values().
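
The general effect can be illustrated without Django. When ordering columns stay attached to a values()/annotate() query, Django adds them to the GROUP BY clause, so rows that should collapse into one group stay separate. The sketch below imitates that in plain SQL on SQLite; the `item` table, its rows, and the names `CLEAN_SQL`, `LEAKY_SQL`, and `run` are hypothetical stand-ins, not the queries the DLM app actually issues.

```python
import sqlite3

# Grouping by location only: one row per location.
CLEAN_SQL = "SELECT location, COUNT(*) FROM item GROUP BY location ORDER BY location"

# An ordering column (name) leaking into GROUP BY: one row per (location, name) pair.
LEAKY_SQL = "SELECT location, COUNT(*) FROM item GROUP BY location, name ORDER BY location, name"


def run(sql):
    """Run a query against a small in-memory table of inventory items."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (location TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO item VALUES (?, ?)",
        [("site-a", "part-1"), ("site-a", "part-2"), ("site-b", "part-3")],
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(run(CLEAN_SQL))  # one row for site-a
    print(run(LEAKY_SQL))  # two rows for site-a
```

If the leaky form is used as a scalar subquery filtered to a single location, it returns two rows for site-a, which is the condition PostgreSQL rejects with "more than one row returned by a subquery used as an expression".
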

sdoiron0330 commented 8 months ago

My client confirmed that they were also seeing a similar error in their Nautobot 1.6 instance today. I haven't dug into that, as I'm actively working on upgrading them, but let me know if those same reproduction steps cause the same issue locally.

progala commented 8 months ago

Perhaps this was in place longer than we thought, but it is connected to the InventoryItem QuerySet using the TreeQuerySet class, which doesn't play nicely with the default order_by(). I've applied a fix in my local environment and it seems to be working. I will do some more testing and then raise a PR.

sdoiron0330 commented 8 months ago

Pulled your branch locally and the metrics looked to be working correctly. Similar results within my client's environment. Appreciate the quick turnaround on this!

progala commented 8 months ago

That's great! I'll get the PR in today.