databrickslabs / ucx

Automated migrations to Unity Catalog

[BUG]: UCX assessment workflow failure at parse_logs #3374

Open bimalsebastian opened 1 day ago

bimalsebastian commented 1 day ago

Is there an existing issue for this?

Current Behavior

The last step of the assessment workflow, parse_logs, fails with the following error: "linting workflows(959063482862612) task failed: 'utf-8' codec can't decode byte 0x8c in position 42: invalid start byte"
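
For context, the message itself comes from Python's UTF-8 decoder. The byte 0x8c is a continuation byte, so it cannot begin a character. A minimal sketch that produces the same text outside UCX, assuming a made-up payload of 42 ASCII bytes followed by 0x8c (the real file content is not known from the log):

```python
# Hypothetical payload: 42 ASCII bytes, then 0x8c, which is not a valid
# first byte of any UTF-8 sequence.
data = b"#" * 42 + b"\x8c"

try:
    data.decode("utf-8")
except UnicodeDecodeError as err:
    message = str(err)   # human-readable text, as seen in the workflow log
    position = err.start  # offset of the first undecodable byte
    print(message)
```

This only illustrates where the wording comes from; it says nothing about which notebook or file in the workspace triggers it.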

Expected Behavior

No response

Steps To Reproduce

Once the installation is successful, run the UCX assessment workflow and observe the last step.

Cloud

Azure

Operating System

Linux

Version

latest via Databricks CLI

Relevant log output

InternalError: linting workflows(959063482862612) task failed: 'utf-8' codec can't decode byte 0x8c in position 42: invalid start byte
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/blueprint/parallel.py", line 158, in inner
    return func(*args, **kwargs), None
           ^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 454, in lint_job
    problems, dfsas, tables = self._lint_job(job)
                              ^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 473, in _lint_job
    graph, advices, session_state = self._build_task_dependency_graph(task, job)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 523, in _build_task_dependency_graph
    problems = container.build_dependency_graph(graph)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 134, in build_dependency_graph
    return list(self._register_task_dependencies(parent))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 140, in _register_task_dependencies
    yield from self._register_notebook(graph)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 252, in _register_notebook
    return graph.register_notebook(path, False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/graph.py", line 58, in register_notebook
    maybe_graph = self.register_dependency(maybe.dependency)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/graph.py", line 89, in register_dependency
    container = dependency.load(self.path_lookup)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/graph.py", line 327, in load
    return self._loader.load_dependency(path_lookup, self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/notebooks/loaders.py", line 66, in load_dependency
    content = absolute_path.read_text("utf-8")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/blueprint/paths.py", line 854, in read_text
    return f.read()
           ^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 42: invalid start byte
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
File ~/.ipykernel/81720/command--1-4051061095:18
     15 entry = [ep for ep in metadata.distribution("databricks_labs_ucx").entry_points if ep.name == "runtime"]
     16 if entry:
     17   # Load and execute the entrypoint, assumes no parameters
---> 18   entry[0].load()()
     19 else:
     20   import importlib

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/runtime.py:107, in main(*argv)
    105 if len(argv) == 0:
    106     argv = sys.argv
--> 107 Workflows.all().trigger(*argv)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/runtime.py:82, in Workflows.trigger(self, *argv)
     80 workflow = self._workflows[workflow_name]
     81 if task_name == "parse_logs":
---> 82     return ctx.task_run_warning_recorder.snapshot()
     83 # both CLI commands and workflow names appear in telemetry under `cmd`
     84 with_user_agent_extra("cmd", workflow_name)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/installer/logs.py:204, in TaskRunWarningRecorder.snapshot(self)
    202     error_messages.append(message)
    203 if len(error_messages) > 0:
--> 204     raise InternalError("\n".join(error_messages))
    205 return log_records

InternalError: linting workflows(959063482862612) task failed: 'utf-8' codec can't decode byte 0x8c in position 42: invalid start byte
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/blueprint/parallel.py", line 158, in inner
    return func(*args, **kwargs), None
           ^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 454, in lint_job
    problems, dfsas, tables = self._lint_job(job)
                              ^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 473, in _lint_job
    graph, advices, session_state = self._build_task_dependency_graph(task, job)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 523, in _build_task_dependency_graph
    problems = container.build_dependency_graph(graph)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 134, in build_dependency_graph
    return list(self._register_task_dependencies(parent))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 140, in _register_task_dependencies
    yield from self._register_notebook(graph)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/jobs.py", line 252, in _register_notebook
    return graph.register_notebook(path, False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/graph.py", line 58, in register_notebook
    maybe_graph = self.register_dependency(maybe.dependency)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/graph.py", line 89, in register_dependency
    container = dependency.load(self.path_lookup)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/graph.py", line 327, in load
    return self._loader.load_dependency(path_lookup, self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/ucx/source_code/notebooks/loaders.py", line 66, in load_dependency
    content = absolute_path.read_text("utf-8")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.11/site-packages/databricks/labs/blueprint/paths.py", line 854, in read_text
    return f.read()
           ^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 42: invalid start byte
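
The traceback ends in `absolute_path.read_text("utf-8")`, so the linted file apparently contains bytes that are not valid UTF-8. One possible way to read such a file without raising is to try a short list of encodings and fall back to replacement characters. The sketch below is not UCX code: `read_text_lenient`, the `cp1252` fallback, and the sample file are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def read_text_lenient(path: Path, encodings=("utf-8", "cp1252")) -> tuple[str, str]:
    """Hypothetical helper: return (text, encoding_used) without raising."""
    raw = path.read_bytes()
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace"), "utf-8 (with replacements)"


# Made-up file with the same shape as the failure: 0x8c at offset 42.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notebook.py"
    sample.write_bytes(b"#" * 42 + b"\x8c")
    text, encoding = read_text_lenient(sample)

print(encoding)
```

Guessing an encoding can silently produce the wrong characters, so logging the file path and skipping the file may be a safer choice for a linter than decoding at any cost.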