catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Nightly Build Failure 2024-02-05 #3356

Closed (zaneselvans closed this issue 5 months ago)

zaneselvans commented 5 months ago

Overview

Another spurious `BadZipFile` error, this time on FERC-714 rather than PHMSA.

Next steps

What next steps do we need to take to understand or remediate the issue?

Verify that everything is fixed!

Once you've applied any necessary fixes, make sure that the nightly build outputs are all in the right places.

- [ ] [S3 distribution bucket](https://s3.console.aws.amazon.com/s3/buckets/pudl.catalyst.coop?region=us-west-2&bucketType=general&prefix=nightly/&showversions=false) was updated at the expected time
- [ ] [GCP distribution bucket](https://console.cloud.google.com/storage/browser/pudl.catalyst.coop/nightly;tab=objects?project=catalyst-cooperative-pudl) was updated at the expected time
- [ ] [GCP internal bucket](https://console.cloud.google.com/storage/browser/builds.catalyst.coop) was updated at the expected time
- [ ] [Datasette PUDL version](https://data.catalyst.coop/pudl/core_pudl__codes_datasources) points at the same hash as [nightly](https://github.com/catalyst-cooperative/pudl/tree/nightly)
- [ ] [Zenodo sandbox record](https://sandbox.zenodo.org/doi/10.5072/zenodo.5563) was updated to the record number in the logs (search for `zenodo_data_release.py` and `Draft` in the logs, to see what the new record number should be!)
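
The log search in the last checklist item could be sketched in Python as follows. This is only a minimal sketch, assuming the build log is available as plain text: the function name `find_draft_record_lines` and the sample log lines are hypothetical placeholders, and the real log format may differ.

```python
def find_draft_record_lines(
    log_text: str,
    needles: tuple[str, ...] = ("zenodo_data_release.py", "Draft"),
) -> list[str]:
    """Return the log lines that contain every one of the search terms."""
    return [
        line
        for line in log_text.splitlines()
        if all(needle in line for needle in needles)
    ]


# Made-up sample lines for illustration only; not copied from a real build log.
sample_log = "\n".join(
    [
        "pudl_etl: starting run",
        "zenodo_data_release.py: Draft <record-number>",
        "zenodo_data_release.py: upload finished",
    ]
)

for line in find_draft_record_lines(sample_log):
    print(line)
```

The record number on the matching line is what the Zenodo sandbox record should point at.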

Relevant logs

link to build logs from internal distribution bucket

(@jdangerx what do you mean here by "link to build logs"?)

```
Traceback (most recent call last):
  File "/home/mambauser/env/bin/pudl_etl", line 8, in <module>
    sys.exit(pudl_etl())
             ^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/pudl/src/pudl/etl/cli.py", line 184, in pudl_etl
    raise Exception(event.event_specific_data.error)
Exception: dagster._core.errors.DagsterExecutionStepExecutionError: Error occurred while executing op "raw_ferc714__demand_forecast_pa":

Stack Trace:
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_plan.py", line 286, in dagster_event_sequence_for_step
    for step_event in check.generator(step_events):
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 487, in core_dagster_event_sequence_for_step
    for user_event in _step_output_error_checked_user_event_sequence(
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 169, in _step_output_error_checked_user_event_sequence
    for user_event in user_event_sequence:
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/execute_step.py", line 95, in _process_asset_results_to_events
    for user_event in user_event_sequence:
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/compute.py", line 212, in execute_core_compute
    for step_output in _yield_compute_results(step_context, inputs, compute_fn, compute_context):
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/compute.py", line 181, in _yield_compute_results
    for event in iterate_with_context(
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_utils/__init__.py", line 465, in iterate_with_context
    with context_fn():
  File "/home/mambauser/env/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/utils.py", line 84, in op_execution_error_boundary
    raise error_cls(

The above exception was caused by the following exception:
zipfile.BadZipFile: File is not a zip file

Stack Trace:
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/utils.py", line 54, in op_execution_error_boundary
    yield
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_utils/__init__.py", line 467, in iterate_with_context
    next_output = next(iterator)
                  ^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/compute_generator.py", line 131, in _coerce_op_compute_fn_to_iterator
    result = invoke_compute_fn(
             ^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/site-packages/dagster/_core/execution/plan/compute_generator.py", line 125, in invoke_compute_fn
    return fn(context, **args_to_pass) if context_arg_provided else fn(**args_to_pass)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/pudl/src/pudl/extract/ferc714.py", line 88, in _extract_raw_ferc714
    ds.get_zipfile_resource("ferc714", name="ferc714.zip").open(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/pudl/src/pudl/workspace/datastore.py", line 413, in get_zipfile_resource
    return zipfile.ZipFile(io.BytesIO(self.get_unique_resource(dataset, **filters)))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mambauser/env/lib/python3.11/zipfile.py", line 1302, in __init__
    self._RealGetContents()
  File "/home/mambauser/env/lib/python3.11/zipfile.py", line 1369, in _RealGetContents
    raise BadZipFile("File is not a zip file")
```

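
The traceback shows the failure coming from `zipfile.ZipFile(io.BytesIO(...))`, which raises `BadZipFile` whenever the bytes it receives are not a zip archive. Below is a minimal sketch of one way to check the payload first and retry a flaky fetch, using only the standard library. It is not how the PUDL datastore works: `open_zip_bytes`, `fetch_zip_with_retry`, and the `fetch` callable are hypothetical names introduced for illustration, and whether a retry would actually help with this failure is an assumption.

```python
import io
import zipfile
from typing import Callable


def open_zip_bytes(data: bytes) -> zipfile.ZipFile:
    """Open in-memory bytes as a zip archive, or raise a descriptive error."""
    buffer = io.BytesIO(data)
    # zipfile.is_zipfile accepts a file-like object as well as a path.
    if not zipfile.is_zipfile(buffer):
        raise ValueError(
            f"Expected a zip archive but got {len(data)} bytes "
            f"starting with {data[:32]!r}"
        )
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


def fetch_zip_with_retry(
    fetch: Callable[[], bytes], attempts: int = 3
) -> zipfile.ZipFile:
    """Call `fetch` (a hypothetical downloader) until it returns a valid zip."""
    last_error: Exception | None = None
    for _ in range(attempts):
        try:
            return open_zip_bytes(fetch())
        except ValueError as err:
            last_error = err
    raise RuntimeError(f"No valid zip after {attempts} attempts") from last_error
```

Including the size and first few bytes of the payload in the error message makes it easier to tell an empty or truncated download from an error page returned in place of the archive.
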
jdangerx commented 5 months ago

The BadZipFile stuff is tracked by #3358 - let's close this nightly build failure thread.