catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

pudl etl crashes on `ferc1_solo_test.yml` #1448

Status: Closed (charmoniumQ closed this issue 2 years ago)

charmoniumQ commented 2 years ago

Describe the bug

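Running pudl_etl with settings/ferc1_solo_test.yml crashes with a sqlite3.IntegrityError (FOREIGN KEY constraint failed) while the existing database is being dropped.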

To Reproduce

$ pudl_etl settings/ferc1_solo_test.yml --clobber
# see http://dpaste.com/EZN4SRVZ3
Traceback (most recent call last):
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: FOREIGN KEY constraint failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/pudl/etl.py", line 399, in etl
    pudl.load.sqlite.dfs_to_sqlite(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/pudl/load/sqlite.py", line 57, in dfs_to_sqlite
    md.drop_all(engine)
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 4814, in drop_all
    bind._run_ddl_visitor(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 3117, in _run_ddl_visitor
    conn._run_ddl_visitor(visitorcallable, element, **kwargs)
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2113, in _run_ddl_visitor
    visitorcallable(self.dialect, self, **kwargs).traverse_single(element)
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 524, in traverse_single
    return meth(obj, **kw)
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/sql/ddl.py", line 1023, in visit_metadata
    self.traverse_single(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 524, in traverse_single
    return meth(obj, **kw)
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/sql/ddl.py", line 1100, in visit_table
    self.connection.execute(DropTable(table))
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1289, in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/sql/ddl.py", line 80, in _execute_on_connection
    return connection._execute_ddl(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1381, in _execute_ddl
    ret = self._execute_context(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1845, in _execute_context
    self._handle_dbapi_exception(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2026, in _handle_dbapi_exception
    util.raise_(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/sam/.local/share/virtualenvs/pudl-gKD-Mnhe/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) FOREIGN KEY constraint failed
[SQL: 
DROP TABLE plants_ferc1]
(Background on this error at: https://sqlalche.me/e/14/gkpj)

Is ferc1_solo_test.yml used (and tested) as widely as etl_fast.yml?

zaneselvans commented 2 years ago

Hmm, the ferc1_solo_test.yml settings are really only meant to be used by the integration tests, but I'm not sure why it would be failing here. Try running:

tox -e ferc1_solo

which is the test case that settings file is used in. If it passes, then everything is probably fine.

If I had to guess, the schema of the FERC 1 Solo version of the PUDL DB is incompatible with the SQLAlchemy metadata object that's being used to clean up the existing database, and/or with the existing database itself. That would be consistent with the traceback, where md.drop_all(engine) fails on DROP TABLE plants_ferc1.
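
To illustrate the kind of mismatch guessed at above, here is a minimal sketch using only the standard-library sqlite3 module rather than PUDL or SQLAlchemy. It assumes the existing database contains a table that still references plants_ferc1 but is not dropped first; the table name child_not_in_metadata and its columns are hypothetical, and whether this is what actually happened in this issue is not confirmed.

```python
import sqlite3


def drop_parent_while_child_references_it() -> str:
    """Try to drop a parent table that a populated child table still references."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # PRAGMA below is not silently ignored inside an open transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    try:
        conn.execute("PRAGMA foreign_keys = ON")
        conn.execute("CREATE TABLE plants_ferc1 (plant_id INTEGER PRIMARY KEY)")
        # Hypothetical table standing in for one that exists in the old
        # database but is unknown to the metadata doing the cleanup.
        conn.execute(
            "CREATE TABLE child_not_in_metadata ("
            " id INTEGER PRIMARY KEY,"
            " plant_id INTEGER REFERENCES plants_ferc1 (plant_id))"
        )
        conn.execute("INSERT INTO plants_ferc1 VALUES (1)")
        conn.execute("INSERT INTO child_not_in_metadata VALUES (10, 1)")
        try:
            # With foreign keys enabled, DROP TABLE first deletes the rows
            # implicitly, which violates the child's constraint.
            conn.execute("DROP TABLE plants_ferc1")
        except sqlite3.IntegrityError as err:
            return str(err)
        return "dropped"
    finally:
        conn.close()


if __name__ == "__main__":
    print(drop_parent_while_child_references_it())
```

If that is the cause, deleting the old SQLite file before rerunning the ETL would sidestep the drop entirely, though that is a workaround suggested by this sketch, not something the thread confirms.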

zaneselvans commented 2 years ago

(However, note that tox and the test suite are only available if you clone the PUDL repository. They won't be there if you've just installed catalystcoop.pudl with pip or conda.)

charmoniumQ commented 2 years ago

tox can't find that environment, despite this line in the tox.ini: [testenv:ferc1_solo]. I'm not quite sure how tox works here.

ERROR: unknown environment 'ferc1_solo'
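
One way to check which environments a given tox.ini actually declares is to parse its section headers directly. The sketch below uses the standard-library configparser on an inline sample; the sample file contents are hypothetical (only the [testenv:ferc1_solo] header comes from this thread), and the helper name declared_tox_envs is made up for illustration. It does not explain the error above by itself, but running it against the tox.ini in the directory where tox is invoked would show whether tox is reading the file you expect.

```python
import configparser

# Hypothetical stand-in for a tox.ini; only the ferc1_solo header is from the thread.
SAMPLE_TOX_INI = """
[tox]
envlist = ci

[testenv]
deps = pytest

[testenv:ferc1_solo]
commands = pytest test/integration
"""


def declared_tox_envs(ini_text: str) -> list[str]:
    """Return the names of explicitly declared [testenv:NAME] sections."""
    # Disable interpolation so any '%' characters in tox.ini are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    prefix = "testenv:"
    return [
        section[len(prefix):]
        for section in parser.sections()
        if section.startswith(prefix)
    ]


if __name__ == "__main__":
    print(declared_tox_envs(SAMPLE_TOX_INI))
```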

If this isn't intended to work outside of the integration tests and tox environment, I will close this issue.