dagster-io / fake-star-detector

https://github.com/dagster-io/dagster
234 stars, 19 forks

[Issue] sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file #2

Open wentaoxu415 opened 1 year ago

wentaoxu415 commented 1 year ago

Hi Dagster team,

Congratulations on launching this blog post! I really liked the creativity behind the analysis and went ahead to try the simple model analysis against various other sample repos. However, I ran into some issues so I wanted to file this issue and see if there are steps that we can take to overcome it.

Issue

Whenever I select a repo with more than 200 stars, the Dagster run fails at the stargazers_with_user_info step with the error sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file.

I suspect that when the GitHub API requests take a long time, the database connection gets dropped, which prevents Dagster from writing debug events and triggers this error.
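For reference, the underlying sqlite3 message can be reproduced outside Dagster. The sketch below is not a diagnosis of this run; it only shows one well-known way to get the same message, namely pointing SQLite at a file whose parent directory does not exist. The helper name `try_open` and the file names are made up for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def try_open(db_path: Path) -> str:
    """Try to open (or create) a SQLite database and report the outcome."""
    try:
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS event_logs (event TEXT)")
        conn.close()
        return "ok"
    except sqlite3.OperationalError as exc:
        # SQLite could not open or create the file at db_path.
        return str(exc)


with tempfile.TemporaryDirectory() as tmp:
    existing_dir_db = Path(tmp) / "runs.db"
    # The "removed" directory is never created, so the open must fail.
    missing_dir_db = Path(tmp) / "removed" / "runs.db"
    print(try_open(existing_dir_db))
    print(try_open(missing_dir_db))
```

Other conditions, such as file permissions or operating system limits, can produce the same message, so this example does not settle which one applies here.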

Error Message

dagster._core.errors.DagsterSubprocessError: During multiprocess execution errors occurred in child processes:
In process 14923: sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
[SQL: INSERT INTO event_logs (run_id, event, dagster_event_type, timestamp, step_key, asset_key, partition) VALUES (?, ?, ?, ?, ?, ?, ?)]
[parameters: ('0e9bd8d5-f9c9-47d1-990d-713527429ab1', '{"__class__": "EventLogEntry", "dagster_event": {"__class__": "DagsterEvent", "event_specific_data": {"__class__": "StepFailureData", "error": {"__cl ... (8337 characters truncated) ... "step_key": "stargazers_with_user_info", "timestamp": 1679177337.238413, "user_message": "Execution of step \\"stargazers_with_user_info\\" failed."}', 'STEP_FAILURE', '2023-03-18 22:08:57.238413', 'stargazers_with_user_info', None, None)]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

Stack Trace:
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/executor/child_process_executor.py", line 79, in _execute_command_in_child_process
for step_event in command.execute():
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/executor/multiprocess.py", line 93, in execute
yield from execute_plan_iterator(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/api.py", line 1101, in __iter__
yield from self.iterator(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_plan.py", line 114, in inner_plan_execution_iterator
for step_event in check.generator(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_plan.py", line 358, in dagster_event_sequence_for_step
yield step_failure_event_from_exc_info(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/objects.py", line 124, in step_failure_event_from_exc_info
return DagsterEvent.step_failure_event(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/events/__init__.py", line 802, in step_failure_event
return DagsterEvent.from_step(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/events/__init__.py", line 413, in from_step
log_step_event(step_context, event)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/events/__init__.py", line 292, in log_step_event
step_context.log.log_dagster_event(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/log_manager.py", line 405, in log_dagster_event
self.log(level=level, msg=msg, extra={DAGSTER_META_KEY: dagster_event})
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/log_manager.py", line 420, in log
self._log(level, msg, args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 1624, in _log
self.handle(record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 1634, in handle
self.callHandlers(record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 1696, in callHandlers
hdlr.handle(record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 968, in handle
self.emit(record)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/log_manager.py", line 286, in emit
handler.handle(dagster_record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 968, in handle
self.emit(record)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 187, in emit
self._instance.handle_new_event(event)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1839, in handle_new_event
self._event_storage.store_event(event)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/storage/event_log/sqlite/sqlite_event_log.py", line 243, in store_event
conn.execute(insert_event_statement)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1380, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
return connection._execute_clauseelement(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement
ret = self._execute_context(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
self._handle_dbapi_exception(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
util.raise_(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
self.dialect.do_execute(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)

The above exception was caused by the following exception:
sqlite3.OperationalError: unable to open database file

Stack Trace:
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
self.dialect.do_execute(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)

The above exception occurred during handling of the following exception:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
[SQL: INSERT INTO event_logs (run_id, event, dagster_event_type, timestamp, step_key, asset_key, partition) VALUES (?, ?, ?, ?, ?, ?, ?)]
[parameters: ('0e9bd8d5-f9c9-47d1-990d-713527429ab1', '{"__class__": "EventLogEntry", "dagster_event": {"__class__": "DagsterEvent", "event_specific_data": {"__class__": "StepOutputData", "metadata_entrie ... (15545 characters truncated) ... rgazers_with_user_info", "timestamp": 1679177337.235272, "user_message": "Yielded output \\"result\\" of type \\"DataFrame\\". (Type check passed)."}', 'STEP_OUTPUT', '2023-03-18 22:08:57.235272', 'stargazers_with_user_info', None, None)]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

Stack Trace:
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_plan.py", line 269, in dagster_event_sequence_for_step
for step_event in check.generator(step_events):
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 384, in core_dagster_event_sequence_for_step
for evt in _type_check_and_store_output(step_context, user_event):
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 434, in _type_check_and_store_output
for output_event in _type_check_output(step_context, step_output_handle, output, version):
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/plan/execute_step.py", line 287, in _type_check_output
yield DagsterEvent.step_output_event(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/events/__init__.py", line 771, in step_output_event
return DagsterEvent.from_step(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/events/__init__.py", line 413, in from_step
log_step_event(step_context, event)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/events/__init__.py", line 292, in log_step_event
step_context.log.log_dagster_event(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/log_manager.py", line 405, in log_dagster_event
self.log(level=level, msg=msg, extra={DAGSTER_META_KEY: dagster_event})
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/log_manager.py", line 420, in log
self._log(level, msg, args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 1624, in _log
self.handle(record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 1634, in handle
self.callHandlers(record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 1696, in callHandlers
hdlr.handle(record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 968, in handle
self.emit(record)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/log_manager.py", line 286, in emit
handler.handle(dagster_record)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/logging/__init__.py", line 968, in handle
self.emit(record)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 187, in emit
self._instance.handle_new_event(event)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/instance/__init__.py", line 1839, in handle_new_event
self._event_storage.store_event(event)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/storage/event_log/sqlite/sqlite_event_log.py", line 243, in store_event
conn.execute(insert_event_statement)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1380, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
return connection._execute_clauseelement(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1572, in _execute_clauseelement
ret = self._execute_context(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
self._handle_dbapi_exception(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
util.raise_(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
self.dialect.do_execute(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)

The above exception was caused by the following exception:
sqlite3.OperationalError: unable to open database file

Stack Trace:
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
self.dialect.do_execute(
File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)

  File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/execution/api.py", line 990, in pipeline_execution_iterator
    for event in pipeline_context.executor.execute(pipeline_context, execution_plan):
  File "/Users/wentaoxu/Development/labs/fake-star-detector/venv/lib/python3.10/site-packages/dagster/_core/executor/multiprocess.py", line 306, in execute
    raise DagsterSubprocessError(

Reproduction Steps

Specify any repo with more than a few hundred stars (> 200-300 stars) in the ops config.

rachfop commented 1 year ago

The README has a troubleshooting section:

If you are using the default Dagster storage backed by SQLite, you may encounter an error as:

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file

To get your pipeline successfully running, you can Shift+click "Materialize all" on the asset graph page and add the following configuration. This will turn off the default multiprocessing execution.

execution:
  config:
    in_process: null

I applied that configuration and I'm still seeing the same error as above. Edit: It seems to fail after about 200 tokens for me.

yuhan commented 1 year ago

Hi! Thanks for opening this issue, and trying out this example!

The README explains a workaround, but it may still not be enough for larger repositories (i.e., longer-running steps).

The long-term solution is to switch to a more performant storage, such as the built-in Postgres storage. To do that, you'll need to:

  1. Install dagster-postgres.
  2. Make sure DAGSTER_HOME/dagster.yaml is present. If it is not, set the environment variable DAGSTER_HOME to the directory where you want to store Dagster instance information. (When it's not set, Dagster tools use a temp directory for storage that is cleaned up when the process exits.) More details: https://docs.dagster.io/deployment/dagster-instance#default-local-behavior
  3. Configure storage in your DAGSTER_HOME/dagster.yaml to use Postgres instead of the default SQLite. Here's the doc: https://docs.dagster.io/deployment/dagster-instance#dagster-storage
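
Steps 2 and 3 above could be scripted roughly as follows. This is a minimal sketch, not an official setup script: the helper name `write_instance_config` is made up, the `DAGSTER_PG_*` environment variable names are placeholders, and the `storage.postgres.postgres_db` layout reflects my understanding of the storage doc linked in step 3, so please check it against that page for your Dagster version.

```python
from pathlib import Path

# Assumed layout of the Postgres storage section; verify against the linked docs.
# The DAGSTER_PG_* environment variable names are placeholders.
POSTGRES_STORAGE_YAML = """\
storage:
  postgres:
    postgres_db:
      username:
        env: DAGSTER_PG_USERNAME
      password:
        env: DAGSTER_PG_PASSWORD
      hostname:
        env: DAGSTER_PG_HOST
      db_name:
        env: DAGSTER_PG_DB
      port: 5432
"""


def write_instance_config(dagster_home: str) -> Path:
    """Create the DAGSTER_HOME directory if needed and write dagster.yaml.

    An existing dagster.yaml is left untouched so local settings are not lost.
    """
    home = Path(dagster_home).expanduser().resolve()
    home.mkdir(parents=True, exist_ok=True)
    config_path = home / "dagster.yaml"
    if not config_path.exists():
        config_path.write_text(POSTGRES_STORAGE_YAML)
    return config_path
```

After exporting DAGSTER_HOME, you would call `write_instance_config(os.environ["DAGSTER_HOME"])` once and then restart the Dagster process so it picks up the new instance configuration.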