codeforkjeff / dbt-sqlite

A SQLite adapter plugin for dbt (data build tool)
Apache License 2.0
78 stars 13 forks source link

Errors starting #42

Closed cboden closed 1 year ago

cboden commented 1 year ago

I'm sure this is a weird issue...dbt started failing this morning on multiple independent machines.

With dbt-sqlite I have a cron job that runs dbt run every hour and has been for months. This morning it started failing. I started debugging and it's gotten weird...when I hit a wall, I opened up the project on another machine that had a backup from yesterday. When running dbt run I got the same error. dbt run was run on this version of the database multiple times since yesterday without issue on the original machine. My working theory here was there was some newly ingested data causing issues, but this seems to have ruled that out.

I haven't changed anything in days...no source code, no models, no binaries, no system update, no system reboot, etc. It's been on a server running in the background. The only thing I changed, after the error started happening I updated the config to not report metrics back to dbt in hopes a network problem was causing it, but that didn't help.

Here's the output of dbt debug:

16:37:56  Running with dbt=1.1.3
dbt version: 1.1.3
python version: 3.7.16
python path: /Users/cboden/.local/pipx/venvs/dbt-sqlite/bin/python
os info: Darwin-18.7.0-x86_64-i386-64bit
Using profiles.yml file at /Users/cboden/.dbt/profiles.yml
Using dbt_project.yml file at /Users/cboden/redacted/path/to/project/dbt/dbt_project.yml

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  database: database
  schema: main
  schemas_and_paths: {'main': '/Users/cboden/redacted/path/to/project/project-database.db'}
  schema_directory: /Users/cboden/redacted/path/to/project
  Connection test: [ERROR]

1 check failed:
dbt was unable to connect to the specified database.
The database returned the following error:

  >Runtime Error
  Database Error
    malformed database schema (my_redacted_model_name) - view "my_redacted_model_name" cannot reference objects in database main

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile

Here is the output from dbt.log while running dbt debug:

============================== 2023-03-30 16:37:56.839790 | decf6338-ceb0-41d7-a70e-b35f5c98e6d3 ==============================
16:37:56.839808 [info ] [MainThread]: Running with dbt=1.1.3
16:37:56.840766 [debug] [MainThread]: running dbt with arguments {'write_json': True, 'use_colors': True, 'printer_width': 80, 'version_check': True, 'partial_parse': True, 'static_parser': True, 'profiles_dir': '/Users/cboden/.dbt', 'send_anonymous_usage_stats': False, 'event_buffer_size': 100000, 'quiet': False, 'no_print': False, 'cache_selected_only': False, 'config_dir': False, 'which': 'debug', 'indirect_selection': 'eager'}
16:37:56.841150 [debug] [MainThread]: Tracking: do not track
16:37:56.946340 [debug] [MainThread]: Executing "git --help"
16:37:56.957930 [debug] [MainThread]: STDOUT: "b"usage: git [--version] [--help] [-C <path>] [-c <name>=<value>]\n           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]\n           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]\n           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]\n           <command> [<args>]\n\nThese are common Git commands used in various situations:\n\nstart a working area (see also: git help tutorial)\n   clone      Clone a repository into a new directory\n   init       Create an empty Git repository or reinitialize an existing one\n\nwork on the current change (see also: git help everyday)\n   add        Add file contents to the index\n   mv         Move or rename a file, a directory, or a symlink\n   reset      Reset current HEAD to the specified state\n   rm         Remove files from the working tree and from the index\n\nexamine the history and state (see also: git help revisions)\n   bisect     Use binary search to find the commit that introduced a bug\n   grep       Print lines matching a pattern\n   log        Show commit logs\n   show       Show various types of objects\n   status     Show the working tree status\n\ngrow, mark and tweak your common history\n   branch     List, create, or delete branches\n   checkout    Switch branches or restore working tree files\n   commit     Record changes to the repository\n   diff       Show changes between commits, commit and working tree, etc\n   merge      Join two or more development histories together\n   rebase     Reapply commits on top of another base tip\n   tag        Create, list, delete or verify a tag object signed with GPG\n\ncollaborate (see also: git help workflows)\n   fetch      Download objects and refs from another repository\n   pull       Fetch from and integrate with another repository or a local branch\n   push       Update remote refs along with associated objects\n\n'git help -a' and 'git help -g' list available subcommands and some\nconcept guides. See 'git help <command>' or 'git help <concept>'\nto read about a specific subcommand or concept.\n""
16:37:56.958720 [debug] [MainThread]: STDERR: "b''"
16:37:56.962706 [debug] [MainThread]: Acquiring new sqlite connection "debug"
16:37:56.964009 [debug] [MainThread]: Using sqlite connection "debug"
16:37:56.964372 [debug] [MainThread]: On debug: select 1 as id
16:37:56.964754 [debug] [MainThread]: Opening a new connection, currently in state init
16:37:56.970526 [debug] [MainThread]: On debug: No close available on handle
16:37:56.971823 [debug] [MainThread]: Connection 'debug' was properly closed.

I'm bewildered and don't know where to go from here. Any help would be greatly appreciated.

codeforkjeff commented 1 year ago

The "malformed database schema" error seems to be happening when connecting to the database file. Can you try using a SQL client program to connect to it and see if you get the same error? You can try the sqlite3 CLI or a program like DBeaver, DataGrip, TablePlus, etc. If you get the same error, it suggests your database file might be corrupted and in need of repair. (As always, make backups of everything before any repairs)

Along the same lines, if you have backups of the sqlite db file from prior to this error, you can try seeing if the job runs with that old copy.

Another possibility is some kind of version mismatch between your system's SQLite library (if it was upgraded as part of an OS update or something) and the version of the file.

cboden commented 1 year ago
codeforkjeff commented 1 year ago

Can you try dropping that view in the error message and seeing if dbt will at least successfully establish a connection to the db?

cboden commented 1 year ago

I dropped all views, same error

codeforkjeff commented 1 year ago

Bizarre. Try renaming the database file so that dbt-sqlite creates a brand new one, and see what happens.

Otherwise, I'm out of troubleshooting idea, sorry. I can't think of anything in dbt-sqlite that would suddenly fail in this way. If you do figure it out, please let me know!

cboden commented 1 year ago

Same error ☹️

codeforkjeff commented 1 year ago

Whoa. I would double check all your paths, and any nfs/samba mounts, and make sure your dbt config is pointing to the files you think they're pointing to.

cboden commented 1 year ago

No networking magic. Failure on macOS 13.2 and OS X 10.14.6. I double checked all paths (copy/pasted) values, all present (expected, as I changed nothing).

I ran the following commands as well: dbt clean && dbt deps && dbt debug. debug still resulted in same error. dbt parse gives same error.

cboden commented 1 year ago

Figured it out. I accidentally backed up the database file in the same directory on both machines 🤦‍♂️. I think dbt-sqlite was loading all db files in the schema_directory as schemas and got mad that some tables were the same name. My bad. Thank you for your prompt replies in assisting my troubleshooting!

codeforkjeff commented 1 year ago

Ahh, I forgot about that behavior of loading all the files in schema_directory. Good catch! Relieved you figured this out.