dani-garcia / vaultwarden

Unofficial Bitwarden compatible server written in Rust, formerly known as bitwarden_rs
GNU Affero General Public License v3.0
39k stars 1.89k forks source link

Intermittently all items in VW fail to load with 500 Internal Server Error and 'Error loading ciphers: DatabaseError(Unknown, "cache entry already complete")' #4125

Closed ccben87 closed 11 months ago

ccben87 commented 11 months ago

Subject of the issue

Intermittently all items in Vault do not display and browser receives 500 Internal Server Error. Logs indicate "[panic][ERROR] thread 'rocket-worker-thread' panicked at 'Error loading ciphers: DatabaseError(Unknown, "cache entry already complete")'"

Logging out and in a few times eventually causes passwords to load and once loaded it continues to work. Multiple users have had this issue.

Deployment environment

Your environment (Generated via diagnostics page)

Config (Generated via diagnostics page)

Show Running Config **Environment settings which are overridden:** ```json { "_duo_akey": null, "_enable_duo": true, "_enable_email_2fa": true, "_enable_smtp": true, "_enable_yubico": true, "_icon_service_csp": "", "_icon_service_url": "", "_ip_header_enabled": true, "_smtp_img_src": "cid:", "admin_ratelimit_max_burst": 3, "admin_ratelimit_seconds": 300, "admin_session_lifetime": 20, "admin_token": "***", "allowed_iframe_ancestors": "", "attachments_folder": "data/attachments", "auth_request_purge_schedule": "30 * * * * *", "authenticator_disable_time_drift": false, "data_folder": "data", "database_conn_init": "", "database_max_conns": 10, "database_timeout": 30, "database_url": "**********://********************************************************************************************************************************************", "db_connection_retries": 15, "disable_2fa_remember": true, "disable_admin_token": false, "disable_icon_download": false, "domain": "*****://**********************", "domain_origin": "*****://**********************", "domain_path": "", "domain_set": true, "duo_host": null, "duo_ikey": null, "duo_skey": null, "email_attempts_limit": 3, "email_change_allowed": true, "email_expiration_time": 600, "email_token_size": 6, "emergency_access_allowed": true, "emergency_notification_reminder_schedule": "0 3 * * * *", "emergency_request_timeout_schedule": "0 7 * * * *", "enable_db_wal": true, "event_cleanup_schedule": "0 10 0 * * *", "events_days_retain": null, "extended_logging": true, "helo_name": null, "hibp_api_key": null, "icon_blacklist_non_global_ips": true, "icon_blacklist_regex": null, "icon_cache_folder": "data/icon_cache", "icon_cache_negttl": 259200, "icon_cache_ttl": 2592000, "icon_download_timeout": 10, "icon_redirect_code": 302, "icon_service": "internal", "incomplete_2fa_schedule": "30 * * * * *", "incomplete_2fa_time_limit": 3, "invitation_expiration_hours": 120, "invitation_org_name": "Vaultwarden", "invitations_allowed": true, "ip_header": "X-Real-IP", "job_poll_interval_ms": 30000, "log_file": null, "log_level": "Info", "log_timestamp_format": "%Y-%m-%d %H:%M:%S.%3f", "login_ratelimit_max_burst": 10, "login_ratelimit_seconds": 60, "org_attachment_limit": 0, "org_creation_users": "***********************", "org_events_enabled": true, "org_groups_enabled": true, "password_hints_allowed": true, "password_iterations": 600000, "push_enabled": false, "push_installation_id": "***", "push_installation_key": "***", "push_relay_uri": "https://push.bitwarden.com", "reload_templates": false, "require_device_email": false, "rsa_key_filename": "data/rsa_key", "send_purge_schedule": "0 5 * * * *", "sendmail_command": null, "sends_allowed": true, "sends_folder": "data/sends", "show_password_hint": false, "signups_allowed": false, "signups_domains_whitelist": "", "signups_verify": true, "signups_verify_resend_limit": 6, "signups_verify_resend_time": 3600, "smtp_accept_invalid_certs": false, "smtp_accept_invalid_hostnames": false, "smtp_auth_mechanism": null, "smtp_debug": false, "smtp_embed_images": true, "smtp_explicit_tls": null, "smtp_from": "****************************", "smtp_from_name": "Vaultwarden", "smtp_host": "***************************************", "smtp_password": "***", "smtp_port": 465, "smtp_security": "force_tls", "smtp_ssl": null, "smtp_timeout": 15, "smtp_username": "********************", "templates_folder": "data/templates", "tmp_folder": "data/tmp", "trash_auto_delete_days": null, "trash_purge_schedule": "0 5 0 * * *", "use_sendmail": false, "use_syslog": false, "user_attachment_limit": 0, "web_vault_enabled": true, "web_vault_folder": "web-vault/", "websocket_address": "0.0.0.0", "websocket_enabled": false, "websocket_port": 3012, "yubico_client_id": null, "yubico_secret_key": null, "yubico_server": null } ```

Steps to reproduce

Unsure how to reproduce. This has just started out of nowhere after using Vaultwarden for some time. Possibly somehow DB has become corrupted?

Expected behaviour

Web Vault should always load passwords. DB errors are occurring.

Actual behaviour

Sometimes when signing into Web Vault the passwords don't display and the browser receives a 500 Internal Server Error (logs indicate a database error). Otherwise clicking about the various pages works but no passwords are listed. After signing in and out a few times, the experience returns to normal.

Troubleshooting data

<!-- Share any log files, screenshots, or other relevant troubleshooting data -->
[2023-12-04 03:57:11.257][request][INFO] GET /api/sync?excludeDomains=true
[2023-12-04 03:57:11.269][panic][ERROR] thread 'rocket-worker-thread' panicked at 'Error loading ciphers: DatabaseError(Unknown, "cache entry already complete")': src/db/models/cipher.rs:638
   0: vaultwarden::init_logging::{{closure}}
   1: std::panicking::rust_panic_with_hook
   2: std::panicking::begin_panic_handler::{{closure}}
   3: std::sys_common::backtrace::__rust_end_short_backtrace
   4: rust_begin_unwind
   5: core::panicking::panic_fmt
   6: core::result::unwrap_failed
   7: vaultwarden::db::models::cipher::Cipher::find_by_user::{{closure}}::{{closure}}
   8: tokio::runtime::scheduler::multi_thread::worker::block_in_place
   9: vaultwarden::db::models::cipher::Cipher::find_by_user_visible::{{closure}}
  10: vaultwarden::api::core::ciphers::sync::{{closure}}
  11: vaultwarden::api::core::ciphers::sync::into_info::monomorphized_function::{{closure}}
  12: rocket::server::<impl rocket::rocket::Rocket<rocket::phase::Orbit>>::route::{{closure}}
  13: rocket::server::hyper_service_fn::{{closure}}::{{closure}}
  14: tokio::runtime::task::core::Core<T,S>::poll
  15: tokio::runtime::task::harness::Harness<T,S>::poll
  16: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  17: tokio::runtime::scheduler::multi_thread::worker::run
  18: tokio::runtime::task::core::Core<T,S>::poll
  19: tokio::runtime::task::harness::Harness<T,S>::poll
  20: std::sys_common::backtrace::__rust_begin_short_backtrace
  21: core::ops::function::FnOnce::call_once{{vtable.shim}}
  22: std::sys::unix::thread::Thread::new::thread_start

[2023-12-04 03:57:11.275][response][INFO] (sync) GET /api/sync?<data..> => 500 Internal Server Error
ccben87 commented 11 months ago

Really not sure what's going on here. Found this page discussing the error and sounds like it should have been fixed in Postgres long ago: https://postgrespro.com/list/thread-id/2552153

SQL server log:

{"level":"info","ts":"2023-12-04T05:15:18Z","logger":"postgres","msg":"record","logging_pod":"postgres-vaultwarden-1","record":{"log_time":"2023-12-04 05:15:18.936 UTC","user_name":"app","database_name":"app","process_id":"111872","connection_from":"192.168.210.194:60206","session_id":"656d5fb8.1b500","session_line_num":"1","command_tag":"SELECT","session_start_time":"2023-12-04 05:12:24 UTC","virtual_transaction_id":"14/9926","transaction_id":"0","error_severity":"ERROR","sql_state_code":"XX000","message":"cache entry already complete","query":"SELECT DISTINCT \"ciphers\".\"uuid\", \"ciphers\".\"created_at\", \"ciphers\".\"updated_at\", \"ciphers\".\"user_uuid\", \"ciphers\".\"organization_uuid\", \"ciphers\".\"key\", \"ciphers\".\"atype\", \"ciphers\".\"name\", \"ciphers\".\"notes\", \"ciphers\".\"fields\", \"ciphers\".\"data\", \"ciphers\".\"password_history\", \"ciphers\".\"deleted_at\", \"ciphers\".\"reprompt\" FROM ((((((\"ciphers\" LEFT OUTER JOIN \"ciphers_collections\" ON (\"ciphers\".\"uuid\" = \"ciphers_collections\".\"cipher_uuid\")) LEFT OUTER JOIN \"users_organizations\" ON (((\"ciphers\".\"organization_uuid\" = \"users_organizations\".\"org_uuid\") AND (\"users_organizations\".\"user_uuid\" = $1)) AND (\"users_organizations\".\"status\" = $2))) LEFT OUTER JOIN \"users_collections\" ON ((\"ciphers_collections\".\"collection_uuid\" = \"users_collections\".\"collection_uuid\") AND (\"users_organizations\".\"user_uuid\" = \"users_collections\".\"user_uuid\"))) LEFT OUTER JOIN \"groups_users\" ON (\"groups_users\".\"users_organizations_uuid\" = \"users_organizations\".\"uuid\")) LEFT OUTER JOIN \"groups\" ON (\"groups\".\"uuid\" = \"groups_users\".\"groups_uuid\")) LEFT OUTER JOIN \"collections_groups\" ON ((\"collections_groups\".\"collections_uuid\" = \"ciphers_collections\".\"collection_uuid\") AND (\"collections_groups\".\"groups_uuid\" = \"groups\".\"uuid\"))) WHERE (((((\"ciphers\".\"user_uuid\" = $3) OR (\"users_organizations\".\"access_all\" = $4)) OR (\"users_collections\".\"user_uuid\" = $5)) OR (\"groups\".\"access_all\" = $6)) OR (\"collections_groups\".\"collections_uuid\" IS NOT NULL))","backend_type":"client backend","query_id":"0"}}

BlackDex commented 11 months ago

I suggest to turn off query result caching at the PostgreSQL config. This is not something we can fix on our side i think.

ccben87 commented 11 months ago

Are you able to please advise what the Postgres setting for that would be? Happy to turn it off but not really sure which setting that would be.

BlackDex commented 11 months ago

I have no idea my self. This might help https://wiki.postgresql.org/wiki/CachingResultspgPool

But it could also be an issue with either your storage or memory of the host. Not sure if you have checked the database for corruption?

Maybe this helps https://wiki.postgresql.org/wiki/Corruption

I'm no PostgreSQL expert, so i have no clue either here. I only know this not an error coming from Vaultwarden but from the database it self.

ccben87 commented 11 months ago

No idea what caused it but it seems to have been a corruption in our database and I can't replicate it after restoring from backup. Thanks for your suggestions. Closing this off.

ccben87 commented 11 months ago

No idea what caused it but it seems to have been a corruption in our database and I can't replicate it after restoring from backup. Thanks for your suggestions. Closing this off.