sebadob / rauthy

OpenID Connect Single Sign-On Identity & Access Management
https://sebadob.github.io/rauthy/
Apache License 2.0

Issue starting Rauthy 0.26.1: Error: failed to connect to sqlite #595

Closed andoks closed 1 month ago

andoks commented 1 month ago

While attempting to upgrade from v0.25.0 to v0.26.1 (to be able to use #591), I get an error during startup indicating that Rauthy is unable to use the on-disk SQLite file and also unable to create one from scratch.

When setting `DATABASE_URL: "sqlite::memory:"` it works; when setting `DATABASE_URL: "sqlite:data/rauthy.db"` it does not.
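
For context, the rough shape of how the container is started, reduced to a plain `docker run` (the host-side `./data` path is only illustrative here, my real setup wires this up through its own tooling):

```
# Illustrative only: the failing on-disk variant. The relative
# "sqlite:data/rauthy.db" path lives under /app/data inside the container,
# which is bind-mounted from the host here.
docker run --rm \
  -e DATABASE_URL="sqlite:data/rauthy.db" \
  -v "$(pwd)/data:/app/data" \
  ghcr.io/sebadob/rauthy:0.26.1-lite

# The in-memory variant, which starts up fine:
docker run --rm \
  -e DATABASE_URL="sqlite::memory:" \
  ghcr.io/sebadob/rauthy:0.26.1-lite
```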

log

```
[rauthy ASCII-art startup banner]
{"timestamp":"2024-10-22T13:40:58.588939Z","level":"INFO","fields":{"message":"2024-10-22 13:40:58.588919508 UTC - Starting Rauthy v0.26.1"},"target":"rauthy"}
{"timestamp":"2024-10-22T13:40:58.589021Z","level":"INFO","fields":{"message":"Log Level set to 'INFO'"},"target":"rauthy"}
{"timestamp":"2024-10-22T13:40:58.589241Z","level":"INFO","fields":{"message":"Cache App running on Thread ThreadId(49)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589248Z","level":"INFO","fields":{"message":"Cache DeviceCode running on Thread ThreadId(48)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589278Z","level":"INFO","fields":{"message":"Cache AuthProviderCallback running on Thread ThreadId(48)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589257Z","level":"INFO","fields":{"message":"Cache AuthCode running on Thread ThreadId(47)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589284Z","level":"INFO","fields":{"message":"Cache ClientDynamic running on Thread ThreadId(49)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589304Z","level":"INFO","fields":{"message":"Cache Session running on Thread ThreadId(47)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589278Z","level":"INFO","fields":{"message":"get_initial_state","vote":"T0-N0:uncommitted","last_purged_log_id":"None","last_applied":"None","committed":"None","last_log_id":"None"},"target":"openraft::storage::helper"}
{"timestamp":"2024-10-22T13:40:58.589313Z","level":"INFO","fields":{"message":"Cache PoW running on Thread ThreadId(47)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589314Z","level":"INFO","fields":{"message":"Cache User running on Thread ThreadId(49)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589286Z","level":"INFO","fields":{"message":"Cache ClientEphemeral running on Thread ThreadId(48)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589296Z","level":"INFO","fields":{"message":"Cache DPoPNonce running on Thread ThreadId(44)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589308Z","level":"INFO","fields":{"message":"Cache IPRateLimit running on Thread ThreadId(45)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589322Z","level":"INFO","fields":{"message":"Cache Webauthn running on Thread ThreadId(47)"},"target":"hiqlite::store::state_machine::memory::kv_handler"}
{"timestamp":"2024-10-22T13:40:58.589319Z","level":"INFO","fields":{"message":"load membership from log: [0..0)"},"target":"openraft::storage::helper"}
{"timestamp":"2024-10-22T13:40:58.589426Z","level":"INFO","fields":{"message":"load key log ids from (None,None]"},"target":"openraft::storage::helper"}
{"timestamp":"2024-10-22T13:40:58.589501Z","level":"INFO","fields":{"message":"startup begin: state: RaftState { vote: UTime { data: Vote { leader_id: LeaderId { term: 0, node_id: 0 }, committed: false }, utime: Some(Instant { tv_sec: 108094, tv_nsec: 429511139 }) }, committed: None, purged_next: 0, log_ids: LogIdList { key_log_ids: [] }, membership_state: MembershipState { committed: EffectiveMembership { log_id: None, membership: Membership { configs: [], nodes: {} }, voter_ids: {} }, effective: EffectiveMembership { log_id: None, membership: Membership { configs: [], nodes: {} }, voter_ids: {} } }, snapshot_meta: SnapshotMeta { last_log_id: None, last_membership: StoredMembership { log_id: None, membership: Membership { configs: [], nodes: {} } }, snapshot_id: \"\" }, server_state: Learner, accepted: Accepted { leader_id: LeaderId { term: 0, node_id: 0 }, log_id: None }, io_state: IOState { building_snapshot: false, vote: Vote { leader_id: LeaderId { term: 0, node_id: 0 }, committed: false }, flushed: LogIOId { committed_leader_id: LeaderId { term: 0, node_id: 0 }, log_id: None }, applied: None, snapshot: None, purged: None }, purge_upto: None }, is_leader: false, is_voter: false"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589580Z","level":"INFO","fields":{"message":"startup done: id=1 target_state: Learner"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589628Z","level":"INFO","fields":{"message":"received RaftMsg::Initialize: openraft::core::raft_core::RaftCore<_, _, _, _>::handle_api_msg","members":"{1: Node { id: 1, addr_raft: \"localhost:8100\", addr_api: \"localhost:8200\" }}"},"target":"openraft::core::raft_core"}
{"timestamp":"2024-10-22T13:40:58.589664Z","level":"INFO","fields":{"message":"openraft::engine::engine_impl::Engine<_>::elect, new candidate: {T1-N1:uncommitted@13:40:58.589654, last_log_id:T0-N0-0 progress:{1: false}}"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589763Z","level":"INFO","fields":{"message":"vote is changing from T0-N0:uncommitted to T1-N1:uncommitted"},"target":"openraft::engine::handler::vote_handler"}
{"timestamp":"2024-10-22T13:40:58.589792Z","level":"INFO","fields":{"message":"received Notify::VoteResponse: openraft::core::raft_core::RaftCore<_, _, _, _>::handle_notify","now":"13:40:58.589791","resp":"{T1-N1:uncommitted, last_log:None}"},"target":"openraft::core::raft_core"}
{"timestamp":"2024-10-22T13:40:58.589815Z","level":"INFO","fields":{"message":"openraft::engine::engine_impl::Engine<_>::handle_vote_resp","resp":"{T1-N1:uncommitted, last_log:None}","target":"1","my_vote":"T1-N1:uncommitted","my_last_log_id":"Some(T0-N0-0)"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589825Z","level":"INFO","fields":{"message":"openraft::proposer::candidate::Candidate<_, _>::grant_by","voting":"{T1-N1:uncommitted@13:40:58.589654, last_log_id:T0-N0-0 progress:{1: true}}"},"target":"openraft::proposer::candidate"}
{"timestamp":"2024-10-22T13:40:58.589832Z","level":"INFO","fields":{"message":"a quorum granted my vote"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589837Z","level":"INFO","fields":{"message":"openraft::engine::engine_impl::Engine<_>::establish_leader"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589843Z","level":"INFO","fields":{"message":"vote is changing from T1-N1:uncommitted to T1-N1:committed"},"target":"openraft::engine::handler::vote_handler"}
{"timestamp":"2024-10-22T13:40:58.589850Z","level":"INFO","fields":{"message":"become leader","id":"1"},"target":"openraft::engine::handler::server_state_handler"}
{"timestamp":"2024-10-22T13:40:58.589856Z","level":"INFO","fields":{"message":"remove all replication"},"target":"openraft::core::raft_core"}
{"timestamp":"2024-10-22T13:40:58.589874Z","level":"INFO","fields":{"message":"rpc internal listening on 0.0.0.0:8100"},"target":"hiqlite::start"}
{"timestamp":"2024-10-22T13:40:58.589881Z","level":"INFO","fields":{"message":"received Notify::VoteResponse: openraft::core::raft_core::RaftCore<_, _, _, _>::handle_notify","now":"13:40:58.589881","resp":"{T1-N1:committed, last_log:None}"},"target":"openraft::core::raft_core"}
{"timestamp":"2024-10-22T13:40:58.589890Z","level":"INFO","fields":{"message":"openraft::engine::engine_impl::Engine<_>::handle_vote_resp","resp":"{T1-N1:committed, last_log:None}","target":"1","my_vote":"T1-N1:committed","my_last_log_id":"Some(T1-N1-1)"},"target":"openraft::engine::engine_impl"}
{"timestamp":"2024-10-22T13:40:58.589945Z","level":"INFO","fields":{"message":"api external listening on 0.0.0.0:8200"},"target":"hiqlite::start"}
{"timestamp":"2024-10-22T13:40:58.590017Z","level":"INFO","fields":{"message":"cache raft is already initialized - skipping become_cluster_member()"},"target":"hiqlite::init"}
{"timestamp":"2024-10-22T13:40:58.691427Z","level":"INFO","fields":{"message":"Client API WebSocket trying to connect to: http://localhost:8200/stream/cache"},"target":"hiqlite::client::stream"}
{"timestamp":"2024-10-22T13:40:58.691433Z","level":"INFO","fields":{"message":"Listen URL: {http|https}://0.0.0.0:{8080|8443}"},"target":"rauthy_models::app_state"}
{"timestamp":"2024-10-22T13:40:58.691498Z","level":"INFO","fields":{"message":"Public URL: localhost"},"target":"rauthy_models::app_state"}
Error: failed to connect to sqlite

Caused by:
    0: error returned from database: (code: 1544) attempt to write a readonly database
    1: (code: 1544) attempt to write a readonly database
```

It also seems like something changed between v0.25.0 and v0.26.0 with regard to the Docker image, but after tweaking various permissions for the data path, I don't think that is necessarily the issue.

```
$ docker inspect ghcr.io/sebadob/rauthy:0.26.0-lite | jq -r '.[].Config.User'
65532

$ docker inspect ghcr.io/sebadob/rauthy:0.25.0-lite | jq -r '.[].Config.User'
10001:10001
```

sebadob commented 1 month ago

This is most likely an issue with your setup / container config, because nothing has changed regarding the database between these versions.

Yes, that's true, I forgot to mention the user ID change in the Docker image. Starting with v0.26, Rauthy needs to use distroless base images instead of scratch, and the default rootless version uses a different user. I had not thought about this, because my setup worked out of the box without any issues. If you manage permissions manually, this would be an issue, right.

I will check whether the documentation needs updates.

Edit:

I just updated the release notes for v0.26.0 with the change in user ID.

sebadob commented 1 month ago

Thinking about it more, it may be preferable to change back to the original user ID when building the container, for all the cases with manual management, which mostly means plain Docker.

andoks commented 1 month ago

Thanks for the pointers! I indeed juggle users and permissions manually, as I have a setup binary that I run together with rauthy, in addition to volumes and configs. I will look further into it tomorrow and see if I can find what I am doing wrong =)

sebadob commented 1 month ago

So, I just tested it manually with Docker, and I have no issues as soon as I set the correct access rights:

(screenshot attached)
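
For reference, "correct access rights" here just means the mounted host directory is owned by the container user, which is 65532 in the v0.26.0/v0.26.1 rootless images. A minimal sketch, assuming a bind mount and example paths:

```
# Sketch only: pre-create the host directory, hand it to the container
# user (65532 in the v0.26.0/v0.26.1 rootless images), then bind-mount
# it over /app/data.
mkdir -p ./rauthy-data
sudo chown -R 65532:65532 ./rauthy-data

docker run --rm \
  -v "$(pwd)/rauthy-data:/app/data" \
  ghcr.io/sebadob/rauthy:0.26.1-lite
```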

I am not sure if I would want to revert to keep the docs in the current state, or update the docs and use the default user for the rootless image, which is probably preferable. I will think about it and check this tomorrow.

sebadob commented 1 month ago

I had another look into it and yes, the rootless image seems not to set the correct access rights for the initial /app/data. That's why I have no issues when creating and mounting the volume myself (or letting K8s manage permissions automatically), but it will error on startup if permissions have not been set up beforehand.

I will figure out why it does not set the access rights correctly and build a new image, probably a v0.26.2 then.

andoks commented 1 month ago

Great, thanks! I am still a bit puzzled about how to get around this when not using Kubernetes :sweat_smile:

sebadob commented 1 month ago

There is now v0.26.2 which reverts this change, so this should be working as expected for you.

sebadob commented 1 month ago

> Great, thanks! I am still a bit puzzled about how to get around this when not using Kubernetes 😅

Yeah, the distroless:nonroot image did some weird things with the access rights. Usually, I create and copy an empty directory from a builder image or the host so that it exists in the final image with the correct access rights. When you do a COPY, Docker usually assigns the correct ownership for the configured USER. But it did not work this way for the rootless image.

So I decided to move back to the original user:group which makes more sense anyway, because then you don't need to change anything and it's non-root anyway already.

andoks commented 1 month ago

I actually found a work-around for it, although it is not particularly elegant :sweat_smile:

```
FROM ghcr.io/sebadob/rauthy:0.26.1-lite AS rauthy

FROM ghcr.io/sebadob/rauthy:0.26.1-lite
ARG USER="65532:65532"
USER 65532:65532

COPY --from=rauthy --chown=$USER /app /app
```
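
With that Dockerfile, building and running it looks roughly like the following; the image tag and volume name are arbitrary, and I use a named volume so Docker seeds it from the now correctly owned /app/data in the image:

```
# Build the wrapper image from the Dockerfile above and run it with a
# named volume; an empty named volume is initialized from the image's
# /app/data, including the ownership fixed by the COPY --chown.
docker build -t rauthy-lite-workaround .
docker volume create rauthy-data
docker run --rm -v rauthy-data:/app/data rauthy-lite-workaround
```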

> So I decided to move back to the original user:group which makes more sense anyway, because then you don't need to change anything and it's non-root anyway already.

Either works for me. I can see why keeping it as in previous releases would be beneficial so as not to break anyone using the image, but using the "standard"(?) nonroot user also has its benefits by, well, being more standard.

sebadob commented 1 month ago

> (...) but using the "standard"(?) nonroot user also has its benefits by, well, being more standard.

Well, this actually has a drawback as well. For instance, if you have multiple containers on your host that all use the nonroot user, they would be able to access each other's data. So the best thing would actually be to have a unique user in each container, and I have not seen anything else using 10001:10001 in the wild so far.

In the end, both have the same rights, which are none, and the 10001 inside the container has no shell as well.

andoks commented 1 month ago

> Well, this actually has a drawback as well. For instance, if you have multiple containers on your host that all use the nonroot user, they would be able to access each other's data. So the best thing would actually be to have a unique user in each container, and I have not seen anything else using 10001:10001 in the wild so far.

Only if they share volumes I presume?

> In the end, both have the same rights, which are none...

:+1:

> ... and the 10001 inside the container has no shell as well.

I really like no-shell images for production settings. It was quite quick to debug using https://github.com/iximiuz/cdebug - adding that to my toolbelt now (since I do not have docker debug as of today) :smile:
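
For anyone else who ends up here, the basic cdebug invocation is something like the sketch below (syntax as I remember it from the cdebug README, so double-check against the current release):

```
# Attach a temporary debug shell to the running, shell-less container;
# "rauthy" is whatever name the container was started with.
cdebug exec -it rauthy
```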

sebadob commented 1 month ago

I exclusively use no-shell images in production. There should not be any need to exec into a prod container imho.

Didn't know about cdebug, but I will have a look at it for sure, thanks!

sebadob commented 1 month ago

> Only if they share volumes I presume?

Sorry, yes, but also if someone has access to the underlying host with all of these volumes.

andoks commented 1 month ago

Just wanted to confirm that the change you made for v0.26.2 made everything work correctly again :+1: