bacalhau-project / bacalhau

Compute over Data framework for public, transparent, and optionally verifiable computation
https://docs.bacalhau.org
Apache License 2.0

Epic: Client & Server Auto-Update and Notifications #3878

Open frrist opened 6 months ago

frrist commented 6 months ago

Description

There are several scenarios here, expressed in the following table:

| | Client Behind Server | Server Behind Client | Both Behind Endpoint |
| --- | --- | --- | --- |
| Client Behavior | Notify (only) user of need for client update | Notify (only) user of need for server update; Allow skew via flag | Notify (only) user of need for client update; Notify (only) user of need for server update; “Turn off all notifications on server and all clients that contact me” |
| Server Behavior | Notify (only) user of update on contact; Allow skew via flag (server or client) | Notify (only) user of need for server update; Notify SRE of skew in logs; Allow auto-update | Notify (only) user of need for server update; Notify SRE of skew in logs; Allow auto-update |
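
The table can also be read as a lookup from (side, scenario) to a list of actions. The following is only a sketch of that reading, not code from the bacalhau repository: the type names, constants, and `actionsFor` helper are hypothetical, and the action strings are copied from the table cells.

```go
package main

import "fmt"

// Side and Scenario are hypothetical names for the table's rows and columns.
type Side string
type Scenario string

const (
	Client Side = "client"
	Server Side = "server"

	ClientBehindServer Scenario = "client behind server"
	ServerBehindClient Scenario = "server behind client"
	BothBehindEndpoint Scenario = "both behind endpoint"
)

// behavior mirrors the table: one list of actions per (side, scenario) cell.
var behavior = map[Side]map[Scenario][]string{
	Client: {
		ClientBehindServer: {"notify user of need for client update"},
		ServerBehindClient: {
			"notify user of need for server update",
			"allow skew via flag",
		},
		BothBehindEndpoint: {
			"notify user of need for client update",
			"notify user of need for server update",
			"turn off all notifications on server and all clients that contact me",
		},
	},
	Server: {
		ClientBehindServer: {
			"notify user of update on contact",
			"allow skew via flag (server or client)",
		},
		ServerBehindClient: {
			"notify user of need for server update",
			"notify SRE of skew in logs",
			"allow auto-update",
		},
		BothBehindEndpoint: {
			"notify user of need for server update",
			"notify SRE of skew in logs",
			"allow auto-update",
		},
	},
}

// actionsFor returns the actions for one table cell, or nil for an unknown cell.
func actionsFor(side Side, sc Scenario) []string {
	return behavior[side][sc]
}

func main() {
	for _, a := range actionsFor(Server, ServerBehindClient) {
		fmt.Println("-", a)
	}
}
```

Keeping the matrix as data rather than nested conditionals would make it easy to adjust a single cell as the acceptance criteria change.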

Acceptance Criteria

- [ ] https://github.com/bacalhau-project/expanso-planning/issues/237
- [ ] https://github.com/bacalhau-project/expanso-planning/issues/236
- [ ] https://github.com/bacalhau-project/expanso-planning/issues/238
- [ ] https://github.com/bacalhau-project/expanso-planning/issues/239
- [ ] https://github.com/bacalhau-project/expanso-planning/issues/240
- [ ] https://github.com/bacalhau-project/expanso-planning/issues/241
- [ ] https://github.com/bacalhau-project/expanso-planning/issues/242
- [ ] https://github.com/bacalhau-project/bacalhau/issues/3877
- [ ] https://github.com/bacalhau-project/bacalhau/issues/3876
- [ ] https://github.com/bacalhau-project/bacalhau/issues/3875

Initial Draft

At present, users are presented with a text prompt in the CLI when the version of bacalhau installed on their machine does not match the version metadata hosted on http://update.bacalhau.org/version. The prompt directs users to run `curl -sL 'https://get.bacalhau.org/install.sh?dl=fac3c600-9660-5928-8b9b-8437f13bf9d0' | bash` on their machine in order to update bacalhau. This is a manual process.

The current flow presents several challenges to users deploying bacalhau through a cloud marketplace:

- The version of bacalhau described at http://update.bacalhau.org/version will drift away from the version the user deployed on the marketplace. Their client will then receive the aforementioned update message. They shouldn't follow this prompt, because:
  - If users update the version of bacalhau they are using locally and do not update the version of bacalhau deployed to the marketplace, their client will fail to communicate with their hosted marketplace version due to version mismatch.
- To update the version of bacalhau deployed from the marketplace, users must ssh to each node, manually update the version, and restart the service. This will not scale for users with many nodes.

I believe there are several ways to address the challenges presented by the current workflow:

1. Associate the client version with the version deployed in the marketplace for each user: instead of polling `update.bacalhau.org/version` for version metadata, the client polls a URL specific to the marketplace deployment for update status.
2. Implement auto-update functionality for bacalhau deployments via tools like https://github.com/minio/selfupdate or https://github.com/sanbornm/go-selfupdate. Users could trigger a fleet-wide update via an API command.
3. Permit clients to operate with servers whose versions differ from their own.

Related: https://github.com/bacalhau-project/bacalhau/issues/3163

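
As a rough illustration of option 1, the client could prefer a deployment-specific version URL and fall back to the shared endpoint only when none is configured. This is a sketch under stated assumptions: `versionEndpoint` and `fetchVersion` are hypothetical helpers, the deployment URL setting does not exist today, and the response is treated as an opaque plain-text string because the real payload format is not described in this issue.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// defaultVersionURL is the shared endpoint named in this issue.
const defaultVersionURL = "http://update.bacalhau.org/version"

// versionEndpoint prefers a deployment-specific URL (a hypothetical setting)
// and falls back to the shared endpoint when none is configured.
func versionEndpoint(deploymentURL string) string {
	if u := strings.TrimSpace(deploymentURL); u != "" {
		return u
	}
	return defaultVersionURL
}

// fetchVersion returns the response body as an opaque version string.
// Assumption: the endpoint answers with plain text; the real format may differ.
func fetchVersion(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s from %s", resp.Status, url)
	}
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<16))
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(body)), nil
}

func main() {
	// Local stand-in for a marketplace deployment's own version endpoint.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "v1.0.0")
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	v, err := fetchVersion(client, versionEndpoint(srv.URL))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("deployment reports version", v)
}
```
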
aronchick commented 6 months ago

I think the most cheap and cheerful way to go about this (for now) is just to inform and allow a flag override:

That's P0.

P1 is implementing auto-updating (I think we'll need it anyway), but it doesn't prevent the above.
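
A minimal sketch of the P0 idea (inform, and allow a flag override), assuming that a mismatch is an error by default, as the initial draft describes. The names `checkSkew`, `allowSkew`, and `ErrVersionSkew` are hypothetical and are not existing bacalhau flags or APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrVersionSkew is a hypothetical sentinel for a client/server mismatch.
var ErrVersionSkew = errors.New("client and server versions differ")

// checkSkew always returns a warning when the versions differ. It returns an
// error as well unless the (hypothetical) allowSkew override is set.
func checkSkew(clientVersion, serverVersion string, allowSkew bool) (string, error) {
	if clientVersion == serverVersion {
		return "", nil
	}
	warning := fmt.Sprintf(
		"client %s differs from server %s; this is not recommended",
		clientVersion, serverVersion,
	)
	if allowSkew {
		return warning, nil
	}
	return warning, ErrVersionSkew
}

func main() {
	// Without the override the mismatch is reported and rejected.
	warning, err := checkSkew("v1.0.0", "v1.1.0", false)
	fmt.Println(warning, "| error:", err)

	// With the override the user is informed but allowed to continue.
	warning, err = checkSkew("v1.0.0", "v1.1.0", true)
	fmt.Println(warning, "| error:", err)
}
```

Note that this compares the client against the server it is connected to, not against the update endpoint.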

frrist commented 6 months ago

You're connected with a client that differs from the server. This is not recommended! Please run 'curl .... | bash VERSION to download and run a version that syncs

One problem with this approach is the message isn't technically correct (and should NOT be followed) since the version metadata in http://update.bacalhau.org/version can differ from both the client and the server. For example:

  1. user deploys version A from marketplace.
  2. user downloads client version A.
  3. we release a new bacalhau version B.
  4. the client and server (deployment) are now both out of date with latest.
  5. client sees the "You're connected with a client that differs from the server. This is not recommended!" message.

The message in step 5 should not be followed by the client, since the server and client are still on the same version. What differs is the data being read from http://update.bacalhau.org/version, which is distinct from any deployments the user made in the marketplace.

Things are this way because the server isn't what tells the client it's out of date; a separate service is.
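
To make the distinction concrete, here is a sketch that compares the client against the server first and consults the endpoint's version only when the two agree. It assumes versions of the form `vMAJOR.MINOR.PATCH`; the `Skew` type, its constants, and the `classify` helper are hypothetical and not part of bacalhau.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Skew names the scenarios discussed in this thread (hypothetical type).
type Skew string

const (
	InSync             Skew = "in sync"
	ClientBehindServer Skew = "client behind server"
	ServerBehindClient Skew = "server behind client"
	BothBehindEndpoint Skew = "both behind endpoint"
)

// parse turns "v1.2.3" or "1.2.3" into three integers.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// compare returns -1, 0, or 1 when a is older than, equal to, or newer than b.
func compare(a, b [3]int) int {
	for i := range a {
		if a[i] < b[i] {
			return -1
		}
		if a[i] > b[i] {
			return 1
		}
	}
	return 0
}

// classify checks client/server skew first; the endpoint's version only
// matters when client and server already agree with each other.
func classify(client, server, latest string) (Skew, error) {
	c, err := parse(client)
	if err != nil {
		return "", err
	}
	s, err := parse(server)
	if err != nil {
		return "", err
	}
	l, err := parse(latest)
	if err != nil {
		return "", err
	}
	switch compare(c, s) {
	case -1:
		return ClientBehindServer, nil
	case 1:
		return ServerBehindClient, nil
	}
	if compare(c, l) < 0 {
		return BothBehindEndpoint, nil
	}
	return InSync, nil
}

func main() {
	// Steps 1-4 above: client and server on version A, endpoint reports B.
	skew, err := classify("v1.0.0", "v1.0.0", "v1.1.0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(skew) // both behind endpoint
}
```

With this ordering, the scenario in steps 1 to 5 would be reported as both sides being behind the endpoint rather than as a client/server mismatch.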

aronchick commented 6 months ago

Client doesn't see the "you're out of date with the server" message until it connects to a server.

That said, this is just screaming that we're going to need more tolerant versioning skew support soon.

We need a table laying out what the UX is for each situation - 3x3

frrist commented 6 months ago

| | Client Behind Server | Server Behind Client | Both Behind Endpoint |
| --- | --- | --- | --- |
| Client Behavior | Notify (only) user of need for client update | Notify (only) user of need for server update; Allow skew via flag | Notify (only) user of need for client update; Notify (only) user of need for server update; “Turn off all notifications on server and all clients that contact me” |
| Server Behavior | Notify (only) user of update on contact; Allow skew via flag (server or client) | Notify (only) user of need for server update; Notify SRE of skew in logs; Allow auto-update | Notify (only) user of need for server update; Notify SRE of skew in logs; Allow auto-update |

Will update issue with this table and include acceptance criteria from our chat.

wdbaruni commented 2 months ago

@aronchick @frrist what is the status and priority of the remaining issues in this epic?

frrist commented 2 months ago

The work that remains is:

wdbaruni commented 2 months ago

That looks like a much bigger scope. Do you mind closing this issue and opening a new one for the self-update feature?