wandb / weave

Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
https://wandb.me/weave
Apache License 2.0
659 stars 49 forks source link

DNM: fix: Clients now correctly quote/unquote ref parts #2145

Closed tssweeney closed 2 weeks ago

tssweeney commented 4 weeks ago

Summary: This PR hopefully 🤞 solves >90% of our remaining Ref/Naming issues - meaning users should be able to freely create entities/projects (as long as Gorilla accepts them) and name their objects/ops anything they want (baring a % symbol). Under the hood, I close a few remaining constraints / parsing issues on the server side and mostly modify the UI and Python clients to correctly read/write quoted refs.


High Level Improvements:

  1. Python Client will always produce valid refs (ie. it correctly quotes when needed)
    • Benefit to user: no more crashed python scripts when bad chars are in the name
    • Note: the one case of % in the name is still illegal. We sanitize on the python client for this specific case.
  2. Python Client is able to consume quoted refs (ie. it correctly unquotes when needed)
    • Benefit to user: able to correctly walk ref data edges with bad chars in the name
  3. UI Client will always produce valid refs (ie. it correctly encodes when needed)
    • Benefit to user: able to link safely when dealing with bad characters
  4. UI Client is able to consume encoded refs (ie. it correctly decodes when needed)
    • Benefit to user: able to correctly display links with bad characters

Smaller Details:

Companion PR: https://github.com/wandb/core/pull/23333


This PR finishes up the Ref work to ensure we support any ref the user throws at us.

Context: In a recent PR: https://github.com/wandb/weave/pull/2077, I implemented a more robust parser and checker to ensure Refs and Names conform to expectations. This stopped the "bleeding" where we were storing invalid Refs.

However, our two clients Python and UI had not been updated to reflect the new rules. In fact, the Python client was still creating bad refs, and the UI was overly restrictive with what it thought a ref could be. So double edge sword. As a result, users still hit crashes (either python write time or ui read time).

Moreover, our Ref parsing logic is a mess (especially in the UI) as it is spread across many locations & riddled with legacy junk.

A little terminology note: in PY, URL escaping is called quoting and in TS it is encoding. When used in this PR, they follow the language-specific convention, but logically mean the same concept.

Understanding Refs: See https://www.notion.so/wandbai/WIP-Robust-Refs-550bf06f4a244a1ab4fdd26133952e2c for more details.

The important part to understand is the ref format:

Calls:
    weave:///<entity>/<project>/call/<callid>[extra]
Ops:
    weave:///<entity>/<project>/op/<name>:<digest>[extra]
Objects:
    weave:///<entity>/<project>/object/<name>:<digest>[extra]
Tables:
    weave:///<entity>/<project>/table/<digest>[extra]
Where:
[extra] is an optional set of slash-seperated tuples of edges. for example:
    /key/key_name_1/index/0
They always follow the pattern of [edgeName/edgeValue], where 
`edgeName` is a shortlist of 4 specific edges names.

Furthermore, entity, project, name and the edgeValues are user-controlled.

The main constraints we need to parse these refs are:

  1. entity cannot contain slashes /
  2. project cannot contain slashes /
  3. name cannot contain slashes / or colons :
  4. edgeValue cannot contain slashes /

Therefore, we need to basically URL encode/decode each of these parts in all the places that they are used


Deployment The ideal order would be:

  1. The companion PR (land and deploy): https://github.com/wandb/core/pull/23333
  2. Secondly, split out the trace server changes from this PR (land and deploy)
  3. Then we can land & deploy/release UI and python changes in any order
  4. Make sure prod is good for few days
  5. Rollout to dedicated
circle-job-mirror[bot] commented 4 weeks ago

Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=672d9cc574b1ef2a7f0e08b74e1374d110180feb