TBD54566975 / ftl

FTL - Towards a 𝝺-calculus for large-scale systems
https://tbd54566975.github.io/ftl/
Apache License 2.0
20 stars 7 forks source link

Replace `json.Decoder` in `go-runtime/encoding` with completely custom JSON decoder/encoder #1432

Open deniseli opened 4 months ago

deniseli commented 4 months ago

encoding.go relies heavily on json.Decoder (link) to decode a JSON blob into a reflect.Value.

Problems with the current approach:

1. json.Decoder does not support Peek() via its public interface. In order to read the next token in the JSON string, we need to call d.Token(), which pops a token off the buffer. At the same time, we rely on calling d.Decode(&value) to decode basic types, and Decode() itself pops the token that it decodes into value. Suppose you need to split off a branch case for handling nulls, and if not null, then call Decode(). How do you find out that the next token is null? If you call Token(), then you will have removed that token from the buffer, and Decode() won't be able to decode it. Currently, to work around this, we've built a dirty local implementation of Peek() that scans ahead for delimiters, but having this sit downstream of the decoder itself is very hacky and will incur maintenance cost.

2. We are mixing stdlib automagic with custom decoding logic and may go out of sync (technically we already are out of sync) The purpose of the go-runtime encoding package is to encode/decode FTL-supported types into/from JSON. We need custom logic for some FTL types (e.g. ftl.Option), so we definitely cannot rely purely on the std JSON tooling in Go. Now that we rely on both, we have an issue where the stdlib JSON encoding logic comes with assumptions that we break (e.g. we do not respect field naming from Go struct tags), and vice versus (our custom logic necessarily builds in assumptions that are inconsistent with stdlib's behavior). For example, our encoding package lowerCamelCases field keys, whereas stdlib's json takes the struct field name directly. Yet, we cannot tell FTL users to completely throw stdlib JSON assumptions out the window, since we do use it for some parts of our parsing. However, this leads to them trusting all the stdlib assumptions even when they shouldn't. For example, pfi sets JSON field names using Go tags (e.g. `json:"customname"`), but our custom encoding logic doesn't actually use the field names provided. No one seems to have noticed yet.

3. Is an iterative buffer approach really the best design pattern for managing encoding of a multitude of types? Switching to a more declarative approach (e.g. registering parsible types with handlers and classifiers) would scale better, especially as we consider type widening: https://github.com/TBD54566975/ftl/issues/1296

deniseli commented 4 months ago

We should discuss prioritization in triage