wI2L / jettison

Highly configurable, fast JSON encoder for Go
https://pkg.go.dev/github.com/wI2L/jettison
MIT License
174 stars 13 forks source link

AllowList appear to be propagated #8

Closed smyrman closed 1 year ago

smyrman commented 1 year ago

From the docs it seams that field selection through an AllowList, should only apply to first-level elements:

AllowList sets the list of first-level fields which are to be considered when encoding a struct. The fields are identified by the name that is used in the final JSON payload. See DenyFields documentation for more information regarding joint use with this option.

However, it looks like they are applied to nested struct elements as well:

https://go.dev/play/p/a89nZXNQ7HD

wI2L commented 1 year ago

There might be a misunderstanding on your part, or I am not understanding what you are pointing out.

In the example, the followings fields are allowed: "Z", "β", "Gamma", "π"

Does that make sense to you, or am I missing something ?

Regarding the doc, it indicates that you could not, for example, allow the Omega int json:"ω" field, if its parent (struct Z) field Z in struct X would not, because from the point of view of the marshaled object x of type struct X{}, the ω (omega) field is at level x.Z.Omega.

smyrman commented 1 year ago

Z is a struct, containing two fields, so it is marshaled, and the two fields appears in the result.

No -- actually only one field appear in the result.

So, to be more clear, here is the program output.

First line: Without allow list Marshal(x) gives:

{"Z":{"ω":42,"Z":3},"α":"1","β":"2","Gamma":"3","π":"4"}

Second line: With allow list "Z", "β", "Gamma", "π", Marshal(x) gives:

{"Z":{"Z":3},"β":"2","Gamma":"3","π":"4"}

While I had expected it to include "ω" in Z (as I expected, from the docs, that only top-level fields in X should be filtered):

{"Z":{"ω":42,"Z":3},"β":"2","Gamma":"3","π":"4"}

The example is inherited from the official example in the docs, which include the same code except the Z struct type does not embed a Z field.

smyrman commented 1 year ago

Here is a simpler example: https://go.dev/play/p/ZrXaMy0AJ9d

package main

import (
    "fmt"
    "log"

    "github.com/wI2L/jettison"
)

type X struct {
    F1 Z      `json:"f1"`
    F2 string `json:"f2"`
}

type Z struct {
    F1 string `json:"f1"`
    F2 string `json:"f2"`
}

func main() {
    x := X{
        F1: Z{
            F1: "include",
            F2: "include",
        },
        F2: "exclude",
    }
    b, err := jettison.MarshalOpts(x, jettison.AllowList([]string{"f1"}))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(b))
}

Expect:

{"f1":{"f1":"include","f2":"include"}}

Got:

{"f1":{"f1":"include"}}

FWIW, I also tried with a m map type (map[string]any), but that actually appear to ignore the AllowList all together: https://go.dev/play/p/GpG1yD9hCC9

wI2L commented 1 year ago

No -- actually only one field appear in the result.

You're right, my bad, I misread the example and skipped the fact that the first line of the output is the result of the encoding with no options actually.

The function responsible for filtering the fields by their name based on the Deny/Allow lists is isDeniedField, called here, which currently doesn't care about field-level.

Regarding the feature and what the docs says, I went back to the commit history (to refresh my memory), and the behavior described in the doc was available in https://github.com/wI2L/jettison/commit/9cd1614f6a8180eedfeba0df4e4fabddad3f7d32, when it was only possible to whitelist fields with WithFields.

Later on, in a following commit (https://github.com/wI2L/jettison/commit/94e9fa2473eabc41905d1959d2e549227e3a5d14#diff-e66fb2b3e2ed56aa1614ff2ff129bfd9c48111d3ec9e222e196ec18022750eb4R201), I added the DenyList option, renamed WithFields to AllowList and reverted the behavior of filtering only first-level fields, to allow more flexibility.

So the doc is outdated in regard to those changes -- I will update it to remove the misleading part.


About your own usecase, is that actually something that is actually a blocker ?

Filtering by field name with no depth level indication (either numeric, or via a full JSON path) implicates that fields with duplicate names could not be differentiated with the current implem.

That's something that I considered, like using JSON paths for the allow/deny list to describe the fields to filter, but never implemented in regard to performance; while it would be cheap to resolve the paths for struct fields at any levels during the "prep" phase, it would not be possible for arrays and maps, since there is no way to know the slice/map index/key in advance.

smyrman commented 1 year ago

About your own usecase, is that actually something that is actually a blocker ?

No I am not blocked. I am not currently using this library. I was browsing for options, and came across this library.

smyrman commented 1 year ago

For a simple one-level only implementation (no json path support), you could inherit the encOptions (and remove the allowList/denyList) each time you encode attributes.

For this lib, that would mean doing something like this:

type encOpts struct {
    ctx         context.Context
    timeLayout  string
    durationFmt DurationFmt
    flags       bitmask
    allowList   stringSet  
    denyList    stringSet
}

func (opts encOpts) forAtttr(name string) encOpts {
    opts.allowList = nil // utilize that opts is not a pointer receiver.
    opts.denyList = nil // utilize that opts is not a pointer receiver.
    return opts
}

func (opts encOpts) forIndex(i int) encOpts {
    // do nothing for now.
    retur opts
}

All instances where you marshal object or array attributes, you must inherit the options. E.g. in encodeUnsortedMap, do:

for ; it.key != nil; mapiternext(it) {
        if !opts.allowAttr(it.key) {
            continue
        }
        if n != 0 {
            dst = append(dst, ',')
        }
        kOpts := opts.forAttr(it.key)

        // Encode entry's key.
        if dst, err = ki(it.key, dst, kOpts); err != nil {
            return dst, err
        }
        dst = append(dst, ':')

        // Encode entry's value.
        if dst, err = vi(it.val, dst, kOpts); err != nil {
            return dst, err
        }
        n++
    }

This ~extension~ design could be extended later to support json path matching by altering the forAttr and forIndex methods.

func (opts encOpts) forAttr(name string) encOpts {
    opts.allowList =  opts.allowList.forAttr(name) // returns a new filtered map or nil.
    opts.denyList = opts.denyList.forAttr(name)  // returns a new filtered map or nil.
    return opts
}

func (opts encOpts) forIndex(i int) encOpts {
    opts.allowList =  opts.allowList.forIndex(i)  // returns a new filtered map or nil.
    opts.denyList = opts.denyList.forIndex(i)  // returns a new filtered map or nil.
    return opts
smyrman commented 1 year ago

btw, I realize I didn't fully explain how the opts.allowList.forAttr(name) should work for the JSON path (or much simpler, path joined by dot) variant.

func (stringSet) forAttr(attr string) stringSet {
    next := make(stringSet)
    for k := range stringSet {
        prefix, nk, ok := strings.Cut(k, ".")
        if !ok {
            continue
        }
        if prefix != attr {
            continue
        }
        next[nk] = struct{}{} // e.g. if `attr` is "f1" and k is "f1.f2", then add "f2".
    } 
}

This means that for each time you encode the next level of your hierarchy, you call it with the options that apply to the next level only. Could also allow more advanced matching, such as "*.f2" -> "f2". etc.

wI2L commented 1 year ago

@smyrman

That looks like reasonable implem for simple dot-notation path matching. However if we want more complex matching, then JSONPatch should be used. Let's continue the discussion in the issue I cross-referenced.