huandu / go-clone

Clone any Go data structure deeply and thoroughly.
https://pkg.go.dev/github.com/huandu/go-clone
MIT License
305 stars 28 forks source link

CustomFunc should allow Ptr types as well #23

Open jayaprabhakar opened 6 months ago

jayaprabhakar commented 6 months ago

CustomFunc are helpful in creating custom clone functions. However, it doesn't work well with Ptrs. Because when adding *Struct, it gets resolved to Struct and similarly when looking up from the map, only the Struct is used.

There are some cases, when custom pointers would be helpful.

For example, in my case, under certain circumstances, I want the ptrs to reuse the same reference and sometimes not. I am working on a model checker, where the spec might say, multiple alternate paths a process can take, and we would want to explore both the paths. So, from the parent, I would create separate clone of entire variable spaces for each fork, but within each fork I would want to reuse the variable.

Marking it opaque pointer doesn't help, as I would want to actually clone in some cases. So, the ideal solution would be to make allocator.SetCustomFunc(...) to not dereference the Ptr type, and when checking ptrs, and use the custom func when present, before dereferencing the ptr

huandu commented 6 months ago

SetCustomFunc is designed to set custom clone function for a struct, not other types. To support custom function for any type may take too much impact on performance.

One possible way to implement the feature you mentioned could be that you can consider to define a new struct to wrap the pointer and implement the logic on this new struct.

Something like following.

type MyPointer struct {
    *ActualPointer
}

Here is a workable sample for you. Hope it can help you.

type T struct {
    shouldClone bool
    data        []string
}

type Pointer struct {
    *T
}

values := map[string]Pointer{
    "shouldClone": {
        T: &T{
            shouldClone: true,
            data:        []string{"a", "b", "c"},
        },
    },
    "shouldNotClone": {
        T: &T{
            shouldClone: false,
            data:        []string{"a", "b", "c"},
        },
    },
}
clone.SetCustomFunc(reflect.TypeOf(Pointer{}), func(allocator *clone.Allocator, old, new reflect.Value) {
    p := old.Interface().(Pointer)

    if p.shouldClone {
        np := allocator.Clone(old).Interface().(Pointer)

        // Update the cloned value to make the change very obvious.
        np.data = append(np.data, "cloned")
        new.Set(reflect.ValueOf(np))
    } else {
        new.Set(old)
    }
})

cloned := clone.Clone(values).(map[string]Pointer)
fmt.Println(cloned["shouldClone"].data)
fmt.Println(cloned["shouldNotClone"].data)

// Output:
// [a b c cloned]
// [a b c]

Live demo: https://go.dev/play/p/3V6VoHPRabd

jayaprabhakar commented 6 months ago

Thanks. I did consider this solution, but that would work if clone is the only thing the application does. But there are lot more stuff that goes on, and having to use a wrapper in place of a pointer (Note type alias doesn't work), at every code path using this is a pain.

I would love to hear more on why you think there will be a significant performance hit for supporting custom function for pointers to struct? Are you worried, supporting custom functions to pointers to struct would impact non-users of the feature as well or would it slow down only when the feature is used?

jayaprabhakar commented 6 months ago

A quick hack. https://github.com/huandu/go-clone/pull/24

I did not implement this for other cases like cloneSlowly or other variants. I would love to implement it if you are okay with this approach. Do let me know.

huandu commented 6 months ago

why you think there will be a significant performance hit for supporting custom function for pointers to struct?

If the SetCustomFunc supports both struct and pointer, it means the method Allocattor.loadStructType will be called for every pointer visited. As you can see, it uses several reflect methods to do its job and should not be called many times. I'm aware of calling this method too many times.

A quick hack. #24

I'll review it later, maybe next Tue. I'm on vacation now. If it's possible, can you please tell me more about your use case with some key code? It can help me to find out the best soluttion.

jayaprabhakar commented 6 months ago

I want a way to retain the relationship between the pointer values, when cloning. From my understanding cloneSlowly function does that for a single clone operation. For me, I need a way to be able to use the same visited graph if possible across multiple invocations.

huandu commented 6 months ago

@jayaprabhakar I got your point. I think it's OK to add some features in Allocator to provide more powerful APIs with some performance penalty for complex use cases.

I'll submit a PR soon to address your need.

jayaprabhakar commented 6 months ago

Thanks a lot.

For additional context on what I am trying to build - I am implementing a new model checker - the first step of it is state space exploration. For example: https://fizzbee.io/tutorials/probabilistic-modeling/#simulating-a-6-sided-die-with-a-fair-coin (Simulate a fair 6-sided die with a fair coin, fizzbee will explore every possible path.)

To do that, every fork points where we have to choose a path between multiple options, I will deep clone the entire state-space, and evaluate each path separately one after another until no new state is found. Unfortunately cloning the state-space is not that simple. For example: a parent struct, has multiple fields. I want to clone a few of them, but not all. Each of these fields might be struct ptrs that themselves have some fields that should be cloned and some don't. In addition to that, I would have to preserve the relationship between the fields. If two references/ptrs point to same object before clone, then after cloning, both the ptrs should refer to a common new cloned object. To make it even harder, I use starlark to maintain python-like syntax. In starlark, after executing an expression, starlark freezes the value (so it doesn't allow any new modification). This cannot be simply cloned but I have to use custom functions.

With these, if I clone the container/parent/root object with clone.Slowly(), it won't work simply, because I need to add custom function to skip a few fields. But those individual fields must be deep cloned as well. But it doesn't have access to the same visitor graph :( So, clone.Slowly here will not be sufficient)

So, I think, I would need a way to reuse the visited map in cloneState. But I haven't fully gone through the code.

type cloneState struct {
    allocator *Allocator
    visited   visitMap
    invalid   invalidPointers

    // The value that should not be cloned by custom func.
    // It's useful to avoid infinite loop when custom func calls allocator.Clone().
    skipCustomFuncValue reflect.Value
}
jayaprabhakar commented 4 months ago

Hi @huandu, did you get a chance to work on this? Or can you give some guidance on how to do it?

huandu commented 4 months ago

@jayaprabhakar Sorry for the delay. I'm super busy these days and have no time to finish this change. I'll try to spare some time on this weekend.

jayaprabhakar commented 4 months ago

No problem, let me know if there is something I can do to make it easy for you to finish it.

I am facing an issue with Cloning a specific object type. I use another library starlark extensively, and when I do clone, for one of the types alone, the clone is not working.

m1 := starlark.NewDict(5)
m2 := clone.Clone(m1).(*starlark.Dict)

m2.SetKey(starlark.MakeInt(2), starlark.MakeInt(1))
fmt.Println("added to clone", m1, m2)

https://go.dev/play/p/SRQMnBef26Q

In this case, value added to m2 is ignored. I couldn't figure out exactly what causes this. The code for the library is here, https://github.com/google/starlark-go/blob/046347dcd1044f5e568fcf64884b0344f27910c0/starlark/value.go#L841

Is there some specific feature that is causing this issue.

I can try again using Custom Fn but the issue would be with deduplication. I want to use clone.Slowly(), and I wanted to let you know so if this is a missing feature or a bug in the clone implementation.