Open n-e opened 3 years ago
Thank you for opening this issue and discovering such an edge case!
Scany indeed iterates over all fields recursively in `getColumnToFieldIndexMap()` and gets into an infinite loop with such a struct.
To solve that issue we can introduce a threshold, e.g. 2 or 3, for how many recursive levels are allowed. During the breadth-first traversal we need to count the number of ancestors that have the same type as the current element and ensure that count is not bigger than the threshold; otherwise we skip that element and don't traverse further.
Let me know whether that solution would solve your problem, or feel free to propose an alternative.
Hi,
Thanks for your answer. Indeed that would solve my problem!
Hi, is there any development on this issue? I ran into the same problem today.
Hi @CyganFx, there was an attempt to add this feature to scany: https://github.com/georgysavva/scany/pull/60, but the PR went inactive. I am not currently working on this feature. If you are interested in helping, I would gladly accept a PR from you.
`getColumnToFieldIndexMap()` could alternatively return `map[reflect.Type]map[string]int`. The final `map[string][]int` could then be built in the `structScan()` method while iterating through the column `dbTags`.
The approach would be fairly complicated compared to limiting the recursion, but it would be fully cacheable and safe for concurrent use with `sync.Map`.
Ran into this issue as well.
Thanks for the feedback, guys. Would any of you like to tackle this?
I could try to tackle this issue in the near future. `sqlx/reflectx/reflect.go` could be used as a point of reference.
Should we implement caching of `getColumnToFieldIndexMap()` at the same time?
`sync.Map` would work out of the box fairly well, but it doesn't scale as well as some third-party synchronized maps. `puzpuzpuz/xsync.MapOf` would scale better since it splits the map into buckets, and the buckets hold the locks instead of `sync.Map` having a lock on the entire map. But third-party map implementations can't directly hash `reflect.Type`, and achieving native performance with them requires a few hacks with `unsafe.Pointer`, as shown in this blog post: https://www.dolthub.com/blog/2022-12-19-maphash/
@anton7r thank you for deciding to tackle this issue. I appreciate that.
Yes, you are also welcome to add caching for the reflection map. Here is an issue requesting exactly that: https://github.com/georgysavva/scany/issues/25. Just do it as a separate PR because the feature requested in this PR and map caching aren't related.
For caching, I would go with the default `sync.Map` implementation or just a native `map[..]...` protected by a `sync.RWMutex{}`. A single global lock with no further optimization is okay, because the lock is only taken for writing when a new type definition is first used with scany. Since the number of types in an application is finite, I expect most of them to get cached while the application starts serving incoming requests. After that, all operations on the cache map are read-only, so `sync.RWMutex{}` doesn't prevent them from running in parallel. Correct me if I am wrong here.
Hi,
I'd like to scan into a recursive struct:
This causes an infinite loop, as scany introspects the struct recursively.
Any tips for achieving that? I might be missing something obvious since I'm new to Go and scany. Ideally I'd like not to touch the Node struct definition, since in my real-life code it is auto-generated. I'd also like to avoid using `db:"-"` on Parent, since I might want to fetch the immediate parent with a join. Thanks!