mahmoud / glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
https://glom.readthedocs.io

Recursive wildcard produces a StopIteration exception when called on objects containing iterators #254

Open lululaplap opened 1 year ago

lululaplap commented 1 year ago

Hello,

I am trying to use the recursive wildcard feature on a nested dictionary. Some of the objects in the dictionary have attributes which are iterators. However, after the glom search the iterators produce a StopIteration exception on the next call to next. I have produced a minimal example below:

import glom
import pprint

class MyData:
    def __init__(self):
        self.data = iter([1,2,3])

b = MyData()
target = {"A": {"b": b}}
spec = "A.**"
result = glom.glom(target, spec)
pprint.pprint(result)
next(b.data)

Which produces:

[{'b': <__main__.MyData object at 0x7f7b06390550>},
 <__main__.MyData object at 0x7f7b06390550>,
 <list_iterator object at 0x7f7b06390f70>,
 1,
 2,
 3]
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[48], line 6
      4 result = glom.glom(target, spec)
      5 pprint.pprint(result)
----> 6 next(my_iter.data)

StopIteration: 

Is this the expected behaviour? If so, is it possible to limit how far the recursion goes? I would like to be able to limit it to only look inside the nested dictionary and not the objects contained within.

Many thanks, Lewis

kurtbrose commented 1 year ago

That is kind of expected behavior. ** is going to keep going as deep as it can.
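To illustrate why the iterator ends up empty, here is a minimal pure-Python sketch of a deep walk that descends into anything iterable. It is not glom's actual implementation; the `walk` helper is hypothetical and only mimics the "go as deep as it can" behaviour described above.

```python
def walk(node):
    """Naive deep walk: yield the node, then descend into anything iterable."""
    yield node
    if isinstance(node, (str, bytes)):
        return  # treat strings as leaves to avoid infinite recursion
    try:
        children = iter(node)  # for an iterator, this returns the iterator itself
    except TypeError:
        return  # not iterable, so it is a leaf
    for child in children:
        yield from walk(child)


data = iter([1, 2, 3])
seen = list(walk(data))             # walking consumes the iterator's items
leftover = next(data, "exhausted")  # nothing is left for the caller
```

An iterator can only be traversed once, so any traversal that looks inside it leaves nothing for a later `next` call.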

If you know the depth that you want to go to, a *.*.* type path can work to avoid going too far.

Switch, Match and Flatten could be combined to put a little bit of logic around which types of things to iterate on.


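from glom import Auto, Flatten, Match, Ref, Switch

# Recursive spec named 'unpack'; the inner Ref('unpack') refers back to it.
Ref('unpack',
    Match(
        Switch({
            dict: Auto('*'),  # only go 1 level on dicts
            # for everything else, recurse and combine results into one iterable
            object: Auto(('*', Flatten([Ref('unpack')]))),
        }))
)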

Another possibility is to use the TargetRegistry to override default behaviors: https://glom.readthedocs.io/en/latest/api.html#setup-and-registration

But, honestly, I think you are better off using a custom self-recursive function if you need to get really fine-grained about when you do and don't want to iterate.
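A custom self-recursive function along those lines might look like the sketch below. It uses no glom at all; `collect_dict_values` and its `max_depth` parameter are hypothetical names chosen for illustration, and it assumes the goal is to descend into nested dictionaries only.

```python
from collections.abc import Mapping


def collect_dict_values(node, max_depth=None, _depth=0):
    """Collect every value reachable through nested mappings, depth-first.

    Only Mapping instances are descended into; any other value (objects,
    iterators, lists) is collected as-is and never iterated.
    """
    found = []
    if not isinstance(node, Mapping):
        return found
    if max_depth is not None and _depth >= max_depth:
        return found
    for value in node.values():
        found.append(value)
        found.extend(collect_dict_values(value, max_depth, _depth + 1))
    return found


class MyData:
    def __init__(self):
        self.data = iter([1, 2, 3])


b = MyData()
target = {"A": {"b": b}}
result = collect_dict_values(target["A"])  # contains only the MyData instance
first = next(b.data)                       # the iterator is still untouched
```

Unlike the `**` output shown earlier, this sketch does not include the starting dictionary itself in the result, and passing `max_depth=1` would restrict it to the immediate values.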

lululaplap commented 1 year ago

Cheers, I will have a look into the example you gave. I think on reflection limiting the API to my application is going to be the solution.