[FEAT] Add max depth parameter

romnn commented 4 years ago

This PR proposes to add the ability to specify a maximum depth when flattening. See the updated README.md for a usage example.

I needed this for my use case and would appreciate merging this feature as it does not break any behaviour but can be useful for others i think.

I am open for feedback and revisions if need be.

ianlini commented 4 years ago

Thanks for your contribution! The name max_depth is ambiguous. It may refer to the maximum depth in the resulting dict, or the maximum depth of the original dict this function will consider. I think you are implementing the latter one. I don't have a good idea now. max_traverse_depth or max_flatten_depth might be better.

romnn commented 4 years ago

Thanks for your feedback! I agree, however I think it would be uncommon to expect such an optional parameter to make assumptions about the input dictionary.

I prefer max_flatten_depth and could rename the parameter once we settle on a name :)

Have a great day!

ianlini commented 4 years ago

Please go ahead with max_flatten_depth if you think it's better :)

I meant the maximum depth in the resulting dict, not an assumption about the input dictionary. For example,

d = {
    'a': {
        'b': 'c'
    }
}

When I say max_depth=1, in your implementation, the output will be the same as input. If max_depth=1 means the output should not have depth > 1, then the output should be {'a_b': 'c'}. It's like a constraint of the output.

Please also modify the docstring to avoid this misunderstanding. Thanks!

romnn commented 4 years ago

Sorry it took me so long to get back to this, I was very busy. I refactored the zero depth test to be somehow more understandable. The reason for this is that the flatten function still transforms the keys to be tuples. Let me give an example:

# Suppose we have the following normal dict
normal_dict = {
    'a': '0',
    'b': {
        'a': '1.0',
        'b': '1.1',
    },
    'c': {
        'a': '2.0',
        'b': {
            'a': '2.1.0',
            'b': ['2.1.1.0', '2.1.1.1'],
        },
    },
}

# With the default reducer, the keys are transformed to tuples even with depth 0
flattened_depth_zero = {
    ('a',): '0',
    ('b',): {
        'a': '1.0',
        'b': '1.1',
    },
    ('c',): {
        'a': '2.0',
        'b': {
            'a': '2.1.0',
            'b': ['2.1.1.0', '2.1.1.1'],
        },
    },
}

# Note that a normal comparison is not working
print(flattened_depth_zero == normal_dict)
# returns False

# We want to test that the key <-> value mapping has not changed
before = sorted(normal_dict.items(), key=lambda x: x[0])
after = sorted(flattened_depth_zero.items(), key=lambda x: x[0])
after = [(x[0][0], x[1]) for x in after]
print(before == after)
# returns True

Of course one can argue about if this should be intended behaviour or if there should be an extra check for depth 0 that just returns the original dict.

Best, Roman

ianlini commented 4 years ago

@romnnn, Thanks for your contribution. I also have some other ideas in my mind, but I don't have enough time now, so I merged this PR without too much discussion. I respect all contributors' design, so I would like to let you know that I might change this design a little bit in the future.

ianlini / flatten-dict

[FEAT] Add max depth parameter #22