mahmoud / glom

☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
https://glom.readthedocs.io

Recursive Delete-If-Empty? #273

Open mahmoud opened 9 months ago

mahmoud commented 9 months ago

Chatting with @hynek tonight, seems like there's a market for a Delete/delete flag to remove the value at a path if it's empty, and its parent if it's empty, and so on.

Kind of like rm -r (or rmdir, in that it fails/stops if the container isn't empty).
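To make the rmdir analogy concrete, here is a minimal plain-Python sketch of one reading of the request: delete the value at a dotted path, then walk back up and remove each parent that has become empty, stopping at the first one that still holds something. `delete_and_prune` is a hypothetical helper for illustration, not a glom API; it assumes dict-only targets and mutates in place.

```python
def delete_and_prune(target, path):
    """Delete the value at a dotted path, then prune parents left empty.

    Hypothetical helper (not part of glom); dict-only, mutates in place.
    """
    trail = []  # (container, key) pairs from the root down to the leaf
    cur = target
    for key in path.split("."):
        trail.append((cur, key))
        cur = cur[key]  # raises KeyError if the path does not exist

    # Walk back up: always delete the leaf, then delete each parent only
    # if it is now empty. Stop at the first non-empty parent, like rmdir.
    for depth, (container, key) in enumerate(reversed(trail)):
        if depth > 0 and len(container[key]) > 0:
            break
        del container[key]
    return target


val = {"a": {"b": 1}}
delete_and_prune(val, "a.b")
print(val)  # {}

val = {"a": {"b": 1, "c": 2}, "d": 0}
delete_and_prune(val, "a.b")
print(val)  # {'a': {'c': 2}, 'd': 0}
```

Siblings and unrelated falsy values (like `"d": 0`) are never touched, since only containers on the path itself are checked.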

(Tangentially related, remap can do this for a whole object, but it's not very convenient for recursively deleting a specific path.)

kurtbrose commented 9 months ago

The easiest way to do that would be as a custom Mode -- maybe Trim?

~15 LOC; I can whip something up real quick.

kurtbrose commented 9 months ago

Hmm... I guess delete is an in-place mutation rather than returning a copy.

I think this would be easier to do as a follow-on step rather than mixing it into the guts of delete.

from glom import glom, Delete
# Trim is the proposed follow-on spec; it does not exist in glom yet
val = {"a": {"b": 1}}
glom(val, (Delete("a.b"), Trim()))
# val is now {}

kurtbrose commented 9 months ago

Let me see if this reasoning makes sense:

Recursive deleting "along a path" is probably not so much what is intended as recursive deletion of a subtree. That is, you just need to get the glom target to the right place and then let it go.

The interesting bits:

kurtbrose commented 9 months ago

A switch would almost work -- except this requires post-evaluation not pre-evaluation. Switch({bool: T}, default=SKIP) or maybe And(bool, T, default=SKIP). Then wrap it with a Ref. The hard part is it is going to go parent-to-child, whereas for this operation you want to evaluate child-to-parent. Same with ** in a path.

I wonder if there's something general there. Kind of like T is the target going in, is there some expression or mechanism when the return depends on the subspec evaluation.

It's trivial with a custom spec -- the return value is passing through glomit().

If there was such a thing as post-evaluation **, then trim might look like this: ('**', Or(bool, Delete()))
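The ordering problem is easy to see outside of glom. The sketch below is plain Python with hypothetical function names (nothing here is glom API) and assumes dict-only data: a parent-to-child pass tests a container before its children are trimmed, so a container that only becomes empty afterwards is left behind, while a child-to-parent pass catches it.

```python
import copy


def trim_parent_first(node):
    """Pre-order: test each child for emptiness before descending into it."""
    for key in list(node):
        if not node[key]:
            del node[key]
        elif isinstance(node[key], dict):
            trim_parent_first(node[key])
    return node


def trim_child_first(node):
    """Post-order: descend first, then test the already-trimmed child."""
    for key in list(node):
        if isinstance(node[key], dict):
            trim_child_first(node[key])
        if not node[key]:
            del node[key]
    return node


data = {"a": {"b": {}}, "c": 1}
print(trim_parent_first(copy.deepcopy(data)))  # {'a': {}, 'c': 1}
print(trim_child_first(copy.deepcopy(data)))   # {'c': 1}
```

Both versions use plain truthiness as the emptiness test, matching the `bool` in the specs above.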

kurtbrose commented 9 months ago

Traverse() was going to be the more general **, maybe this is a test use-case?

Of course, plain python recursion would have better performance than bouncing each item through glom.

mahmoud commented 9 months ago

For this case, a copy was fine, and remap did the job. Agree that is_empty should be a param. But bool alone doesn't capture the "empty container" common case; it'll notably remove 0/0.0 which are often important values.
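A rough sketch of what an `is_empty` parameter buys you, in plain Python rather than glom or remap. The names `is_empty_container` and `trimmed` are made up for illustration; the function returns a trimmed copy and assumes only dicts and lists need recursing into.

```python
def is_empty_container(value):
    """True only for empty dicts, lists, tuples, and sets.

    Scalars such as 0, 0.0, '' and None are never considered empty.
    """
    return isinstance(value, (dict, list, tuple, set)) and len(value) == 0


def trimmed(obj, is_empty=is_empty_container):
    """Return a copy of obj with empty entries removed, bottom-up."""
    if isinstance(obj, dict):
        items = ((k, trimmed(v, is_empty)) for k, v in obj.items())
        return {k: v for k, v in items if not is_empty(v)}
    if isinstance(obj, list):
        values = (trimmed(v, is_empty) for v in obj)
        return [v for v in values if not is_empty(v)]
    return obj


record = {"count": 0, "ratio": 0.0, "tags": [], "meta": {"notes": {}}}
print(trimmed(record))                             # {'count': 0, 'ratio': 0.0}
print(trimmed(record, is_empty=lambda v: not v))   # {}
```

With the container-only predicate the zeros survive; with plain truthiness the whole record collapses.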

I think a recursive path trim makes sense as a separate step/spec. Even if it's not "context-free" for the whole tree like remap, I suspect it's still useful, as I've occasionally had APIs explode at empty values on specific paths only.

(additional context: this was a pre-step for db persistence, to minimize the size of a record)