LibreCat / Catmandu

Catmandu - a data processing toolkit
https://librecat.org
177 stars 31 forks source link

Adding support for empty paths #397

Closed phochste closed 1 year ago

phochste commented 1 year ago

Adding support for empty paths in fixes.

E.g.

echo '{"": "Empty", "ok": 1}' | catmandu convert JSON --fix "remove_field('')"

should give

{"ok": 1 }
nicolasfranck commented 1 year ago

@phochste wouldn't it be better to make this new behaviour optional? For example by adding an extra option? Otherwise we'll break previous behaviour.

nichtich commented 1 year ago

I doubt that anyone wants the current behaviour as it raises warnings and results in a null operation:

$echo '{"": "Empty", "ok": 1}' | catmandu convert JSON --fix "remove_field('')"
Use of uninitialized value $key in string eq at /usr/share/perl5/Catmandu/Path/simple.pm line 401.
Use of uninitialized value $key in string eq at /usr/share/perl5/Catmandu/Path/simple.pm line 401.
Use of uninitialized value $key in string eq at /usr/share/perl5/Catmandu/Path/simple.pm line 401.
[{"ok":1,"":"Empty"}]
nicolasfranck commented 1 year ago

@nichtich those warnings are the result of comparing strings and undefined keys with each other. That is a totally different issue (which should be fixed too).

The current vacuum clears "empty" values. This PR deletes values when the keys are considered empty. What if someone accidentally puts useful information in an empty key? Just delete it?

Why not create a separate fix vacuum_key (can someone come up with a better name?) to make this explicit, and make that fix only delete values with "empty" keys. And maybe document the current behaviour of vacuum.

phochste commented 1 year ago

@nicolasfranck This pull request is not about the vacuum fix. It is about addressing parts of JSON with an empty key.

As far as I know, in all fixes path are written at design time. It is the creator of a fix that writes a path. I can't imagine a writer of a fix will write a JSON-path and expects it always to result in a null operation.

I should make an issue out of this. The behavior in current Catmandu is weird. For some operations empty fields seem to be supported by obviously it is a bug.

echo '{"":"abc"}' | catmandu convert JSON --fix "move_field('','y')"
Use of uninitialized value $key in string eq at /Users/hochsten/.plenv/versions/5.24.0/lib/perl5/site_perl/5.24.0/Catmandu/Path/simple.pm line 401.
Use of uninitialized value $key in string eq at /Users/hochsten/.plenv/versions/5.24.0/lib/perl5/site_perl/5.24.0/Catmandu/Path/simple.pm line 401.
Use of uninitialized value $key in string eq at /Users/hochsten/.plenv/versions/5.24.0/lib/perl5/site_perl/5.24.0/Catmandu/Path/simple.pm line 401.
[{"":"abc","y":{"":"abc"}}]

The y field is created but the empty field is still there. The current pull requests fixes this also.

There are still some open issues however I notice today. Even with the current pull request only empty fields at root level can be addressed.

For instance it is not possible to generate a document like the one below with only add_field fixes:

{
  "x": {
     "" : "test"
  }
}
nicolasfranck commented 1 year ago

@phochste right..

btw I thought that empty paths where equal to the root (which can also be written as a "dot")? that has been like this for some time.

phochste commented 1 year ago

The empty path seems indeed to be reserved for pointing to the root of the JSON hash. The update of the pull requests introduces empty paths explicitly (and support for nested empty paths):

# Add a new field with key => '' and value => 123
add_field("''",123)

# Add na new field within key => 'x' with key => '' and value => 123
add_field("x.''",123)

# Remove the key => ''
remove_field("''")

# Remove the key => '' in key => 'x'
remove_field("x.''")
nics commented 1 year ago

@phochste we should also handle double quotes 'remove_field("")'