jqlang / jq

Command-line JSON processor
https://jqlang.github.io/jq/

Extending some builtins #1450

Open fadado opened 7 years ago

fadado commented 7 years ago

I'm using the following filters in my code:

# not/1: complement for empty and identity filters
def not(stream):
    if isempty(stream) then . else empty end
;

# length/1: produces the length of a finite stream
def length(stream):
    reduce stream as $_ (0; .+1)
;

I know this can be a dangerous practice, because the jq developers may have future plans for not/1 and length/1.

What if my definitions were accepted as new builtins? They do not pollute the global namespace...
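
For illustration, a minimal sketch of how the two definitions behave. It assumes jq 1.6 or later (for isempty/1) and an invocation such as `jq -nc -f sketch.jq`, where sketch.jq is a hypothetical file name:

```jq
# Definitions as given in the comment above.
def not(stream):
    if isempty(stream) then . else empty end
;
def length(stream):
    reduce stream as $_ (0; .+1)
;

[ "x" | not(empty) ],   # stream is empty, so the input passes through: ["x"]
[ "x" | not(1, 2) ],    # stream is non-empty, so nothing is emitted: []
length(range(5)),       # five outputs: 5
length(empty)           # no outputs: 0
```

Because jq resolves functions by name and arity, these definitions sit alongside the builtin not/0 and length/0 rather than replacing them.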

JJOR

pkoppstein commented 7 years ago

@fadado - First, I would like to recommend that you use the name "count" for your length/1. There are two main reasons:

  1. It avoids any confusion or potential future conflict with the existing "length"
  2. In English, it is exactly the right word for a stream, as in "Count how many ducks floated by" (count(isDuck)). There is no verb corresponding to "length", which is usually used for measurement, not counting.

An additional but very tiny consideration is that the name "count" with this meaning has already escaped into the wild :-)

Second, I would recommend against using the name "not" for your not/1 as it is potentially very confusing:

a) jq has conventional boolean "and" and "or" operators, so newcomers (and others) easily fall into the trap of thinking that not(x) "should" have its conventional meaning.

b) not(s) could quite naturally have the semantics of negating all the members of s, as in:

 def not(s): s | not;

c) One could argue that the "logical complement" of isempty(s) is its negation (isempty(s) | not)
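
To make the three readings concrete, here is a small sketch that puts them side by side. The names complement, negate_each and nonempty are hypothetical, chosen only to keep the variants apart; it assumes jq 1.6 or later and invocation with `jq -nc`:

```jq
# fadado's original not/1, under a hypothetical name.
def complement(s): if isempty(s) then . else empty end;
# Reading (b): negate every member of the stream.
def negate_each(s): s | not;
# Reading (c): the boolean negation of isempty(s).
def nonempty(s): isempty(s) | not;

[ complement(true, false) ],    # []
[ negate_each(true, false) ],   # [false,true]
[ nonempty(true, false) ]       # [true]
```

The same argument gives three different results, which is the source of the potential confusion.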

By the way, I don't think I've ever come across the need for your not/1 so I'd be curious to know what prompted you to define it.

On the other hand, I've often felt that some form of when/2 like the one defined in the jq FAQ would be extremely handy. Here is its current definition in the FAQ:

def when(cond; s): if cond?//null then s else . end;

This (or something very much like it) is very handy for example when using walk/1 and with_entries/1. It makes reading easier, saves some typing and often is more compact vertically. Compare:


    walk( if type == "object" then with_entries(...) else . end )

    walk( when( type == "object"; with_entries(...) ) )
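
A runnable sketch of when/2 in use, assuming a jq version that provides walk/1 and invocation with `jq -nc`. The sample document and the doubling rule are made up for illustration:

```jq
# when/2 as quoted from the FAQ, with parentheses and spacing added for readability.
def when(cond; s): if (cond? // null) then s else . end;

# Double every number in a nested document; other values pass through.
( {"a": 1, "b": [2, "x", {"c": 3}]}
  | walk( when(type == "number"; . * 2) ) ),   # {"a":2,"b":[4,"x",{"c":6}]}

# The `?` suppresses the error raised by .a on a string, so the input is kept.
( "x" | when(.a; "hit") )                      # "x"
```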

Your thoughts?

fadado commented 7 years ago

> @fadado - First, I would like to recommend that you use the name "count" for your length/1.

OK. Thank you for your advice.

> Second, I would recommend against using the name "not" for your not/1 as it is potentially very confusing:

OK, perhaps I will name the function toggle:

def toggle: select(isempty(.));

> By the way, I don't think I've ever come across the need for your not/1 so I'd be curious to know what prompted you to define it.

If you take empty, ., the comma operator (,) and the pipe operator (|), you have almost all the components of a (problematic) boolean algebra. Only the complement function is missing, which is the reason for defining toggle.

The boolean algebra laws coded in jq (the only values for a, b and c are empty and .) are as follows (the problematic laws are marked with =?):

Commutativity

a , b == b , a
a | b == b | a

Associativity

a , (b , c) == (a , b) , c
a | (b | c) == (a | b) | c

Distributivity

a , (b | c) =? (a , b) | (a , c)
a | (b , c) == (a | b) , (a | c)

Identity element

a , empty == a
a | . == a

Annihilation

a , . =? .
a | empty == empty

Idempotence

a , a =? a
a | a == a

Uniting

(a | b) , (a | toggle(b)) == a
(a , b) | (a , toggle(b)) == a

Absorption

a , (a | b) == a
a | (a , b) == a

Adsorption

(a , toggle(b)) | a == a | b
(a | toggle(b)) , a == a , b

Complementation

a , toggle(a) == .
a | toggle(a) == empty

Involution

toggle(toggle(a)) == a
toggle(empty) == .
toggle(.) == empty

De Morgan's Laws

toggle(a , b) == toggle(a) | toggle(b)
toggle(a | b) == toggle(a) , toggle(b)

The problems are the non-distributivity of , over |, the non-commutativity of , and paradoxical equations like (. , .) == . (true twice!).
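
A few of these laws can be checked mechanically. The sketch below is one possible way to do so, not part of the original proposal: it uses the toggle(s) form with an explicit stream argument, compares output streams by collecting them into arrays, and tries every combination of empty and . for a and b. The helper names same, comm, idem and demorgan are hypothetical; run with `jq -nc` on jq 1.6 or later:

```jq
def toggle(s): select(isempty(s));

# Two filters agree on the current input if they emit the same outputs in the same order.
def same(f; g): [f] == [g];

def comm(a; b): same(a | b; b | a);                              # commutativity of |
def idem(a): same(a, a; a);                                      # idempotence of ,
def demorgan(a; b): same(toggle(a, b); toggle(a) | toggle(b));   # first De Morgan law

[ comm(empty; empty), comm(empty; .), comm(.; empty), comm(.; .) ],   # [true,true,true,true]
[ idem(empty), idem(.) ],                                             # [true,false]
[ demorgan(empty; empty), demorgan(empty; .),
  demorgan(.; empty), demorgan(.; .) ]                                # [true,true,true,true]
```

Under this stream comparison, idempotence of the comma fails when a is ., because (. , .) emits its input twice.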

This is the boolean part of my investigations on Kleene Algebra with Tests.

JJOR

pkoppstein commented 7 years ago

> Ok, perhaps I will name the function toggle:
> def toggle: select(isempty(.));

I think you should stick to your original proposal, as the above is equivalent to empty. Your original proposal, renamed and rewritten to use select/1, was:

def toggle(s) : select(isempty(s));
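
A short sketch of the difference, assuming jq 1.6 or later and invocation with `jq -nc`. The arity-0 attempt is given the hypothetical name toggle0 here so that both versions are easy to tell apart:

```jq
def toggle0: select(isempty(.));      # the arity-0 attempt
def toggle(s): select(isempty(s));    # the original proposal, renamed

[ 1 | toggle0 ],          # []   since . always emits one value, isempty(.) is false
[ 1 | toggle(empty) ],    # [1]  empty stream: the input passes through
[ 1 | toggle(.) ]         # []   non-empty stream: nothing is emitted
```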

When thinking about the algebra of jq's operators, it is important to remember that . has a special role in jq. E.g. one cannot obtain a count of . by writing count(.), as the latter is always a stream of 1s.

A valid equation relating (s,s) to (s) is:

 count(s,s) === 2 * count(s)   # (*)

where === means "equivalent to" (i.e. e === f iff FORALL streams s, (s|e) produces the same stream as (s|f)).

Of course (*) holds for . too 😅
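
Both points can be tried out directly. This is a minimal sketch, assuming count/1 is defined as the renamed length/1 from earlier in the thread and that the program is run with `jq -nc`:

```jq
def count(s): reduce s as $_ (0; .+1);

[ (1, 2, 3) | count(.) ],                            # [1,1,1]  one count per input
count(range(3), range(3)) == 2 * count(range(3)),    # true     an instance of (*)
( "x" | count(., .) == 2 * count(.) )                # true     (*) with s taken to be .
```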

pkoppstein commented 7 years ago

@fadado - It seems to me that some of the "identities" you enumerate are incorrect.

Consider this excerpt:

> Absorption
>
> a , (a | b) == a
> a | (a , b) == a

If a and b are both ., then both these identities become (.,.) == ., which seems wrong unless s == t means "the jq expression s==t produces a stream of 0 or more true values".

In any case, in this discussion at least, I would suggest that we avoid using == except with reference to jq's ==. I believe the above stream-oriented definition of === is appropriate when discussing either streams or jq expressions. If there is a need to refer to a stream of 0 or more true values, then perhaps all(s;true) will suffice?
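
The two readings can be compared directly in jq. In this sketch the stream comparison is approximated by collecting outputs into arrays, and all/2 is called with . as its condition, which is an assumption about the intended test rather than the exact expression suggested above. Run with `jq -nc`:

```jq
[ (., .) == . ],        # [true,true]  jq's == is evaluated once per output of (., .)
[., .] == [.],          # false        compared as whole streams, the two sides differ
all((., .) == .; .)     # true         every value in the stream of comparisons is true
```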