`istruthy()` function

gregsdennis commented 1 year ago

Because a lot of languages support a concept of "truthiness," many path authors may rely on the truthiness of the data they query.

We've already decided against supporting truthiness in general (which I 100% agree with; not asking to change that). However, to give this functionality back to the user, I would like to propose an explicit istruthy() (or maybe just truthy()) function.

It would take a single ValueType parameter and return a LogicalType as follows:

An input of null, false, 0, "", [], {}, or Nothing would all return LogicalFalse.
An input of any other value returns LogicalTrue.

This would allow things like

$[?istruthy(@.a)].foo

to select "data we want" from

[
  {"a": true, "foo": "data we want"},
  {"a": false, "foo": "data we don't want"}
]
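As a rough illustration of the rules proposed above, here is a minimal Python sketch. It is not part of the proposal or of any JSONPath implementation: the names `NOTHING`, `istruthy`, and `select_foo` are hypothetical, and JSON values are modeled as the Python objects that `json.loads` would produce.

```python
# Hypothetical sentinel standing in for the JSONPath "Nothing" value.
NOTHING = object()


def istruthy(value):
    """Sketch of the proposed rules: null, false, 0, "", [], {} and Nothing
    map to LogicalFalse (False here); any other value maps to LogicalTrue."""
    if value is NOTHING or value is None or value is False:
        return False
    # bool is a subclass of int in Python, so exclude it from the number check.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value != 0
    if isinstance(value, (str, list, dict)):
        return len(value) > 0
    return True  # the only remaining JSON value is true


def select_foo(items):
    """Rough stand-in for $[?istruthy(@.a)].foo applied to a JSON array."""
    return [
        item["foo"]
        for item in items
        if isinstance(item, dict)
        and "foo" in item
        and istruthy(item.get("a", NOTHING))
    ]


data = [
    {"a": True, "foo": "data we want"},
    {"a": False, "foo": "data we don't want"},
]
print(select_foo(data))  # ['data we want']
```

A missing `a` member is treated here as Nothing, which assumes the nodelist would first be passed through value().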

This function could be combined with value() to allow nodelist conversions:

$[?istruthy(value(@.*))]

- If @.* is an empty nodelist or contains multiple nodes, then value() returns Nothing, for which istruthy() returns LogicalFalse.
- If @.* is a single-node nodelist, then value() extracts that value, and istruthy() converts it per the rules above:
  - the nodelists <null> and <""> will be converted to LogicalFalse
  - the nodelists <1> and <[null]> (a nodelist containing a non-empty array) will be converted to LogicalTrue

That last one (<[null]>) may be a bit counter-intuitive, but that's how it would process under these rules.
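The combination with value() can be sketched in Python as follows. This is only an illustration under stated assumptions: nodelists are modeled as plain Python lists of values, and `NOTHING`, `value`, and `istruthy` are hypothetical stand-ins rather than a real library's API.

```python
# Hypothetical sentinel standing in for the JSONPath "Nothing" value.
NOTHING = object()


def value(nodelist):
    """A single-node nodelist yields that node's value; an empty or
    multi-node nodelist yields Nothing."""
    return nodelist[0] if len(nodelist) == 1 else NOTHING


def istruthy(v):
    """Proposed rules: null, false, 0, "", [], {} and Nothing are falsy."""
    if v is NOTHING or v is None or v is False:
        return False
    if isinstance(v, bool):
        return True
    if isinstance(v, (int, float)):
        return v != 0
    if isinstance(v, (str, list, dict)):
        return len(v) > 0
    return True


cases = {
    "empty nodelist": [],
    "multiple nodes": [1, 2],
    "<null>": [None],
    '<"">': [""],
    "<1>": [1],
    "<[null]>": [[None]],
}
results = {label: istruthy(value(nodelist)) for label, nodelist in cases.items()}
for label, result in results.items():
    print(label, result)
```

Only the last two cases come out as LogicalTrue, including `<[null]>`, because the array itself is non-empty.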

cabo commented 1 year ago

There are several ways of defining truthiness, and I'm not sure there is one that is widespread. (E.g., for me, Nothing, null, and false are the non-truthy instances of ValueType, while 0, "", {}, [] are of course truthy. We have empty() functions for the latter, and maybe also a blank() to keep blank-space-only strings falsy.)
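To make the contrast concrete, this alternative definition could be sketched in Python roughly as below. The function names mirror the ones mentioned above, but their exact semantics are not specified in this thread, so the bodies are assumptions (in particular, whether empty() should cover 0 is left open and is not modeled here).

```python
# Hypothetical sentinel standing in for the JSONPath "Nothing" value.
NOTHING = object()


def truthy_narrow(v):
    """Only Nothing, null and false are non-truthy; 0, "", {} and [] are truthy."""
    return not (v is NOTHING or v is None or v is False)


def empty(v):
    """Assumed semantics: true for the empty string, array, or object."""
    return isinstance(v, (str, list, dict)) and len(v) == 0


def blank(v):
    """Assumed semantics: true for strings that are empty or whitespace-only."""
    return isinstance(v, str) and v.strip() == ""


for sample in (None, False, 0, "", "   ", [], {}):
    print(repr(sample), truthy_narrow(sample), empty(sample), blank(sample))
```

Under this split, a caller who wants "" or [] to count as false combines the narrow test with empty() explicitly instead of relying on one built-in definition.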

I'm saying this because these are maybe not among the functions that need to go into the base standard. (I would argue that round() is way more important. And sum(). And, and, and. Don't forget npv :-))

cabo commented 1 year ago

gregsdennis commented 1 year ago

The truthy conversions I listed are the ones I've seen the most (and mostly pulled from JS). I agree that they're different based on language.

For example, C originally didn't have a boolean type; it just interprets zero as false and non-zero as true. As a result, it's common to add preprocessor directives to define the symbols true and false (capitalization may vary) as 1 and 0, respectively.

As another example, C# doesn't define truthiness at all. An expression (e.g. in an if statement) must return a bool (which the comparison operators do). However, you can explicitly convert from a numeric type to bool using the Convert.ToBoolean() static function, which follows the same interpretations as C. (This is the C# equivalent of the proposed istruthy().)

(The entry for C# in that table is wrong.)

We would have to define what they mean for us.

I think this could be an important function (maybe not included in the spec) that gives some functionality back to the user.

cabo commented 1 year ago

I think the interesting question will be how to curate a reasonable set of registered functions that we expect many implementations (and not just the Cadillac ones) to provide. Providing the registry is the first step towards this. Having a designated expert who can recognize that programming environments have different definitions/concepts of truthy, and that therefore none of them should be called "istruthy", is another.

When we did CBOR, we were aiming for "batteries included" with respect to the tag (type extension) registry. Some of these initial registrations are rather essential, others turned out to be much less widely used (some because they were wrong or unusable). Considering this experience, I'm quite happy with the frugal set we now have in the JSONPath document; I'd expect people to come up and define empty() and blank() (maybe not npv()) very soon. We could also try to get the WG rechartered to create a canonical set.

danielaparker commented 1 year ago

> We've already decided against supporting truthiness in general

But the draft does have a notion of truthiness, for example, in

$..book[?(@.isbn)]

the expression @.isbn evaluates to LogicalTrue or LogicalFalse depending on whether the property isbn exists, which is a rule for truthiness.

Languages that don't have truthiness require an actual true or false value when evaluating an expression as a boolean, otherwise it's an error. Julia is an example of such a language.

I don't see the point of a truthy function. As noted in other comments, there is no agreement. Some languages consider 0 to be false (following C), others consider it to be true (Ruby, Lisp.) Some consider an empty array to be false (Python, Common Lisp), others consider it to be true (Ruby, Scheme.)

gregsdennis commented 1 year ago

> > We've already decided against supporting truthiness in general
>
> But the draft does have a notion of truthiness, for example, in
>
> $..book[?(@.isbn)]
>
> the expression @.isbn evaluates to LogicalTrue or LogicalFalse depending on whether the property isbn exists, which is a rule for truthiness.

This isn't completely (pedantically) accurate. The context here is a test-expr, and the result is explicitly defined: the node is selected if @.isbn returns a non-empty nodelist. There is no conversion to LogicalType, thus truthiness isn't required.

Semantically, yes, it appears to be truthiness, but it's not defined that way.

Furthermore, if @.isbn were to return a JSON false value, the node would be selected, even though truthiness would not select it.
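The difference can be sketched in Python. This is an illustration only: it models just the filter step over a list of book objects (not the `$..book` descendant traversal), the sample data is made up, and Python's own `bool()` rules stand in for "truthiness".

```python
# Made-up sample data: one real ISBN, one JSON false, one missing member.
books = [
    {"title": "A", "isbn": "0-553-21311-3"},
    {"title": "B", "isbn": False},
    {"title": "C"},
]


def exists(node, key):
    """Existence test: would @.key yield a non-empty nodelist for this node?"""
    return isinstance(node, dict) and key in node


# What the test-expr ?(@.isbn) does: select when the member exists.
by_existence = [b["title"] for b in books if exists(b, "isbn")]

# What a value-based truthiness rule would do (Python's rules as a stand-in).
by_truthiness = [b["title"] for b in books if bool(b.get("isbn"))]

print(by_existence)   # ['A', 'B']
print(by_truthiness)  # ['A']
```

Book "B" is where the two readings diverge: its isbn member exists, so the existence test selects it, even though its value is false.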

> I don't see the point of a truthy function. As noted in other comments, there is no agreement. Some languages consider 0 to be false (following C), others consider it to be true (Ruby, Lisp.) Some consider an empty array to be false (Python, Common Lisp), others consider it to be true (Ruby, Scheme.)

The point of it is to provide a translation from ValueType to LogicalType in a way that interprets JSON values as boolean-ish; so JSON true/false could be interpreted as LogicalTrue/LogicalFalse, which is a behavior that many users would expect to be supported. Hitherto, such a conversion doesn't exist.

cabo commented 1 year ago

> the expression @.isbn evaluates to LogicalTrue or LogicalFalse depending on whether the property isbn exists, which is a rule for truthiness.

Well, if it looked at the value of @.isbn and then arbitrarily put some values into the false bin and all others into the true bin, I would agree. But as Greg points out, here we have an existence test that doesn't look at the value at all.

cabo commented 1 year ago

> Hitherto, such a conversion doesn't exist.

We just don't call it a "conversion".

@.foo != false

can be used to "convert" a JSON Boolean to a logical expression. As can be

@.foo == true

which has a very different meaning (the same only if it is ensured there is a value that is either false or true).
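A small Python sketch of how the two expressions diverge. It assumes comparison rules as I understand them from the draft: no type coercion between JSON types, `==` against an empty nodelist yields false, and `!=` is the negation of `==`. `NOTHING` and `json_eq` are hypothetical names, and only primitive values are handled.

```python
# Hypothetical sentinel for "the query produced no value".
NOTHING = object()


def json_eq(a, b):
    """Equality for primitive JSON values without type coercion (sketch)."""
    if a is NOTHING or b is NOTHING:
        return a is b
    # Python treats True == 1 as equal; JSON true and 1 must stay distinct.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


nodes = [{"foo": True}, {"foo": False}, {"foo": 1}, {"foo": None}, {}]

# @.foo != false
ne_false = [n for n in nodes if not json_eq(n.get("foo", NOTHING), False)]
# @.foo == true
eq_true = [n for n in nodes if json_eq(n.get("foo", NOTHING), True)]

print(len(ne_false), len(eq_true))
```

Under these assumptions, `@.foo != false` also selects nodes where foo is 1, null, or absent, while `@.foo == true` selects only the node where foo is exactly true.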

gregsdennis commented 1 year ago

> > Hitherto, such a conversion doesn't exist.
>
> We just don't call it a "conversion".
>
> @.foo != false
>
> can be used to "convert" a JSON Boolean to a logical expression. As can be
>
> @.foo == true
>
> which has a very different meaning (the same only if it is ensured there is a value that is either false or true).

That's not a conversion. It's evaluating an expression.

danielaparker commented 1 year ago

> > > We've already decided against supporting truthiness in general
>
> > But the draft does have a notion of truthiness, for example, in
> >
> > $..book[?(@.isbn)]
> >
> > the expression @.isbn evaluates to LogicalTrue or LogicalFalse depending on whether the property isbn exists, which is a rule for truthiness.
>
> This isn't completely (pedantically) accurate.

I think it is.

> The context here is a test-expr,

In programming language terms, a test-expr is a conditional, and it is in conditionals that truthiness applies.

> and the result is explicitly defined that the node is selected if @.isbn returns a non-empty nodelist.

Defining rules such that things that are not true/false, e.g. @.isbn, become true/false is precisely what truthiness is about.

cabo commented 1 year ago

OK, so let me summarize what we all agree on.

- Many programming environments have a concept of truthiness.
- They are all different; there is no consensus on how this concept should be defined in detail.
- There is no need for a function extension just in order to access this functionality: a logical expression with test expressions and comparisons can always be constructed.
- Function extensions implementing specific versions of truthiness can be added later, so they do not need to be in the base document.

So I believe we can close this as "not planned" for the base document. Please reopen if you disagree.

ietf-wg-jsonpath / draft-ietf-jsonpath-base

`istruthy()` function #447