Open dnovatchev opened 2 years ago
Are there other languages that offer anything similar?
Are there other languages that offer anything similar?
Yes:
Total maps: see for example Coq
Decorators: present in many programming languages with different names:
And @michaelhkay , here is an excellent introduction (a module from the Pluralsight course: "Python: Beyond the Basics"
):
The essence of maps as currently defined is their simplicity, and I don't want to lose that.
I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:
map:with-default($map, $default)
returns function($x){ if (map:contains($map, $x)) then $map($x) else $default }
Plus an extension to the semantics of the lookup operator '?' so that the LHS can be an arity-1 function. (But the RHS is atomised, so I think to maintain consistency with the current semantics it would have to be a function whose argument type is singleton atomic.)
I'm also wondering about a way of deriving a function from a map in such a way that the function is more strongly typed, but I'm finding it hard to come up with anything better than just writing a function that wraps the map explicitly:
function($x as xs:integer) as xs:string* {map:with-default($map, "")?$x}
The essence of maps as currently defined is their simplicity, and I don't want to lose that.
I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:
map:with-default($map, $default)
returnsfunction($x){ if (map:contains($map, $x)) then $map($x) else $default }
Using an explicit "default" symbol seems more simple and visually intuitive, like in the below:
let $prize-table := map {
"ten" : 10,
"hundred" : 20,
"thousand" : 100,
"five-thousand" : 1000,
"prime" : 25000,
"\" : 0
}
We already have too-many map:xxx()
functions and many people find that the ?
operator has too-many different roles.
But certainly a function like this is a possibility. I would prefer it to be named more like:
map:add-default(...)
This said, I prefer even more to have a symbol for a default key. In the proposal this is "\"
. I understand that someone could argue that there might be cases when we would need this specific value to be used just as a regular key.
To eliminate the possibility of such an issue, we could establish a new fn:default()
that can be used only as the default-key and only within a map as a key. Thus the above would become:
let $prize-table := map {
"ten" : 10,
"hundred" : 20,
"thousand" : 100,
"five-thousand" : 1000,
"prime" : 25000,
default() : 0
}
If we were to do that, I would go for a keyword such as #default
.
However, I'm strongly inclined towards having a self-imposed rule for this round: no data model changes.
If we were to do that, I would go for a keyword such as
#default
.However, I'm strongly inclined towards having a self-imposed rule for this round: no data model changes.
This is exactly why instead of a new special keyword #default
we can use the proposed function default()
. This is just a function (constant, like the existing math:pi()
) and just adding a function doesn't change the data model.
I don't believe we should add a special-default-case solution just for maps. There are so many other expressions that could benefit from, but don't have a default branch.
I would rather recommend users to take advantage of the upcoming otherwise expression (or ?:
as alternative writing, as currently realized in BaseX).
I like the @michaelhkay proposal because, if I understand correctly, the following syntax seems quite neat:
let $prize-table := map {
"ten" : 10,
"hundred" : 20,
"thousand" : 100,
"five-thousand" : 1000,
"prime" : 25000
} => map:with-default(0)
Regarding the ?
lookup operator - given that we use *
on the RHS to return all map values, perhaps we could use +
on the RHS to return the default value, giving:
let $default := $prize-table?+
I think using the function with-default()
as proposed by @michaelhkay makes sense and it seems quite visually appealing and intuitive if written in combination with the arrow operator:
let $myMap := map{"x" : 1, "y" : 2} => with-default( 0 )
It would be more visually appealing if this function could be called without specifying a prefix, so why not define it in the standard function namespace?
Plus an extension to the semantics of the lookup operator '?' so that the LHS can be an arity-1 function. (But the RHS is atomised, so I think to maintain consistency with the current semantics it would have to be a function whose argument type is singleton atomic.)
I believe this was already proposed in a separate proposal by @ChristianGruen : [XPath] Generalize lookup operator for function items
The essence of maps as currently defined is their simplicity, and I don't want to lose that.
I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:
map:with-default($map, $default)
returnsfunction($x){ if (map:contains($map, $x)) then $map($x) else $default }
@michaelhkay After careful consideration, I am satisfied with the function so proposed.
Just for readability I would prefer this to be in the standar fn;
namespace, thus instead of having to write:
$someMap => map:withDefault(5)
we can omit any prefix and its namespace binding and simply write:
$someMap => withDefault(5)
Plus an extension to the semantics of the lookup operator '?' so that the LHS can be an arity-1 function. (But the RHS is atomised, so I think to maintain consistency with the current semantics it would have to be a function whose argument type is singleton atomic.)
We already have a similar proposal by @ChristianGruen : [XPath] Generalize lookup operator for function items
Also, in the starting proposal, we showed that the lookup operator ?
for any decorated map $dm
(and a total map is a special case of a decorated map) is identical to, and should be defined as:
$dm($base-map, ?)
I'm also wondering about a way of deriving a function from a map in such a way that the function is more strongly typed, but I'm finding it hard to come up with anything better than just writing a function that wraps the map explicitly:
function($x as xs:integer) as xs:string* {map:with-default($map, "")?$x}
Shouldn't this be:
function($x as xs:integer) as xs:string* {map:with-default($map, () )?$x}
I'm wondering if the most common use-case can be met by providing convenient ways to turn a map into a function:
map:with-default($map, $default)
returnsfunction($x){ if (map:contains($map, $x)) then $map($x) else $default }
@michaelhkay, In a previous comment I expressed approval for this idea. 💯
Please note however, that the users will certainly want to treat the result of the with-default
as a map and not as just as a function.
Therefore, I would prefer the definition and implementation of the function as in the following expression:
let $withDefault := function($map as map(*), $default as item()*) as map(*)
{
let $extenderKey := xs:float("-INF"),
$props := if(map:contains($map, $extenderKey))
then map:put($map($extenderKey), "$default", $default)
else map{ "$default" : $default }
return
map:put($map, $extenderKey, $props)
}
return
(
$withDefault(map{"x" : 2, "y" : 3}, 10),
"==============",
$withDefault($withDefault(map{"x" : 2, "y" : 3}, 10), 15)
)
Evaluating this correctly produces:
map {
xs:float("-INF"): map {
"$default": 10
},
"x": 2,
"y": 3
}
==============
map {
xs:float("-INF"): map {
"$default": 15
},
"x": 2,
"y": 3
}
We will modify the behavior of the map:XXX
functions not to deal with the $extenderKey
and its values, except in the cases when $extenderKey
is specifically provided as argument.
One could argue what specific value we should choose for the key $extenderKey
. We can agree on any newly-generated GUID, or any other suitable value.
Different programming languages handle this in different ways. For example, in Python double-underscore mangling is the way to maintain distinct names, as explained here
Once again I’d like to draw the attention to the new otherwise
expression, which I believe is a much more generic solution to lookups (not just in maps) that don't yield results. It's available in many other languages as well (https://en.wikipedia.org/wiki/Null_coalescing_operator). Do we really need and want a solution specific to maps? What about arrays and sequences, and what about lookups that return an empty sequence as value?
Do we really need and want a solution specific to maps? What about arrays and sequences, and what about lookups that return an empty sequence as value?
@ChristianGruen This is a very good question!
I completely agree that if we just wanted to provide a default-value capability to a map, there could be ways to achieve this with a null-coalescing function/operator.
However, this issue is about maps and ways to extend a map - not only with a default value, but potentially with new capabilities, that are not planned in advance. For example, a decorated map, that knows about the sequence of all decorators that have been applied to itself. Obviously the null-coalescing cannot help to achieve this, but placing all new-capabilities-related data within a nested map - can.
Addressing the concern that focusing to "just maps" is something not so generic, let us remember that maps in XPath are analogous to objects in the OOP world. Is there something even more generic than objects?
Copied from https://github.com/qt4cg/qtspecs/issues/707#issuecomment-1722299446 (I think it’s better to discuss it here):
It is easy to see that this current proposal has a few shortcomings (the last one critical):
- In many cases we may want just an error to be raised, thus returning the empty sequence will need more coding and checking (as above)
You could replace the default/fallback value with a function that creates the result (similar to https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-map-get);
Copied from #707 (comment) (I think it’s better to discuss it here):
It is easy to see that this current proposal has a few shortcomings (the last one critical):
- In many cases we may want just an error to be raised, thus returning the empty sequence will need more coding and checking (as above)
You could replace the default/fallback value with a function that creates the result (similar to https://qt4cg.org/specifications/xpath-functions-40/Overview.html#func-map-get);
A _programmer's proverb_:
"Every problem can be solved with introducing one more level of indirection - except the problem of having too-many levels of indirection"
😄
I am not sure why it was labeled as "PRG-hard", while this is trivial to implement.
@dnovatchev I assume it was labeled as such because we will seem to have no one who will create a PR for it. If we had one, I am sure people (= at least me) would look at it.
I am not sure why it was labeled as "PRG-hard", while this is trivial to implement.
@dnovatchev I assume it was labeled as such because we will seem to have no one who will create a PR for it. If we had one, I am sure people (= at least me) would look at it.
@ChristianGruen ,
At the time I wrote this proposal I had no logistics experience whatsoever in creating a PR for the specs repository.
If we want not to have to change the XDM, then we need to allow a single special key-value that will be used to specify missing-key default value and/or a map that can contain many other "system properties". This has to be a "special", non-atomic value, in order to avoid any possible collision with any key that is currently allowed and maybe used in existing code.
It is possible to change the XPath language specs and add text specifying that such a value can be used as a key.
The only problem is that we need to agree which exactly value to use. I would propose to use [ ]
(the empty array), but any other special value would do.
Any suggestions from anyone, please?
At the time I wrote this proposal I had no logistics experience whatsoever in creating a PR for the specs repository.
Absolutely; that’s no judgement, just (as I understand it) the assessment how the labels were assigned.
I would propose to use [ ] (the empty array), but any other special value would do.
This would probably be equivalent to the empty sequence, as keys in maps are defined to be atomized.
At the time I wrote this proposal I had no logistics experience whatsoever in creating a PR for the specs repository.
Absolutely; that’s no judgement, just (as I understand it) the assessment how the labels were assigned.
I would propose to use [ ] (the empty array), but any other special value would do.
This would probably be equivalent to the empty sequence, as keys in maps are defined to be atomized.
I am not sure if a key that is the empty sequence is not allowed at present. Even if this value is "free to use", it would be difficult and awkward to specify it in the definition of a map - at the same time there is no such difficulty in specifying [ ]
at the same time there is no such difficulty in specifying [ ]
I think it would be difficult for me, but I may be wrong. I'm looking forward to it.
If we want not to have to change the XDM, then we need to allow a single special key-value that will be used to specify missing-key default value and/or a map that can contain many other "system properties". This has to be a "special", non-atomic value,
If you allow a map to contain a non-atomic key then you have changed the data model, and this impacts every operation currently defined on maps. That's why it's labelled "hard". But remember that the labels were assigned based on about 5 minutes study. What's hard for some people might be easy for others.
If we want not to have to change the XDM, then we need to allow a single special key-value that will be used to specify missing-key default value and/or a map that can contain many other "system properties". This has to be a "special", non-atomic value,
If you allow a map to contain a non-atomic key then you have changed the data model, and this impacts every operation currently defined on maps. That's why it's labelled "hard". But remember that the labels were assigned based on about 5 minutes study. What's hard for some people might be easy for others.
If we allow the empty-sequence as a key value, then we are not changing the XDM, are we?
If we allow the empty-sequence as a key value, then we are not changing the XDM, are we?
Of course we are. The XDM does not currently allow the empty sequence as a key value, and we would be changing that. Lots of decisions need to be made in consequence, for example how this affects map types and subtyping rules, how it affects functions such as map:get, map:put, map:pairs, map:keys and map:size, etc etc. When you get into the detail, some of these decisions will inevitably be non-trivial.
Maps with Infinite Number of Keys: Total Maps and Decorated maps
1. Total Maps
Maps have become one of the most useful tools for creating readable, short and efficient XPath code. However, a significant limitation of this datatype is that a
map
can have only a finite number of keys. In many cases we might want to implement a map that can have more than a fixed, finite number of arguments.Here is a typical example (Example 1):
A hotel charges per night differently, depending on how long the customer has been staying. For the first night the price is \$100, for the second \$90, for the third \$80 and for every night after the third \$75. We can immediately try to express this pricing data as a map, like this:
We could, if we had a special key, something like "TheRest", which means any other key-value, which is not one of the already specified key-values.
Here comes the first part of this proposal:
"\"
Adding such a "discard symbol" makes the map a total function on the set of any possible XPath atomic items.
Now we can easily express the hotel-price data as a map:
Another useful Example (2) is that now we can express any XPath item, or sequence of items as a map. Let's do this for a simple constant, like π:
the map above is empty (has no regular keys) and specifies that for any other key-value
$k
it holds that$π($k) eq math:pi()
Going further, we can express even the empty sequence (Example 3) as the following map:
Using this representation of the empty sequence, we can provide a solution for the "Forgiveness problem" raised by Jarno Jelovirta in the XML.Com
#general
channel in March 2021:This expression will raise an error:
[XPTY0004] Input of lookup operator must be map or array: 1.
To prevent ("forgive", thus "Forgiveness Problem") the raising of such errors we could accept the rule that in XPath 4.0 any expression that evaluates to something different than a map or an array, could be coerced to the following map, which returns the empty sequence as the corresponding value for any key requested in a lookup:
To summarize, what we have achieved so far:
"\"
(discard key) is a total function over the value-space of allxs:anyAtomicType
values?
without raising an error.2. Decorated Maps
Although we already achieved a lot in the first part, there are still use-cases for which we don't have an adequate map solution:
In the example (1) of expressing the hotel prices, we probably shouldn't get
$75
for a key such as -1 or even"blah-blah-blah"
But the XPath 4.0 language specification allows any atomic values to be possible keys and thus to be the argument to themap:get()
function. If we want validation for the actually-allowed key-values for a specific given map, we need to have additional processing/functionality.With a discard symbol we can express only one infinite set of possible keys and group them under the same corresponding value. However, there are problems, the data for which needs several infinite sets of key-values to be projected onto different values. Here is one such problem:
Imagine we are the organizers of a very simple lottery, selling many millions of tickets, identified by their number, which is a unique natural number.
We want to grant prizes with this simple strategy.
None of the sets of key-values for each of the 6 categories above can be conveniently expressed with the
map
that we have so far, although we have merely 6 different cases!How can we solve this kind of problem still using maps?
Decorators to the rescue...
What is decorator, what is the decorator pattern and when it is good to use one? According to Wikipedia:
The idea is to couple a map with a function (the decorator) which can perform any needed preprocessing, such as validation or projection of a supplied value onto one of a predefined small set of values (that are the actual keys of the map). For simplicity, we are not discussing post-processing here, though this can also be part of a decorator, if needed.
Let us see how a decorated-map solution to the lottery problem looks like:
Conclusion
In the 2nd part of this proposal, a new type/function -- the
decorated-map
was described.We defined the signature of a
decorated-map
and gave an example how to construct and use one in solving a specific problem. In particular, the proposal is to have a standard function:decorated-map ($base-map as map(*), $input as xs:anyAtomicType) as item()*
Finally, we showed that the lookup operator
?
on a decorated map $dm is identical to and should be defined as :$dm($base-map, ?)
What remains to be done?
The topic of decorators is extremely important, as a decorator may and should be possible to be defined on any function, not just on maps. This would be addressed in one or more new proposals. Stay tuned 😊