qt4cg / qtspecs

QT4 specifications
https://qt4cg.org/

Allow manipulation of maps and arrays #77

Open joewiz opened 3 years ago

joewiz commented 3 years ago

As discussed in the xml.com Slack workspace's xpath-ng channel, there is interest in extending the XQuery Update Facility to allow manipulation of maps and arrays—in effect, to facilitate the editing of large, deep JSON documents.

For example, @DrRataplan provided this use case (the first code snippet can be viewed at fontoxml's playground):

I think XQUF for JSON may have its merit. Editing larger JSON documents using XQuery is not the most elegant. I mean, in JavaScript, changing a value in a deep map is theMap['key']['deeperKey'].push(42). In XPath, it is more like:

$theMap
=> map:put('key', $theMap?key
=> map:put('deeperKey', array:append($theMap?key?deeperKey, 42)))

In XQUF terms, I think this would look a bit like:

insert 42 as last into $theMap?key?deeperKey

... which is at least a lot shorter.
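For comparison, here is a self-contained sketch of the XPath 3.1 approach with each level spelled out. The contents of $theMap are made up for illustration; only map:put and array:append from the standard function library are used.

```xquery
(: Sample data, invented for illustration. :)
let $theMap := map {
  'key': map { 'deeperKey': [ 1, 2 ], 'other': 'untouched' }
}
(: Every level on the path has to be rebuilt, from the inside out. :)
let $newArray := array:append($theMap?key?deeperKey, 42)
let $newInner := map:put($theMap?key, 'deeperKey', $newArray)
return map:put($theMap, 'key', $newInner)
(: result: map { 'key': map { 'deeperKey': [ 1, 2, 42 ], 'other': 'untouched' } } :)
```

The number of intermediate steps grows with the depth of the path, which is the verbosity the proposed update syntax is meant to remove.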

At some point when working on a project that tried to edit some JSON metadata objects in XQuery I implemented a function that accepted a map, a path of keys, a value and some semantics, such as inserting at the start vs. at the end. It did not work too great in the end and we went for JavaScript functions instead. Just too explicit and hard to debug.

See also this discussion at StackOverflow, where a user was struggling to use map:put or map:remove on deeper entries in a map; asked, "Is XQuery 3.1 designed for advanced JSON editing?"; and worried that XQuery "might not be the right choice" for his use case. Highlights from the responses:

@michaelhkay wrote:

You're correct that doing what I call deep update of a map is quite difficult with XQuery 3.1 (and indeed XSLT 3.0) as currently defined. And it's not easy to define language constructs with clean semantics. I attempted to design a construct as an XSLT extension instruction - see https://saxonica.com/documentation10/index.html#!extensions/instructions/deep-update -- but I don't think it's anywhere near a perfect solution.

@ChristianGruen wrote:

Update primitives had been defined for JSONiq (https://www.jsoniq.org/docs/JSONiqExtensionToXQuery/html-single/index.html#section-json-updates), but I believe they haven’t made it into the reference implementation. They could also be considered for XQuery 4.

@michaelhkay responded:

If I'm not mistaken, maps in JSONiq have object identity, which is not true of XQuery maps (which are pure functional data structures). That makes the semantics of deep update much easier to define, but makes it more difficult to make simple operations such as put() and remove() efficient.
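A minimal sketch of what "pure functional" means here, using made-up data: map:put returns a new map and leaves its input unchanged, so there is no object whose state an update expression could modify in place.

```xquery
let $original := map { 'a': 1 }
(: map:put does not modify $original; it returns a new map. :)
let $changed := map:put($original, 'a', 2)
return ($original?a, $changed?a)
(: result: 1, 2 :)
```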

In Slack @liamquin also wrote:

the proposals i've seen for this in the past required that maps and arrays be given identity in some way, but then you have the problem that e.g. map:insert returns a new map, which is not how an XQuery update expression works

@jonathanrobie also wrote:

Yes, but the first question is this: how much will is there to support JSON updates in XQuery update?

I would love to have this. I no longer work for an implementation of XQuery.

@adamretter added:

Sounds like a nice idea

benibela commented 3 years ago

See also this discussion at StackOverflow, where a user was struggling to use map:put or map:remove on deeper entries in a map; asked, "Is XQuery 3.1 designed for advanced JSON editing?"; and worried that XQuery "might not be the right choice" for his use case. Highlights from the responses:

And it has moved to the mailing list

jonathanrobie commented 3 years ago

There's a gotcha we have to be careful about. Maps and arrays must be able to contain nodes without copying them like XML node constructors.

Maps and arrays are often used to identify nodes to be modified. I wrote this today:


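(: The helpers local:remainder, local:continues, local:merge_morphcodes and
   local:merge_node_text are not shown here; they are assumed to be declared
   elsewhere in the same module. :)

(: A range is a start node followed by the nodes that continue it. :)
declare function local:range($start)
{
  $start,
  local:remainder($start)
};

(: One array per range, starting at each w that continues while its
   preceding w sibling does not. The step ::w[1] is an assumption about
   the intended test. :)
declare function local:ranges($root)
{
  for $start in $root//w
  where local:continues($start)
  and fn:not(local:continues($start/preceding-sibling::w[1]))
  return array { local:range($start) }
};

(: Merges the rest of the range into its first node, then deletes the rest. :)
declare updating function local:merge($range)
{
  local:merge_morphcodes(fn:head($range), fn:tail($range))
  ,
  local:merge_node_text(fn:head($range), fn:tail($range))
  ,
  delete nodes fn:tail($range)
};

for $r in local:ranges( db:open("oshb-morphology") )
return local:merge(array:flatten($r))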

This only works because maps and arrays do not create new identities. If I used an element constructor instead of an array in this query, it would create a new copy of each child element and the updates would modify only the transient copy. Using an array, the elements in the array retain their identity and updates are applied to the instance in the database.
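The difference can be checked with the `is` operator. This is a small sketch on an in-memory document invented for illustration, not on the database from the query above:

```xquery
let $doc := <words><w>a</w><w>b</w></words>
let $w := $doc/w[1]
return (
  (: The array member is the very same node. :)
  array { $w }(1) is $w,
  (: The element constructor copies its content, so identity is lost. :)
  <range>{ $w }</range>/w is $w
)
(: result: true, false :)
```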

ChristianGruen commented 1 year ago

Here are functions that we have used in the past to delete, replace and update nested map entries:

declare namespace map = 'http://www.w3.org/2005/xpath-functions/map';
declare namespace maps = 'maps';

(:~
 : Recursively removes map entries.
 : @param  $input  input (map, any other item)
 : @param  $keys   path to entry to delete
 : @return updated item
 :)
declare function maps:delete(
  $input  as item()*,
  $keys   as xs:string*
) as item()* {
  if($input instance of map(*)) then (
    map:merge(map:for-each($input, function($k, $v) {
      if($k = head($keys)) then (
        if(count($keys) > 1) then map:entry($k, maps:delete($v, tail($keys))) else ()
      ) else (
        map:entry($k, $v)
      )
    }))
  ) else (
    $input
  )
};

(:~
 : Recursively replaces map entries.
 : @param  $input  input (map, any other item)
 : @param  $keys   path to entry to replace
 : @param  $value  new value
 : @return updated item
 :)
declare function maps:replace(
  $input  as item()*,
  $keys   as xs:string*,
  $value  as item()*
) as item()* {
  if($input instance of map(*)) then (
    map:merge(map:for-each($input, function($k, $v) {
      map:entry($k, if($k = head($keys)) then (
        if(count($keys) > 1) then (
          maps:replace($v, tail($keys), $value)
        ) else (
          $value
        )
      ) else (
        $v
      ))
    }))
  ) else (
    $input
  )
};

(:~
 : Recursively updates map entries.
 : @param  $input  input (map, any other item)
 : @param  $keys   path to entry to update
 : @param  $update function that creates the new value
 : @return updated item
 :)
declare function maps:update(
  $input   as item()*,
  $keys    as xs:string*,
  $update  as function(item()*) as item()*
) as item()* {
  if($input instance of map(*)) then (
    map:merge(map:for-each($input, function($k, $v) {
      map:entry($k, if($k = head($keys)) then (
        if(count($keys) > 1) then (
          maps:update($v, tail($keys), $update)
        ) else (
          $update($v)
        )
      ) else (
        $v
      ))
    }))
  ) else (
    $input
  )
};

Example invocations:
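A minimal sketch of how the three functions might be called, assuming the declarations above are in scope. The sample map and the key path are made up for illustration and are not taken from the original comment.

```xquery
(: Sample data, invented for illustration. :)
let $input := map { 'a': map { 'b': 1, 'c': 2 }, 'd': 3 }
return (
  (: result: map { 'a': map { 'c': 2 }, 'd': 3 } :)
  maps:delete($input, ('a', 'b')),
  (: result: map { 'a': map { 'b': 'new', 'c': 2 }, 'd': 3 } :)
  maps:replace($input, ('a', 'b'), 'new'),
  (: result: map { 'a': map { 'b': 2, 'c': 2 }, 'd': 3 } :)
  maps:update($input, ('a', 'b'), function($v) { $v + 1 })
)
```

Because keys are compared with `$k = head($keys)`, the sketch sticks to string keys; a path that does not match any entry returns the input unchanged.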

ChristianGruen commented 12 months ago

Please see the comments in #832 for possible solutions to this feature request.