Oldes / Rebol-issues

Issue tracker for https://github.com/oldes/Rebol3

Can we have set operations on map! please? #1984

Closed: Siskin-Bot closed this issue 1 year ago

Siskin-Bot commented 4 years ago

Submitted by: Sunanda

In R2 I can use hash! for fast set operations, e.g.:

  dataset1: to-hash [1 2 3]
  dataset2: to-hash [1 3 5]
  find dataset1 1                          ;; membership yes/no
  datasetu: union dataset1 dataset2        ;; set of all members
  datasetd: difference dataset1 dataset2   ;; set of differences
  dataseti: intersect dataset1 dataset2    ;; set of common members

The resulting datasets are themselves hash! values, so they are fast to search.

In R3 I can do set membership like this:

  dataset1: to-map [1 true 2 true 3 true]
  dataset2: to-map [1 true 3 true 5 true]
  select dataset1 1    ;; membership yes/no

And I can do other operations like this:

  datasetu: union words-of dataset1 words-of dataset2        ;; set of all members
  datasetd: difference words-of dataset1 words-of dataset2   ;; set of differences
  dataseti: intersect words-of dataset1 words-of dataset2    ;; set of common members

But what I end up with is a block!, not a map!.

I know I can convert datasetu etc. back to a map!, but that's quite an overhead in code.
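
To make that overhead concrete, the conversion could be wrapped in a small helper. This is only a minimal sketch: `keys-to-map` is a hypothetical name, not a built-in, and it assumes every member should be stored with the value true, as in the examples above.

```rebol
;; keys-to-map is a hypothetical helper, not part of Rebol.
;; It builds a map! whose keys are the items of the block
;; and whose values are all true.
keys-to-map: func [keys [block!] /local result] [
    result: make map! []
    foreach key keys [
        ;; APPEND on a map! takes key/value pairs
        append result reduce [key true]
    ]
    result
]

;; The union of the keys is first built as a block!,
;; then converted back into a map!
datasetu: keys-to-map union words-of dataset1 words-of dataset2
```

Each set operation then needs an extra pass over the result, which is the cost a native map! set operation would avoid.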


Imported from: CureCode [ Version: r3 master, Type: Wish, Platform: All, Category: Native, Reproduce: Always, Fixed-in: none ]
Imported from: https://github.com/rebol/rebol-issues/issues/1984

Comments:

Rebolbot commented on Mar 6, 2013:

Submitted by: BrianH

Yes please! UNIQUE could just be a shallow copy, but the rest would be useful. And it might make sense to do a UNION, INTERSECT, DIFFERENCE or EXCLUDE with a map and a block that would be interpreted as a collection of keys. Their /skip refinements can be ignored though, just as they are with SELECT.

Note that since keys are unique anyway we can just compare on keys. No #428 /skip issues here.


Rebolbot commented on Mar 6, 2013:

Submitted by: abolka

+1 as well.

Note that this leaves open the question of what to do with values for the UNION and INTERSECT set operations. While there are several useful possibilities, I suggest implementing the most straightforward one: by definition, have the value from the second (*) series argument ("SET2") survive:

    >> union map [a 1 b 2] map [b 3 c 4]
    == map [a 1 b 3 c 4]
    >> intersect map [a 1 b 2] map [b 3 c 4]
    == map [b 3]

(*) Alternatively, define that the value from the first series argument survives. That may be better if we read the definition as giving "precedence" to one of the two arguments, with the first argument considered "more important" than the second.

On the other hand, the definition of having values from the second series argument survive is nicely in line with how the MAP constructor behaves:

    >> map [a 1 b 2 b 3 c 4]
    == map [a 1 b 3 c 4]

(With this reading, UNION could for example be considered as the result of joining both map body blocks and then creating a map from that.)


Rebolbot commented on Mar 6, 2013:

Submitted by: BrianH

The set functions themselves are otherwise first-wins, so it would make sense to be first-wins here as well. For example:

>> get first union reduce [in construct [a: 1] 'a] reduce [in construct [a: 2] 'a]
== 1  ; the first a won

And since map operations all imply /skip 2 /compare 1, we won't have to worry about making that the default, assumed behavior.
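
The two value policies under discussion can be sketched for INTERSECT using only functions already shown in this thread. The names `intersect-first` and `intersect-second` are hypothetical, chosen for illustration; this is not how any native implementation is written.

```rebol
;; Hypothetical helpers, not built-ins. They only illustrate
;; the first-wins and second-wins policies discussed above.

;; Keys common to both maps, values taken from the first map.
intersect-first: func [m1 [map!] m2 [map!] /local result] [
    result: make map! []
    foreach key intersect words-of m1 words-of m2 [
        append result reduce [key select m1 key]
    ]
    result
]

;; Same keys, values taken from the second map.
intersect-second: func [m1 [map!] m2 [map!]] [
    intersect-first m2 m1
]
```

Both helpers compare on keys only, so the choice between them affects nothing but which value survives for a shared key.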


Rebolbot mentioned this issue on Jan 22, 2016: [Epic] Holes in our evaluation model


Rebolbot added the Type.wish label on Jan 12, 2016


Oldes commented 4 years ago

Just a note: a union of two maps can be done like this:

>> append copy m1: #(a 1 b 2) body-of m2: #(b 3 c 4)
== #(
    a: 1
    b: 3
    c: 4
)

and with reversed winner:

>> append copy m2 body-of m1
== #(
    b: 2 ;<---
    c: 4
    a: 1
)

Oldes commented 1 year ago

@dsunanda it's implemented now ;-)