red / REP

Red Enhancement Process
BSD 3-Clause "New" or "Revised" License
10 stars 4 forks source link

`rejoin` redesign #154

Open loziniak opened 1 year ago

loziniak commented 1 year ago

I tried rejoin on some different combinations of values, and it gives some unexpected results for me, given a description, that it "joins a block of values":

>> rejoin [[2 3][4 5]]
== [2 3 [4 5]]
>> rejoin [<tag1> <tag2>]
== <tag1<tag2>>
>> rejoin [quote (2 3) quote (4 5)]
== (2 3 (4 5))

I'd expect it to work like that:

>> rejoin [[2 3][4 5]]
== [2 3 4 5]                    ; OK
>> rejoin [<tag1> <tag2>]
== <tag1tag2>                   ; OK
>> rejoin [quote (2 3) quote (4 5)]
== (2 3 4 5)                    ; OK

And there is also a strange behavior with paths:

>> p: rejoin [quote a/b quote c/d]
== a/b/c/d
>> a: [b [c [d 10]]]
== [b [c [d 10]]]
>> get p
== none                    ; p is unexpectedly broken
>> p2: to path! [a b c d]
== a/b/c/d
>> get p2
== 10                      ; ...and should work like p2
>> to block! p
== [a b c/d]               ; something is wrong
>> to block! p2
== [a b c d]
hiiamboris commented 1 year ago

tag1<tag2>> is a problem of tags, not of rejoin. And in fairness the block/path issue can be fixed by rewriting the docstring of rejoin. So what are you proposing in the end?

loziniak commented 1 year ago

I see rejoin like this:

rejoin: func [
    "Reduces and joins a block of values." 
    block [block!] "Values to reduce and join"
] [
    if empty? block: reduce block [return block]
    joined: either series? first block [copy first block] [to string! first block]
    while [not empty? block: next block] [
        append joined first block
    ]
    joined
]

And I'd get over the tags, as well as emails and urls.

loziniak commented 1 year ago

Good to keep an eye on #134 too.

hiiamboris commented 1 year ago

I see rejoin like this:

Given its importance, this won't cut, because:

;; minimal code
>> clock/times [append form 1 [2 3 4 5 6 7 8 9 10]] 1e5
2.52 μs [append form 1 [2 3 4 5 6 7 8 9 10]]
== "12345678910"

;; native rejoin, interpreted
>> rejoin: func spec-of :rejoin body-of :rejoin
>> clock/times [rejoin [1 2 3 4 5 6 7 8 9 10]] 1e5
5.86 μs [rejoin [1 2 3 4 5 6 7 8 9 10]]
== "12345678910"

;; your one
>> clock/times [rejoin [1 2 3 4 5 6 7 8 9 10]] 1e5
16.1 μs [rejoin [1 2 3 4 5 6 7 8 9 10]]
== "12345678910"
loziniak commented 10 months ago

At least it's not an order of magnitude. Perhaps it could be optimized, I did not write it with speed in mind.

greggirwin commented 9 months ago

While I don't press for TDD most of the time, I do think that spec'ing expected results would be a good plan for this. So then we know what the edge cases and designed behavior are.

greggirwin commented 9 months ago

This would be slower than now, but faster than @loziniak's.

rejoin-foreach: function [
    "Reduces and joins a block of values." 
    block [block!] "Values to reduce and join"
] [
    if empty? block: reduce block [return block]
    joined: either series? first block [copy first block] [to string! first block]
    foreach val next block [append joined val]
    joined
]

But we have to consider breaking code as well. In this case I can't guess how much depends on the current behavior, and feel fairly safe if we decide to change it. But this also falls into design chat on join/combine in this space.

loziniak commented 9 months ago

spec'ing expected results would be a good plan for this

I'd start with examples from OP:

>> rejoin [[2 3][4 5]]
== [2 3 4 5]                    ; not: [2 3 [4 5]]
>> rejoin [quote (2 3) quote (4 5)]
== (2 3 4 5)                    ; not: (2 3 (4 5))
>> p: rejoin [quote a/b quote c/d]
== a/b/c/d
>> a: [b [c [d 10]]]
== [b [c [d 10]]]
>> get p
== 10                           ; not: none
>> to block! p
== [a b c d]                    ; not: [a b c/d]
hiiamboris commented 9 months ago

This seems nice and predictable but how would one make it output a string (primary use case)?

greggirwin commented 9 months ago

a: [b [c [d 10]]]

Same as now I suppose, by putting a string as the first value in the block. Or any non-block value for that matter.

greggirwin commented 9 months ago

But we should look at this vs repend as well. As Boris notes, building strings is the primary use case, but I imagine a lot of us have code that builds blocks with it as well. Also handy for building paths, given a root path.

greggirwin commented 9 months ago

Old join/combine idea: https://gist.github.com/greggirwin/24c022cb5dde5771531a8939309ccb43

I have an updated version with /only/case/exclude thoughts.