Open hiiamboris opened 1 year ago
Great stuff @hiiamboris.
if all [deep once] [cause-error 'script 'bad-refines []] ;-- incompatible
Isn't it possible that you'd want to replace a single value, anywhere in a deep structure? I agree that the normal case is /all
with /deep
, and this could be relaxed later if asked for. First use case I think of is that some data value shows up in an error scenario, and you want to test with the data to see if it happens in every case, or what related data there may be.
unless any-block? series [deep: off]
This is an interesting one. If we remove paren support from paths, they shouldn't be nestable, right?
Also on paths, are we sure we want this behavior?
>> replace1/deep copy [a b c [a/b/c]] 'b 'x
== [a x c [a/x/c]]
In some cases that will be exactly what we want, but should we always treat block/list the same?
Does size
need to be dynamic?
;; pattern size will be found out after first match:
size: [size: offset? match find/:case/:same/:only/match/tail match :pattern]
It's niftier than size: either only [1][length? pattern]
, but even with the comment I had to scan and parse things a few times to figure it out as future references and do
use are tricky. It is a nice use of fixed+dynamic refinements though.
Setup thought, just more straight-line:
start: series ;-- starting offset may be adjusted if part is negative
limit: system/words/case [
none? limit [tail series]
integer? limit [skip start limit] ;-- convert limit to series, or will have to update it all the time
; no action if already a series.
]
if back?: negative? offset? start limit [ ;-- flip start/end if limit was negative, always go forward
start: limit
limit: series
]
I'm reminded that I don't love =?
, preferring same?
. I don't think it's a great symbol, given the meaning. ===
seems better if we're escalating comparison strictness, from = == ===
, but even then I prefer same?
for explicitness.
Isn't it possible that you'd want to replace a single value, anywhere in a deep structure? I agree that the normal case is
/all
with/deep
, and this could be relaxed later if asked for. First use case I think of is that some data value shows up in an error scenario, and you want to test with the data to see if it happens in every case, or what related data there may be.
This case sounds made up to me ;) But anyway, if replace/once
returns position after the replacement, what should it return in deep mode? Another series? How do you continue your replacement from another series? This is the main issue.
unless any-block? series [deep: off]
This is an interesting one. If we remove paren support from paths, they shouldn't be nestable, right?
Paths constructed at runtime will still be.
Also on paths, are we sure we want this behavior?
>> replace1/deep copy [a b c [a/b/c]] 'b 'x == [a x c [a/x/c]]
In some cases that will be exactly what we want, but should we always treat block/list the same?
Not sure, but paths are troublesome without it. Question is, is it worth another refinement, e.g. /paths?
There's also another direction here: to run deep replacement on all inner series - strings, binaries, vectors. Should we have control over it or we just forward everyone into mapparse
with this?
Does
size
need to be dynamic?;; pattern size will be found out after first match: size: [size: offset? match find/:case/:same/:only/match/tail match :pattern]
It's niftier than
size: either only [1][length? pattern]
, but even with the comment I had to scan and parse things a few times to figure it out as future references anddo
use are tricky. It is a nice use of fixed+dynamic refinements though.
Let it be your homework then to find cases where size: either only [1][length? pattern]
doesn't work :)
I'm using find
because my head would explode from an attempt to figure out a reliable future-proof and concise way of manually keeping it in sync with find
.
Setup thought, just more straight-line:
That may be less nested, but also heavier for the no-limit case.
I haven't by the way considered the past-tail index case here. I'd prefer a language-wide solution for it, rather than patching holes arbitrarily.
Good thoughts. More questions. :^\
That may be less nested, but also heavier for the no-limit case.
I'm all for profiling it in the grand scheme of replace
. :^)
I'm all for profiling it in the grand scheme of
replace
. :^)
OK (init time, sample size=10):
Cool, I can't tell what those numbers mean though, unless it makes the whole thing 10% slower, which is surprising.
It's no surprise. offset?
is a mezz.
OK then, just flip the blocks so the check is none? limit
.
Done.
Cool. How is =?
better than none?
here, for my edification, as it's non-idiomatic?
Cool. How is
=?
better thannone?
here, for my edification, as it's non-idiomatic?> clock/times [1 =? none] 1e7 0.12 μs [1 =? none] > clock/times [none? 1] 1e7 0.25 μs [none? 1]
unless pos =? match
I stumbled for a couple seconds there, as I wondered what was =?
... Seriously, everybody is using the much more readable same?
, we should remove =?
entirely, as it's really non-intuitive. You could simply write unless same? pos match
...
Also:
>> clock/times [none =? limit] 1e7
0.06 μs [none =? limit]
== false
>> clock/times [not limit] 1e7
0.06 μs [not limit]
@hiiamboris, which was why I said " in the grand scheme of replace" initially. profile
tells the story nicely.
+1 for removing =?
, but then I feel the same about unless
. Do you remember why Carl added it? :^)
Rationale
Current
replace
implementation was stiched together by different people with different small needs and no central design:find
toparse
mode when /deep is used and in other hard to predict cases, totally changing semantics and disabling the possibility of/same
comparison and fast hash lookupsfind
mode, adds loop properties to it and makes a little step intomap-each
/mapparse
territorychange
makes it of quadratic complexity, seriously limiting applicabilityfind
features (/only, /same, /part) are not supported by itSee also https://github.com/red/red/issues/4945 previously ignored
Proposal
New implementation:
replace
(Parse is left tomapparse
)value
is a function, it's just insertedapply
https://github.com/red/REP/issues/83Arguments in favor of
mapparse
and againstreplace/parse
refinement:pattern
, but alsovalue
changes meaning in /parse modereplace
expects a constant value (it's optimized for that), whilemapparse
evaluates a code block (how to tell /parse mode if block should be treated as code or as a value)?mapparse
is consistent withforparse
, and if we integrate it intoreplace
, what to do withforparse
?forparse
andmapparse
are loops, so they supportbreak
&continue
in their code blocks, so their logic is differentmapparse
has different argument order - same as the other loops (foreach, repeat)Performance comparison
I have made two variants of the new implementation:
replace2
is straightforward and compiler-friendly, whilereplace1
uses blocks of code to make the loop tighter (but they are interpreted).replace/all
is the old implementation.Due to compiler benefits
replace2
usually wins overreplace1
. And considering that get-words in paths cannot be compiled yet, and it includes workarounds for https://github.com/red/red/issues/5319 and https://github.com/red/red/issues/5320 , it should perform better once those kludges are removed. Compared to the old implementation, performance is much more predictable and linear, with worst case slowdown currently down to 23% due to allocation of a new buffer.Tests
replace
test suite adapted for the new version is here. Some tests were refitted withmapparse
. Some currently fail due to https://github.com/red/red/issues/5321