uiua-lang / uiua

A stack-based array programming language
https://www.uiua.org
MIT License

find ⌕ "ab" "abracadabra" example: shapes differ by 1 #158

Closed tanpau closed 10 months ago

tanpau commented 10 months ago

find ⌕ "ab" "abracadabra" from the examples evaluates to the vector [1 0 0 0 0 0 0 1 0 0]. But △ "abracadabra" is [11], while △ [1 0 0 0 0 0 0 1 0 0] is [10]. Is this the intended behavior?

If not, is it possible to extend the matching vector of find to something like [1 2 0 0 0 0 0 1 2 0 0] in this case, or in other cases:

⌕ "aa" "abaabbccaab" => [0 0 1 1 0 0 0 0 1 1 0]

⌕ "aab" "abaabbccaab" => [0 0 1 1 2 0 0 0 1 1 2]

Would make partitioning much easier ...

Awesome language btw :)

iFreilicht commented 10 months ago

Yes, as find basically creates windows over the input and checks each one for equality, you're always going to end up with a length that is s-x+1 where s is the length of the array you're searching in and x is the length of the array you're searching for.
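
To make the window argument concrete, here is a minimal Python sketch of that behavior. The name `find_windows` is made up for illustration and this is not Uiua's actual implementation; it only mirrors the "compare every window" description above.

```python
def find_windows(needle, haystack):
    """One 0/1 entry per window of len(needle) slid over haystack."""
    x, s = len(needle), len(haystack)
    # There are exactly s - x + 1 windows; each is compared to the needle as a whole.
    return [int(haystack[i:i + x] == needle) for i in range(s - x + 1)]


mask = find_windows("ab", "abracadabra")
print(mask)       # [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
print(len(mask))  # 10, i.e. 11 - 2 + 1
```

With a 1-element needle the result has the same length as the input, which is why the shape difference only shows up for longer search terms.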

I'm not sure I understand why you would want to partition here. If you use the partitions you suggest, you'll just get back an array like ["aa" "aa"] or ["aab" "aab"]. What are you actually trying to achieve here?

kaikalii commented 10 months ago

I have contemplated making find put 1s for every element that is part of the found pattern, not just the first, as that's often what I want. I'm not sure what consequences this would have though.
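
For comparison, a rough Python sketch of the contemplated variant, where every element covered by a match gets a 1 and the result keeps the length of the searched array. `find_cover` is a hypothetical name, and this is only one possible reading of the idea, not a description of what Uiua does.

```python
def find_cover(needle, haystack):
    """Mark every position that belongs to some occurrence of needle."""
    x, s = len(needle), len(haystack)
    mask = [0] * s  # same length as the haystack, unlike the windowed result
    for i in range(s - x + 1):
        if haystack[i:i + x] == needle:
            for j in range(i, i + x):
                mask[j] = 1
    return mask


print(find_cover("aa", "abaabbccaab"))  # [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0]
```

One consequence of this reading: adjacent or overlapping matches merge into a single run of 1s, so the number of matches can no longer be read off the mask directly.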

tanpau commented 10 months ago

Thanks for the responses. I didn't know that find uses ◫ windows; with that it makes sense.

I'm new to the APL & Co. world. I started with some YouTube videos from code report on AoC 2022 and tried to translate some of the Dyalog solutions from there to Uiua ... In this case I had many problems parsing the input of the Day 1 problem and came up with this solution:

```
p ← &rs ∞ &fo "input/aoc_2022_d01_0.txt"
⍤∶≅, "1000\n2000\n3000\n\n4000\n\n5000\n6000\n\n7000\n8000\n9000\n\n10000" p

q ← (
  =0⌕"\n\n".p # find (start) indices of "subarray"
  ⇌⊂1⇌ # fix shapes (search-chars minus 1 added at end)
  ⊜□ # box results because different lengths
)
⍤∶≅, {"1000\n2000\n3000" "\n4000" "\n5000\n6000" "\n7000\n8000\n9000" "\n10000"} q

r ← ≡(□⊜'parse□ =0⌕"\n" .⊔) q
⍤∶≅, {[1000 2000 3000] [4000] [5000 6000] [7000 8000 9000] [10000]} r

Sums ← ≡'/+⊔ r
```

So I "solved" the problem, but maybe not very idiomatically. With the proposed behavior of find I could partition inhomogeneous input data directly and more generically (without "corrective measures"), I guess ... I wasn't sure whether it's a bug, because 1-element search terms yield the same shapes, obviously ... and I haven't used windows till now ...
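
For readers who don't read Uiua, the steps of this solution can be sketched in Python roughly as follows. The helpers `find_windows` and `partition` are illustrative stand-ins, not Uiua's implementations, and `partition` is simplified to treat every nonzero marker as "keep" and every zero as a dropped separator.

```python
def find_windows(needle, haystack):
    """Windowed find: one 0/1 entry per window, so len(haystack) - len(needle) + 1 entries."""
    x = len(needle)
    return [int(haystack[i:i + x] == needle) for i in range(len(haystack) - x + 1)]


def partition(mask, text):
    """Group consecutive characters whose mask entry is nonzero; zeros are dropped."""
    groups, current = [], []
    for keep, ch in zip(mask, text):
        if keep:
            current.append(ch)
        elif current:
            groups.append("".join(current))
            current = []
    if current:
        groups.append("".join(current))
    return groups


text = "1000\n2000\n3000\n\n4000\n\n5000\n6000\n\n7000\n8000\n9000\n\n10000"
needle = "\n\n"

starts = find_windows(needle, text)
# The "corrective measure": negate, then pad with len(needle) - 1 ones so the
# mask is as long as the text again.
keep = [1 - m for m in starts] + [1] * (len(needle) - 1)

chunks = partition(keep, text)
print(chunks)  # ['1000\n2000\n3000', '\n4000', '\n5000\n6000', '\n7000\n8000\n9000', '\n10000']

# Second pass: split each chunk on whitespace, parse, and sum.
sums = [sum(int(n) for n in chunk.split()) for chunk in chunks]
print(sums)    # [6000, 4000, 11000, 24000, 10000]
```

Only the first newline of each blank-line separator is dropped by the first partition, which is why the chunks keep a leading `\n` and need the second pass.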

Even though Uiua is more approachable for me than the others, it's still a mighty learning curve ...

But fun :)

ArseniyKorobenko commented 10 months ago

I agree that putting ones for every letter of the found string would be more useful. You could then use \> to get the old behavior, (EDIT: that function does the wrong thing actually.) while doing the inverse is considerably harder.

Regarding the solution you posted, the fact that find only shows the first matched letter doesn't really matter in this case: here's how I would parse it

ArseniyKorobenko commented 10 months ago

The reason find a b returns an array of length b-a+1 (in terms of the lengths of a and b) is that there are, for example, only 3 ways of fitting a string of length 2 ("++" in this example) into a string of length 4: "++--", "-++-" and "--++". Since "++" couldn't possibly start at the last index, the last index gets cut off.

tanpau commented 10 months ago

Thanks for ⊜(□⊜parse≠,@\n)¬⊂∶0⌕"\n\n".s. Yes, for this case it's irrelevant that find only shows the first matched letter, because the remaining \n have to be stripped by a second partition anyway ..

But in a modified docs example like find ⌕ "ab" "babracadabra", which evaluates to the vector [0 1 0 0 0 0 0 0 1 0 0], there is no easy way to partition on the search string with find's result. There, "putting 1s for every element that is part of the found pattern", as kaikalii pondered, would do the trick (with padding of the search string's length - 1 at the end ..)

But that may be worth (or not) another discussion, so I guess we could close this thread (about my misunderstanding of how find works ... ;)
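
To illustrate that point with the "babracadabra" example, here is a small Python sketch of splitting on the search string once the mask covers whole matches. The names `cover_mask` and `split_on` are invented for this example, and it assumes the "1 for every matched element" reading of the idea rather than any particular Uiua behavior.

```python
def cover_mask(needle, haystack):
    """1 for every position inside an occurrence of needle, 0 elsewhere."""
    n = len(needle)
    mask = [0] * len(haystack)
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            mask[i:i + n] = [1] * n
    return mask


def split_on(needle, haystack):
    """Keep the runs of characters that are not covered by any match."""
    parts, current = [], []
    for hit, ch in zip(cover_mask(needle, haystack), haystack):
        if not hit:
            current.append(ch)
        elif current:
            parts.append("".join(current))
            current = []
    if current:
        parts.append("".join(current))
    return parts


print(split_on("ab", "babracadabra"))  # ['b', 'racad', 'ra']
```

With only the start positions marked, the second character of each match would stay glued to the following part, which is the difficulty described above.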

kaikalii commented 10 months ago

The behavior of find has been changed in 5e58bb29dbab3d.