JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

`map(f, ::String)` requires `f` to return an `AbstractChar` #54580

Open Seelengrab opened 5 months ago

Seelengrab commented 5 months ago

As seen on discourse.

MWE:

julia> 'あ' |> UInt32
0x00003042

julia> map(UInt32, "あいうえお")
ERROR: ArgumentError: map(f, s::AbstractString) requires f to return AbstractChar; try map(f, collect(s)) or a comprehension instead
Stacktrace:
 [1] map(f::Type{UInt32}, s::String)
   @ Base ./strings/basic.jl:669
 [2] top-level scope
   @ REPL[5]:1

IMO this should return a Vector{UInt32} instead. The error was introduced in https://github.com/JuliaLang/julia/commit/f08ba8d4de15ec4b7cd4abc79529610a8e4c4d85; that commit already carries a comment noting that it would be good not to be too restrictive here.
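For reference, the two workarounds named in the error message can be sketched as below. This is only an illustration of what works today, not a proposed fix; the variable names are made up for the example.

```julia
s = "あいうえお"

# Workaround 1: collect the string into a Vector{Char}, then map over that.
via_collect = map(UInt32, collect(s))

# Workaround 2: a comprehension iterates the characters of the string directly.
via_comprehension = [UInt32(c) for c in s]

# Both produce a Vector{UInt32} holding the code points.
@assert via_collect isa Vector{UInt32}
@assert via_collect == via_comprehension
@assert first(via_collect) == 0x00003042  # same value as 'あ' |> UInt32 above
```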

KristofferC commented 5 months ago

map on strings is a bit special in that map(f, str) is not equal to map(f, collect(str)) even though they iterate the same values.

julia> map(x->x+1, "foo")
"gpp"

julia> map(x->x+1, collect("foo"))
3-element Vector{Char}:
 'g': ASCII/Unicode U+0067 (category Ll: Letter, lowercase)
 'p': ASCII/Unicode U+0070 (category Ll: Letter, lowercase)
 'p': ASCII/Unicode U+0070 (category Ll: Letter, lowercase)

There should probably be an entry in the manual about this. Maybe there should have been a special stringmap or something for this operation but it does kind of do what you "expect".

Seelengrab commented 5 months ago

I think it's inconsistent: lots of things default to returning a Vector with an inferred eltype, depending on the function passed in. The underlying question that ought to be answered is whether map needs to be container-type preserving (i.e., whether it only maps "inside" the container), which I don't think is something that we've enforced so far.

JeffBezanson commented 5 months ago

map usually is container-type preserving. I think it's better to do this operation with a comprehension than have the container type depend on the type of value returned by the function.

Seelengrab commented 5 months ago

If that's the contract of map, we should document that. There's also the issue that the default/most generic map explicitly constructs a generator and collects it:

https://github.com/JuliaLang/julia/blob/48964736fba36c8289749cc9a575b41fdda87dc8/base/abstractarray.jl#L3422

This fallback was added in a2ce2309863564aed3fba4f8927d586b11983219, so requiring map to preserve the container type would definitely be a breaking change. The fallback is also what makes map over a String inconsistent with the default behavior.
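A minimal sketch of the inconsistency being described, assuming the generic method is equivalent to collect(Base.Generator(f, itr)) as in the linked line. The helper `square` and the iterator `itr` are hypothetical names chosen for illustration.

```julia
square(x) = x^2

# A lazy iterator that is not an AbstractArray, so `map` should dispatch
# to the generic fallback rather than to an array-specific method.
itr = Iterators.take(1:10, 3)

via_map = map(square, itr)
via_generator = collect(Base.Generator(square, itr))

# The fallback returns a Vector, not the input's container type.
@assert via_map == via_generator == [1, 4, 9]
@assert via_map isa Vector{Int}

# The element type follows whatever the function returns.
@assert map(string, itr) isa Vector{String}

# Strings have a dedicated method instead, which returns a String
# and therefore needs `f` to return an AbstractChar.
@assert map(uppercase, "foo") == "FOO"
```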