purescript / purescript-strings

String utility functions, Char type, regular expressions.
BSD 3-Clause "New" or "Revised" License

CodePoints.uncons performance optimization? #154

Open jamesdbrock opened 3 years ago

jamesdbrock commented 3 years ago

It seems to me that these lines in Data.String.CodePoints.uncons

https://github.com/purescript/purescript-strings/blob/157e372a23e4becd594d7e7bff6f372a6f63dd82/src/Data/String/CodePoints.purs#L197-L198

are first slicing the first code unit into a Char string with the JavaScript charAt method

https://github.com/purescript/purescript-strings/blob/157e372a23e4becd594d7e7bff6f372a6f63dd82/src/Data/String/Unsafe.js#L5

and then converting the Char string to a CodePoint with the fromEnum method of the boundedEnumChar instance, which calls the JavaScript charCodeAt method.

https://github.com/purescript/purescript-enums/blob/170d959644eb99e0025f4ab2e38f5f132fd85fa4/src/Data/Enum.js#L4

We could skip the intermediate string slice of the charAt method and call charCodeAt directly.
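To make the proposal concrete, here is a minimal JavaScript sketch of the two paths. The helper names `codeUnitViaCharAt` and `codeUnitDirect` are hypothetical and are not the library's actual FFI exports; the sketch only assumes that the current path amounts to `charAt` followed by `charCodeAt(0)`, as described above.

```javascript
// Hypothetical helper names; neither is the library's actual FFI export.

// Two-step path: slice out a one-unit string, then read its code unit.
function codeUnitViaCharAt(i, s) {
  const ch = s.charAt(i);   // intermediate length-1 string
  return ch.charCodeAt(0);  // numeric UTF-16 code unit of that string
}

// Direct path: read the UTF-16 code unit in place, with no intermediate string.
function codeUnitDirect(i, s) {
  return s.charCodeAt(i);
}

// 'a' followed by U+1F600, which occupies two code units (a surrogate pair).
const sample = "a\u{1F600}";
for (let i = 0; i < sample.length; i++) {
  // Both paths should report the same code unit at every index.
  console.log(i, codeUnitViaCharAt(i, sample) === codeUnitDirect(i, sample));
}
```

Both functions return the same number for every in-range index, so the change would only affect how the value is obtained, not the result.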

JordanMartinez commented 3 years ago

Is it doing that because it makes it easier on other backends?

jamesdbrock commented 3 years ago

> Is it doing that because it makes it easier on other backends?

Maybe. That's a good point.

jamesdbrock commented 3 years ago

I tried swapping in a “fast” CodePoints.uncons function in purescript-parsing and couldn't detect any speedup.
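For anyone who wants to check the two access patterns in isolation, a rough micro-benchmark sketch in plain JavaScript follows. This is not the purescript-parsing test mentioned above; the function names, input string, and round count are all assumptions chosen for illustration, and timings from a loop like this vary by engine and run.

```javascript
// Rough micro-benchmark sketch; all names and sizes here are illustrative assumptions.

// Sum every code unit of `s`, `rounds` times, using the supplied reader.
function sumCodeUnits(readUnit, s, rounds) {
  let total = 0;
  for (let r = 0; r < rounds; r++) {
    for (let i = 0; i < s.length; i++) {
      total += readUnit(s, i);
    }
  }
  return total;
}

// Two-step read: intermediate one-unit string, then its code unit.
const viaCharAt = (s, i) => s.charAt(i).charCodeAt(0);
// Direct read of the code unit.
const direct = (s, i) => s.charCodeAt(i);

// Time one reader and return its checksum so the work cannot be skipped.
function timeIt(label, readUnit, s, rounds) {
  const start = Date.now();
  const total = sumCodeUnits(readUnit, s, rounds);
  const elapsedMs = Date.now() - start;
  console.log(label + ": " + elapsedMs + " ms, checksum " + total);
  return total;
}

const input = "hello, world ".repeat(1000);
const checksumA = timeIt("charAt + charCodeAt", viaCharAt, input, 200);
const checksumB = timeIt("charCodeAt only", direct, input, 200);
```

The two checksums should match, which confirms both readers did the same work; only the elapsed times are of interest.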