fsprojects / FSharpx.Extras

Functional programming and other utilities from the original "fsharpx" project
https://fsprojects.github.io/FSharpx.Extras/
The Unlicense
682 stars 146 forks source link

String.toPascalCase #429

Open AlbertoDePena opened 1 year ago

AlbertoDePena commented 1 year ago

Description

Feature request

[<RequireQualifiedAccess>]
module String =

      let toPascalCase (value: string) =
              let character = value.[0]
              let letter = character.ToString().ToUpper()

              let pascalCased = letter + value.Substring(1)

              pascalCased
gdziadkiewicz commented 1 year ago

Hi Alberto! The request is legit, but I'm not sure about the name toPascalCase. It looks more like capitalization(https://en.wikipedia.org/wiki/Capitalization) than PascalCase(https://en.wikipedia.org/wiki/Camel_case#Variations_and_synonyms) IMO. What do you think?

There are also some implementation details to work out like handling null and empty string. The proposed imp will blow up on them. We would also need to confirm the current approach to culture in the package and make sure it stays consistent.

Would you like to try implementing this and submitting a PR?

AlbertoDePena commented 1 year ago

@gdziadkiewicz

That's fair. The name can be changed... So to account for null and empty string are we saying it should return option or result type?

gdziadkiewicz commented 1 year ago

Hi Alberto, sorry for the delay. The type is fine, IMO option/result doesn't add value here.

I checked the other implementations and think we should go with not specifying the culture aka using the one that is set up globally (done by ToUpper() ).

As for the null handling I reconsidered it and it is kind of expected that you will get an exception if you pass null.

For the empty string I would go with empty string to mimick other methods/functions like ToUpper().

Consider something along the lines of:

open System

[<RequireQualifiedAccess>]
module String =
      let capitalize (s: string) =
              match s with
              | "" -> ""
              | s ->
                  let chars =  s.ToCharArray() //initially want to go with Array.ofSeq s, but it turned out to be slooooow
                  chars.[0] <- Char.ToUpper(chars.[0])
                  String chars

It does fewer string allocations than your implementation and should (I didn't check it) be quicker than the one you provided (it would be nice to microbenchmark this in order to make an informed decision about the implementation).

EDIT: I just "checked" it with #time in fsi and my idea of using Array.ofSeq s went through the window. Replaced it with ToCharArray()

Quick "testing" results for this approach:

> open System;;
>       let capitalize (s: string) =
-               match s with
-               | "" -> ""
-               | s ->
-                   let chars =  s.ToCharArray()
-                   chars.[0] <- Char.ToUpper(chars.[0])
-                   String chars;;
val capitalize: s: string -> string

> capitalize "";;
val it: string = ""

> capitalize "dog";;
val it: string = "Dog"

> capitalize "dog barks";;
val it: string = "Dog barks"

> capitalize null;;
System.NullReferenceException: Object reference not set to an instance of an object.
   at FSI_0016.capitalize(String s) in C:\Users\grzeg\stdin:line 53
   at <StartupCode$FSI_0018>.$FSI_0018.main@() in C:\Users\grzeg\stdin:line 60
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
Stopped due to error
>
gdziadkiewicz commented 1 year ago

Could you provide a PR for that? I can do the review, help with testing, perf, or anything you don't feel comfortable with.

gdziadkiewicz commented 1 year ago

Initial performance "check":

> for _ in 1..100000000 do ignore (capitalize "sdasdasdasdasdasdsadasdasdas");;
Real: 00:00:05.704, CPU: 00:00:05.734, GC gen0: 2550, gen1: 0, gen2: 0
val it: unit = ()

> for _ in 1..100000000 do ignore(toPascalCase "sdasdasdasdasdasdsadasdasdas");;
Real: 00:00:08.939, CPU: 00:00:08.968, GC gen0: 3315, gen1: 0, gen2: 0
val it: unit = ()