ocaml / ocaml.org

The official OCaml website.
https://ocaml.org

Use uucp caselesseq instead of structural equality and String.lowercase_ascii #2444

Open cuihtlauac opened 4 months ago

cuihtlauac commented 4 months ago

In the ocaml.org source code, strings are often compared or searched case-insensitively. Most often, this is done using String.lowercase_ascii together with either OCaml structural equality (=) or standard library functions such as String.sub or String.starts_with.
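
To make the pattern concrete, here is a minimal sketch of it using only the standard library. The helper name `ascii_caseless_equal` is hypothetical and does not come from the ocaml.org code base; the strings are written as UTF-8 byte escapes.

```ocaml
(* Hypothetical helper mirroring the pattern described above:
   lowercase both sides with String.lowercase_ascii, then compare
   with structural equality. *)
let ascii_caseless_equal s1 s2 =
  String.lowercase_ascii s1 = String.lowercase_ascii s2

let () =
  (* Pure ASCII input behaves as expected. *)
  assert (ascii_caseless_equal "OCaml" "ocaml");
  (* "École" vs "école" in UTF-8: lowercase_ascii leaves the bytes of
     the non-ASCII letter untouched, so the two strings stay different. *)
  assert (not (ascii_caseless_equal "\xC3\x89cole" "\xC3\xA9cole"))
```

The second assertion shows the limitation: only the letters A to Z are lowercased, so any non-ASCII letter is still compared case-sensitively.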

However, as @Octachron has noted, this is not robust. String.lowercase_ascii only changes the ASCII letters A to Z, so non-ASCII text is still compared case-sensitively. We should use the i18n-aware functions from the Uucp library instead. Since ocaml.org already depends on this library, this adds no new dependency. See: https://github.com/ocaml/ocaml.org/pull/2442

There are several tasks involved here:

  1. [ ] Locate places where case-insensitive string comparison takes place
  2. [ ] Use Uucp functions to perform those comparisons
  3. [ ] Check no regression takes place
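
As a rough sketch of what task 2 could look like, the code below builds a case-folded comparison key by decoding UTF-8 and applying a per-character fold function. The names `casefold_key`, `caseless_equal`, and `demo_fold` are hypothetical. To keep the sketch runnable with the standard library alone (OCaml 4.14 or later), the fold function is passed as a parameter and `demo_fold` is only a stand-in covering ASCII and Latin-1 capitals. It also skips Unicode normalization, which a complete solution would likely need.

```ocaml
(* Stand-in fold, for demonstration only: lowercases ASCII and Latin-1
   capital letters and leaves everything else unchanged. *)
let demo_fold u =
  let c = Uchar.to_int u in
  if (c >= 0x41 && c <= 0x5A) || (c >= 0xC0 && c <= 0xDE && c <> 0xD7)
  then `Uchars [ Uchar.of_int (c + 0x20) ]
  else `Self

(* Decode [s] as UTF-8, fold each character with [fold], and re-encode
   the result as UTF-8. Malformed bytes decode to U+FFFD. *)
let casefold_key ~fold s =
  let b = Buffer.create (String.length s) in
  let rec loop i =
    if i < String.length s then begin
      let d = String.get_utf_8_uchar s i in
      let u = Uchar.utf_decode_uchar d in
      (match fold u with
       | `Self -> Buffer.add_utf_8_uchar b u
       | `Uchars us -> List.iter (Buffer.add_utf_8_uchar b) us);
      loop (i + Uchar.utf_decode_length d)
    end
  in
  loop 0;
  Buffer.contents b

(* Two strings are caseless-equal when their folded keys are equal. *)
let caseless_equal ~fold s1 s2 =
  String.equal (casefold_key ~fold s1) (casefold_key ~fold s2)

let () =
  (* "École" and "éCOLE", written as UTF-8 byte escapes. *)
  assert (caseless_equal ~fold:demo_fold "\xC3\x89cole" "\xC3\xA9COLE");
  assert (not (caseless_equal ~fold:demo_fold "ocaml" "opam"))
```

In the real code base, the intent would presumably be to pass `Uucp.Case.Fold.fold` as `~fold`, on the assumption that its result type matches the `` `Self | `Uchars `` shape used here.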

sagnikc395 commented 2 months ago

Ran grep to find the files where case-insensitive string comparison takes place:

  1. String.sub
    ./global/import.ml:9:12:        if String.sub s1 i len = s2 then raise Exit
    ./ocamlorg_data/data.ml:120:14:          if String.sub s1 i len = s2 then raise Exit
    ./ocamlorg_frontend/pages/outreachy.eml:32:65:                                    <%s String.uppercase_ascii (String.sub project.mentee 0 1); %>
    ./ocamlorg_frontend/pages/outreachy.eml:43:65:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
    ./ocamlorg_frontend/pages/package_overview.eml:27:35:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
    ./ocamlorg_frontend/pages/home.eml:311:28:                      <%s! String.sub item.body 0 (min (String.length item.body - 1) 100) %>...
    ./ocamlorg_frontend/components/search.eml:24:49:      if content_length < length then text else String.sub text 0 length
    ./ocamlorg_frontend/components/search.eml:32:12:      else String.sub text (content_length - length) length
  2. String.uppercase_ascii
    ./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub project.mentee 0 1); %>
    ./ocamlorg_frontend/pages/outreachy.eml:                                    <%s String.uppercase_ascii (String.sub mentor 0 1); %>
    ./ocamlorg_frontend/pages/package_overview.eml:      <%s String.uppercase_ascii (String.sub user.name 0 1); %>
  3. String.lowercase_ascii
    ./ocamlorg_package/lib/ocamlorg_package.ml:543:15:    let str = String.lowercase_ascii str in
    ./ocamlorg_package/lib/ocamlorg_package.ml:561:31:  let match_ f s pattern = f (String.lowercase_ascii @@ s) pattern
    ./ocamlorg_web/lib/handler.ml:218:19:    let pattern = String.lowercase_ascii pattern in
    ./ocamlorg_web/lib/handler.ml:219:33:    let name_is_s { name; _ } = String.lowercase_ascii name = pattern in
    ./ocamlorg_web/lib/config.ml:4:9:  match String.lowercase_ascii s with "true" | "1" -> true | _ -> false
    ./ocamlorg_data/data.ml:115:19:    let pattern = String.lowercase_ascii s in
    ./ocamlorg_data/data.ml:127:30:           contains pattern (String.lowercase_ascii name))
sagnikc395 commented 2 months ago

Looking into Uucp functions to make those comparisons.
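
Several of the call sites in the grep results are substring searches rather than equality tests, for example `contains pattern (String.lowercase_ascii name)` in data.ml. One possible way to migrate them, sketched below, is to make the key function a parameter so that String.lowercase_ascii can later be swapped for a Unicode-aware case fold. The names `is_substring` and `caseless_contains` are hypothetical and only the standard library is used.

```ocaml
(* [is_substring ~sub s] is true when [sub] occurs somewhere in [s].
   Naive byte-wise search, which is enough for a sketch. *)
let is_substring ~sub s =
  let n = String.length sub and m = String.length s in
  let rec at i = i + n <= m && (String.sub s i n = sub || at (i + 1)) in
  at 0

(* Apply the same [key] function to both sides before searching. *)
let caseless_contains ~key ~pattern s =
  is_substring ~sub:(key pattern) (key s)

let () =
  (* Current behaviour: ASCII-only lowercasing as the key. *)
  assert (caseless_contains ~key:String.lowercase_ascii ~pattern:"CAML" "OCaml.org");
  assert (not (caseless_contains ~key:String.lowercase_ascii ~pattern:"rust" "OCaml.org"))
```

Because UTF-8 is self-synchronizing, a byte-wise search over two folded UTF-8 keys cannot match in the middle of a character, so only the `~key` argument would need to change.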