Unexpected behavior when mixing numbers and characters on a name

clj-commons / camel-snake-kebab

A Clojure[Script] library for word case conversions

https://clj-commons.org/camel-snake-kebab/

Eclipse Public License 1.0


Unexpected behavior when mixing numbers and characters on a name #42

Open ricardojmendez opened 7 years ago

ricardojmendez commented 7 years ago

There's unexpected behavior when mixing numbers and characters in an identifier, and none of the tests demonstrate it.

This makes sense:

(->kebab-case-keyword :user_id)
=> :user-id

(->kebab-case-keyword :user_1)
=> :user-1

But then...

(->kebab-case-keyword :user1_124)
=> :user-1-124

(->kebab-case-keyword :user1)
=> :user-1

It gets worse:

(->kebab-case-keyword :object1a2)
=> :object-1a-2

While I can see from the source that this is the intended grouping behavior, it goes against the principle of least surprise. I'm not sending a PR to change it, since I expect it would break existing uses (we have to assume that someone, somewhere, is relying on this).
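For anyone trying to build intuition for the grouping, here is a rough sketch that reproduces the results above with plain clojure.string and a regex. It is an approximation inferred from the examples in this issue, not the library's actual implementation; approx-separator and approx-kebab are hypothetical names. The assumed rule: split on explicit separators, at a lowercase-to-uppercase boundary, and at a letter-to-digit boundary, but not at a digit-to-letter boundary.

```clojure
(require '[clojure.string :as str])

;; Approximation only, inferred from the examples in this issue.
;; Split points: runs of _ - or whitespace, lower->upper boundaries,
;; and letter->digit boundaries. A digit->letter boundary is NOT a split point.
(def approx-separator
  #"[_\-\s]+|(?<=[a-z])(?=[A-Z])|(?<=[A-Za-z])(?=[0-9])")

(defn approx-kebab
  "Hypothetical helper: splits s with approx-separator and joins the
  lower-cased words with hyphens."
  [s]
  (->> (str/split s approx-separator)
       (remove empty?)
       (map str/lower-case)
       (str/join "-")))

(approx-kebab "user_id")   ;; => "user-id"
(approx-kebab "user1_124") ;; => "user-1-124"
(approx-kebab "object1a2") ;; => "object-1a-2"
```

Under that assumed rule, "1a" stays together because nothing splits between a digit and the letter after it, while "a2" is split because a letter is followed by a digit.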

Thoughts? What was the rationale behind this?

This looks related to #22.

jhereth commented 6 years ago

Looking at the discussion in #22 I understand that there is no universal solution to how to process digits. Even worse when not having separators as in @ricardojmendez's example.

Is there any way to specify that numbers should be treated as part of the preceding word rather than as a word of their own? E.g. instead of

user=> (csk/->snake_case_keyword "FooSP1")
:foo_sp_1

I'd like to have :foo_sp1.

qerub commented 6 years ago

@ricardojmendez:

> Thoughts? What was the rationale behind this?

Hmm… The behavior was set 3+ years ago and I don't remember the details, but judging from the tests in string_separator_test.cljc I wanted a string like "Adler32" to be split into ["Adler", "32"] which still makes sense.

Regardless of whether this behavior is good or not, it's too late to break it as you say. The only sustainable way to resolve this issue (and others like it) is to extend the API to allow specifying the input case. This will be implemented in response to #38 and I am hoping to have this ready during my winter/Xmas vacation.

qerub commented 6 years ago

@daten-kieker:

The best workaround at the moment is to specify your own word separator pattern, e.g.:

(->snake_case_keyword "FooSP1" :separator #"(?<![A-Z])(?=[A-Z])")
; => :foo_sp1

https://www.regular-expressions.info/lookaround.html has some good docs describing the negative lookbehind and positive lookahead used here.
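To see what that pattern does on its own, it can be tried with clojure.string/split, without going through camel-snake-kebab at all. This is only a sketch: snake-words is a hypothetical helper, and it assumes that splitting on the pattern and lower-casing the pieces is a fair stand-in for what the library does with a custom separator.

```clojure
(require '[clojure.string :as str])

;; Zero-width split point: the next char is uppercase (positive lookahead)
;; and the previous char is not uppercase (negative lookbehind).
(def sep #"(?<![A-Z])(?=[A-Z])")

(defn snake-words
  "Hypothetical helper: splits s at sep, drops any empty pieces,
  lower-cases the rest and joins them with underscores."
  [s]
  (->> (str/split s sep)
       (remove empty?)
       (map str/lower-case)
       (str/join "_")))

(snake-words "FooSP1")      ;; => "foo_sp1"
(snake-words "HTTPRequest") ;; => "httprequest"
```

Because the pattern never splits before a digit, "SP1" stays in one piece. The second call shows the trade-off: a run of capitals is never split from the word that follows it, so this separator suits inputs like "FooSP1" better than inputs with embedded acronyms.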

jhereth commented 6 years ago

@qerub Thanks a lot - this not only solved my issue but also taught me something new about regexps.