Closed popod closed 3 years ago
slug(' test !!!', { strict: true }).replace(/^-|-$/g, '')
will strip leading and trailing dashes, but I imagine that's not the answer you were looking for.
Removing any leading or trailing dash is the default behavior in the slug
module, so one option would be to switch to that module. (Full disclosure: I maintain that module.)
If there's interest in that being the default behavior for slugify, I'm happy to send a pull request. Would probably have to be a breaking change though, just to be super cautious about not breaking anyone's code unintentionally.
I think this should be the default behavior and I'd imagine people having this issue already trimming the slug themselves. I'd like to have that behavior by default here, but I don't see it as a breaking change necessarily.
Faced this issue today. Any update on when a new release coming with this fix?
Faced this issue today. Any update on when a new release coming with this fix?
I just published slugify@1.5.2 with trimming of leading and trailing replacement characters as the default behavior.
@Trott Hi and thank you for your replies and work !
What is the difference between slugify
and your slug
module (features, compatibility, performances, ...) ?
@Trott Hi and thank you for your replies and work !
What is the difference between
slugify
and yourslug
module (features, compatibility, performances, ...) ?
@popod They are very similar, having nearly identical APIs and having both evolved from the same code base. Both work in the browser and in Node.js with no dependencies.
I haven't tested/benchmarked, but I suspect slugify
would have a smaller code size (more important for the browser than for the server).
If legacy browser support is important to you, note that slug
works in the IE11 browser and I think slugify
does not.
These two things are related. The code size for slug
is a bit larger because it has polyfills for IE11.
The defaults for slug
are to return lowercase slugs that consist only of the 26 letters in the English alphabet [a-z]
. So it's more aggressive about stripping out punctuation and special characters.
slugify
has replacements for unusual characters like ♥
. By default, slug
strips that sort of thing out and replaces only alphabetic characters. (As with slugify
, you can add custom characters yourself. They just don't come out of the box with slug
.)
slugify('I ♥ Unicode!'); // 'I-love-Unicode!'
slug('I ♥ Unicode!'); // 'i-unicode'
slug('I ♥ Unicode!', { charmap: { '♥': 'heart' }}); // 'i-heart-unicode'
One edge case difference is that if slug
can't replace any of the characters in the string, it returns a slug that is the Base 64 encoding of the string. slugify
instead returns an empty string in that case.
slugify('☢ ☢ ☢'); // ''
slug('☢ ☢ ☢'); // '4piiiokyoidimki'
This feature is useful if you are writing something that might need to generate slugs in words for alphabets you may not predict in advance. For example, let's take a Kanji character, 日
:
slugify('日'); // ''
slug('日'); // '5pel'
It's possible that there are a few alphabetic characters not supported in one or the other of the modules, but the charmaps and locales should be largely equivalent (with the exception again of non-alphabetic characters like ♥ which slug
intentionally omits). They also have mostly or entirely the same locale options.
As for performance, I don't actually know. I'm going to put together a benchmark right now and see!
I wrote a quick benchmark, and it indicates that slugify
is significantly faster than slug
. I'm going to dig in and try to find out why. Maybe my benchmark sucks. Maybe slug
has some areas where it can be optimized. Maybe slug
sacrifices some perf enhancements in the name of IE 11 compatibility. Maybe slugify
gains performance by not worrying about edge cases that perhaps never actually come up anyway. And a million other possibilites.)
Most likely explanation is slug
can use some perf work.
Second most likely explanation is my benchmark sucks.
Here's my benchmark. Probably a good idea to write one that uses real world data from your own app and see if you get similar results or not.
const slug = require('slug')
const slugify = require('slugify')
let slugLib
let label
if (process.argv[2] === 'slug') {
slugLib = slug
label = 'slug'
} else {
slugLib = slugify
label = 'slugify'
}
const BENCHMARK_RUNS = 2048
const MAX_WORD_LENGTH = 16
const MAX_WORD_COUNT = 4
function random (max) {
return Math.floor(Math.random() * max + 1)
}
const chars = Object.keys(slug.charmap)
function getString () {
const wordCount = random(MAX_WORD_COUNT)
const wordLengths = Array.from({ length: wordCount }, () => random(MAX_WORD_LENGTH))
const words = wordLengths.map((wordLength) => Array.from({ length: wordLength }, () => chars[random(chars.length) - 1]).join(''))
return words.join(' ')
}
const strings = []
for (let i = 0; i < BENCHMARK_RUNS; i++) {
strings.push(getString())
}
console.time(label)
for (let i = 0; i < strings.length; i++) {
slugLib(strings[i])
}
console.timeEnd(label)
And, for the sake of science, here are some sample runs on my laptop:
$ node benchmark.js slug
slug: 65.195ms
$ node benchmark.js slug
slug: 66.769ms
$ node benchmark.js slug
slug: 67.963ms
$ node benchmark.js slugify
slugify: 25.289ms
$ node benchmark.js slugify
slugify: 24.638ms
$ node benchmark.js slugify
slugify: 25.17ms
$
Those benchmark results are with previous versions of slug
(4.04) and slugify
(1.5.0). Here are the results with current slug
(5.0.0) and current slugify
(1.5.2). Nothing has really changed:
$ node benchmark.js slug
slug: 68.873ms
$ node benchmark.js slug
slug: 64.661ms
$ node benchmark.js slug
slug: 68.128ms
$ node benchmark.js slugify
slugify: 22.94ms
$ node benchmark.js slugify
slugify: 24.832ms
$ node benchmark.js slugify
slugify: 24.323ms
$
Another difference that I am only now noticing: slug
supports several alphabets (including a few RTL ones) that slugify
does not (Arabic, Hebrew, Farsi, Hindi, possibly others). I'll probably send pull requests to add those alphabets to slugify
at some point, but first to find the source of the performance difference. (Hope it's not having those additional alphabets!)
The performance difference comes down to support for characters requiring more than one code point to represent. This is necessary for Hindi (and other languages like Sanskrit that use the Devanagari alphabet) and Hebrew. It's also necessary for emoji.
> slug.extend({'👍': 'thumbs up!' })
undefined
> slug('Hi! 👍')
'hi-thumbs-up!'
> slugify.extend({'👍': 'thumbs up!' })
undefined
> slugify('Hi! 👍')
'Hi!'
>
My thinking right now is to introduce either an option or some kind of detection in slug()
so it can bypass the slow multi-code-point processing for users that value performance over completeness. If that works well and if there's interest in Devanagari/Hebrew/emoji support in slugify
, I can look at porting that over.
I think the best approach to have this is by introducing a new option, at least here in slugify.
leading and trailing replacement only works if you're passing the replacement at start or end. but when you get the final result value and get a replacement at the start/end, then it doesn't remove those leading and trailing replacements.
Take this string as an example:
" Showcase Your Figma Designs on WordPress P2 "
Expected: Showcase-Your-Figma-Designs-on-WordPress-P2
Output: -Showcase-Your-Figma-Designs-on-WordPress-P2-
In my suggest we must use another regex replace to remove leading and trailing replacements from final results.
/^-+|-+$/
/^' + replacement + '+|' + replacement + '+$/
Something like this. What do you think? @Trott
leading and trailing replacement only works if you're passing the replacement at start or end. but when you get the final result value and get a replacement at the start/end, then it doesn't remove those leading and trailing replacements.
Take this string as an example:
" Showcase Your Figma Designs on WordPress P2 "
Expected:
Showcase-Your-Figma-Designs-on-WordPress-P2
Output:-Showcase-Your-Figma-Designs-on-WordPress-P2-
@vijayhardaha What version of slugify
are you using? The relevant fix was introduced in 1.5.2, and I'm not seeing the behavior you are describing.
$ npm ls slugify
temp@1.0.0 /Users/trott/temp
└── slugify@1.5.2
$ node
Welcome to Node.js v16.1.0.
Type ".help" for more information.
> require('slugify')(" Showcase Your Figma Designs on WordPress P2 ")
'Showcase-Your-Figma-Designs-on-WordPress-P2'
>
@Trott when the strict
option is enabled, trailing dash are not correctly removed.
slugify('Hi !', { strict: true })
// -> "Hi-"
@Trott hey I am using the latest version. To test my case you need to start and end the string with special characters.
so replace this require('slugify')(" Showcase Your Figma Designs on WordPress P2 ")
with this require('slugify')('" Showcase Your Figma Designs on WordPress P2 "')
use double quotes within the text and enable strict option;
This is happening when we use special char at the start/end with whitespace
.replace(/\s+/g, replacement)
via this line whitespace replaced via replacement so here we how this output
"-Showcase-Your-Figma-Designs-on-WordPress-P2-"
.replace(new RegExp('[^a-zA-Z0-9' + replacement + ']', 'g'), '')
via this line special chars get removed so now output is
-Showcase-Your-Figma-Designs-on-WordPress-P2-
.replace(new RegExp('[\\s' + replacement + ']+', 'g'), replacement)
then this line doesn't do anything with replacement at start/end, it just look for white spaces.@vijayhardaha slugify('" Showcase Your Figma Designs on WordPress P2 "')
results in '"-Showcase-Your-Figma-Designs-on-WordPress-P2-"'
. That seems correct to me. What value are you expecting instead? What command/settings are you using to get those results?
@vijayhardaha Ah! I see! You say you are running with the strict option. Now I understand. Fix should be coming shortly.
Proposed fix is in https://github.com/simov/slugify/pull/116 if anyone wants to test it and confirm it fixes the issue for them.
@vijayhardaha Your suggestion indeed would fix the bug, but it also involves a dynamically generated regular expression using replacement
. I know two of those already exist in the code base, but it had been my intention to get rid of them because they do not work as expected in edge case situations like replacement
having a value of '.'
. In the fix in the pull request linked above, the existing dynamically created regular expressions are removed.
Hey @Trott I ran several tests, Proposed fix in #116 solves the problem.
slugify 1.5.3 published with the fix. Thanks for reporting the issue with strict
mode!
Can removing trailing replacement characters be optional? This was a breaking change for us as we use slugify
for automatically validating and cleaning up urls being typed into a text field.
Previously, a user could type in a-url-with-replacement-characters
one character at a time in a text field and get back a-url-with-replacement-characters
.
However, since the "-"
characters will automatically get cleared when typing now (aka, when it's trailing like a-url-
), it comes out to aurlwithspaces
.
Can removing trailing replacement characters be optional? This was a breaking change for us as we use
slugify
for automatically validating and cleaning up urls being typed into a text field. Previously, a user could type ina-url-with-replacement-characters
one character at a time in a text field and get backa-url-with-replacement-characters
. However, since the"-"
characters will automatically get cleared when typing now (aka, when it's trailing likea-url-
), it comes out toaurlwithspaces
.
As a temporary (or maybe even permanent) workaround, you could add a letter to the start and end of the string and then strip it off the result:
slugify('a' + str + 'a').replace(/^a|a$/g, '')
Not the most elegant solution, but it will work.
@davidlukerice If all goes well, I'm about to add a "trim" option to the slug
module so you could use slugify(str, { trim: false })
. See https://github.com/Trott/slug/pull/230. If that looks good to you, I'll do the same thing in slugify
(barring concerns from @simov).
Configurable "trim" option (but still on by default) in slug@5.1.0
. Will open a PR for slugify
later (unless someone beats me to it, which is fine).
Just wanted to mention that it would've been nice to have this as a semver major. I recently discovered that it broke my code without me knowing. My situation is that I slugify my usernames so that user name
is considered the same name as user-name
for example. However, some users have names like -username-
(even though I don't allow it anymore now) and now they can no longer login because the slugged usernames in the database don't match.
I can fix it for some of them by simply changing the slug in my database manually, but this doesn't work when I have both username
and -username-
as separate users. The problem is limited to about 20 users on a total of 30k, so nothing too major, but I just wanted to share my experience.
Just wanted to mention that it would've been nice to have this as a semver major. I recently discovered that it broke my code without me knowing. My situation is that I slugify my usernames so that
user name
is considered the same name asuser-name
for example. However, some users have names like-username-
(even though I don't allow it anymore now) and now they can no longer login because the slugged usernames in the database don't match.I can fix it for some of them by simply changing the slug in my database manually, but this doesn't work when I have both
username
and-username-
as separate users. The problem is limited to about 20 users on a total of 30k, so nothing too major, but I just wanted to share my experience.
It doesn't take away from the breaking-change aspect, but if it helps, you can also use slugify(str, {trim: false})
to get the old behavior and not change anything in your database.
Thanks, that's what I used to solve the problem: I've put a guard in front which checks if a username starts or ends with a -
. If so I use the legacy slugify(str, {trim:false})
and check if the outcome matches the list of "affected users". If not, I run them through the slugify(str)
method as normal. Kind of hacky, but fixed it. Hopefully it can be helpful for others who had a similar issue.
With
Is it possible to allow to remove this latest dash ?