faker-js / faker

Generate massive amounts of fake data in the browser and node.js
https://fakerjs.dev

Korean faker giving invalid urls #3251

Open olafurw opened 20 hours ago

olafurw commented 20 hours ago

Describe the bug

When using fakerKO, calling internet.url() returns invalid URLs such as https://--.com

Minimal reproduction code

https://stackblitz.com/edit/faker-js-demo-i7vsqq?file=index.ts

Additional Context

No response

Environment Info

  System:
    OS: Linux 5.0 undefined
    CPU: (6) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
    Memory: 0 Bytes / 0 Bytes
    Shell: 1.0 - /bin/jsh
  Binaries:
    Node: 18.20.3 - /usr/local/bin/node
    Yarn: 1.22.19 - /usr/local/bin/yarn
    npm: 10.2.3 - /usr/local/bin/npm
    pnpm: 8.15.6 - /usr/local/bin/pnpm
  npmPackages:
    @faker-js/faker: ^9.2.0 => 9.2.0

Used Package Manager

npm

xDivisionByZerox commented 20 hours ago

I've confirmed this bug.

The problem is in helpers.slugify (call chain: internet.url => internet.domainName => internet.domainWord => helpers.slugify), which strips Korean characters from the final result:

https://github.com/faker-js/faker/blob/69173a36ed2854712cc27ddceed9a92bacf48336/src/modules/helpers/index.ts#L218-L224
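
To illustrate the failure mode, here is a minimal sketch of a slugify-style function. It is an assumption about the general technique (Unicode normalization followed by an ASCII-only filter), not a copy of the linked source, and the name slugifySketch is hypothetical. Because \w without the u flag matches only ASCII word characters, every Hangul or Chinese character is removed and only the joining hyphen survives.

```typescript
// Hypothetical stand-in for a slugify helper; an illustration of the
// technique, not faker's actual implementation.
function slugifySketch(input: string): string {
  return input
    .normalize('NFKD') // decompose accented letters (and Hangul syllables)
    .replace(/[\u0300-\u036f]/g, '') // drop combining diacritical marks
    .replace(/ /g, '-') // spaces become hyphens
    .replace(/[^\w.-]+/g, ''); // drop everything except ASCII word chars, dots, hyphens
}

console.log(slugifySketch('velvety surface')); // velvety-surface
console.log(slugifySketch('lâche-membre')); // lache-membre
console.log(slugifySketch('행복한-고양이')); // -
console.log(slugifySketch('中文-sideboard')); // -sideboard
```

Latin-script input survives because its accents decompose into removable combining marks, while scripts with no ASCII base letters are stripped entirely.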

matthewmayer commented 12 hours ago

This affects fa, ko and zh_CN (zh_CN has Chinese adjectives but not nouns, so only the adjective part is lost). Generated domain words per locale:

af_ZA avaricious-quinoa
ar long-term-bug
az excellent-language
cs_CZ handy-wombat
da usdvanlig-mulighed
de problemlos-alkalimetall
de_AT einzig-schwimmen
de_CH unternehmungslustig-erfinder
dv clumsy-summer
el unsung-overheard
en velvety-surface
en_AU major-sunbeam
en_AU_ocker slimy-bowler
en_BORK cooperative-giant
en_CA burly-intellect
en_GB rusty-tuber
en_GH male-someplace
en_HK simple-effector
en_IE ecstatic-saloon
en_IN empty-harp
en_NG stale-replacement
en_US extroverted-publication
en_ZA miserable-precedent
eo flashy-decongestant
es scaly-phrase
es_MX glorious-festival
fa -
fi stormy-bakeware
fr super-camarade
fr_BE lache-membre-de-lequipe
fr_CA sedentaire-patientele
fr_CH sincere-collegue
fr_LU calme-membre-titulaire
fr_SN espiegle-antagoniste
he regular-giggle
hr rectangular-violin
hu alnok-szittyopazsit
hy nervous-disappointment
id_ID standard-digestive
it polite-role
ja rectangular-lawmaker
ka_GE plain-hammock
ko -
lv shy-meadow
mk only-developing
nb_NO ujevn-mom
ne robust-fireplace
nl clear-cinema
nl_BE strict-thread
pl warped-reservation
pt_BR strange-drug
pt_PT jaunty-procurement
ro disloyal-stall
ro_MD those-ecliptic
ru grim-status
sk oddball-kettledrum
sr_RS_latin fantastic-premier
sv everlasting-fort
th indolent-knitting
tr lumpy-fisherman
uk content-amendment
ur fond-hyphenation
uz_UZ_latin warmhearted-couch
vi grown-overheard
yo_NG taut-scenario
zh_CN -sideboard
zh_TW pure-help
zu_ZA eminent-cd
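
The affected locales in the list above can be picked out mechanically. The sketch below assumes the usual hostname label rule (1 to 63 ASCII letters, digits, or hyphens, with no hyphen at either end); the helper name isValidHostLabel and the samples object are hypothetical, with the sample values copied from the list.

```typescript
// Hostname label check in the spirit of RFC 1123 (an assumption about
// what "valid" should mean here): 1-63 ASCII letters, digits, or hyphens,
// with no hyphen in first or last position.
function isValidHostLabel(label: string): boolean {
  return /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/i.test(label);
}

// A few entries taken from the per-locale output above.
const samples: Record<string, string> = {
  en: 'velvety-surface',
  fa: '-',
  ko: '-',
  zh_CN: '-sideboard',
};

// Collect the locales whose generated domain word is not a valid label.
const broken = Object.entries(samples)
  .filter(([, word]) => !isValidHostLabel(word))
  .map(([locale]) => locale);

console.log(broken.join(', ')); // fa, ko, zh_CN
```

A check like this could serve as a regression test over all locales once a fix lands, though whether the project wants that is for the maintainers to decide.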