faker-js / faker

Generate massive amounts of fake data in the browser and node.js
https://fakerjs.dev

Add method to return random languages (ISO-639 data) #2622

Open jeremyhofer opened 9 months ago

jeremyhofer commented 9 months ago

Clear and concise description of the problem

I am currently working on a project where I would like to return a random ISO-639 code in mock data. It would be great for faker to support this internally, similar to location.countryCode.

Suggested solution

I'm not sure which module it may fit in well today, or if a new module may make the most sense, but my proposal would be to add a language or languageCode method within faker. The method would take as input a variant based on the ISO-639 standard, defaulting to ISO-639-1 (2 character) codes when called without parameters.

Link to the standard: https://www.loc.gov/standards/iso639-2/php/code_list.php

The variants I propose to be implemented are:

  1. 639-1 - to return 639-1 2 character codes
  2. 639-2 - to return 639-2 3 character codes

The additional sets - 639-3, 639-4, 639-5 - may also be implemented. Similarly, the English, French, and German language names as defined in the standard could be added as variants.

As an alternative to the specific sets, an alpha-2 and alpha-3 variant approach could be taken, where alpha-2 would be the 639-1 code set and alpha-3 would be the 639-2 set.
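
To make the proposal concrete, here is a minimal sketch of a method that takes a variant and defaults to ISO-639-1. Everything in it is hypothetical: `languageCode`, `LanguageCodeVariant`, `LANGUAGE_CODES`, and the injectable `random` parameter are illustrative names, not existing faker API, and the four-entry data set is only a sample.

```typescript
// Hypothetical sketch only: none of these names exist in faker today.
type LanguageCodeVariant = '639-1' | '639-2';

// Tiny illustrative sample; a real data set would come from the ISO-639 code list.
// The three-character entries are the 639-2/T (terminology) codes.
const LANGUAGE_CODES: Record<LanguageCodeVariant, readonly string[]> = {
  '639-1': ['en', 'de', 'fr', 'es'],
  '639-2': ['eng', 'deu', 'fra', 'spa'],
};

// Returns a random code and defaults to two-character ISO-639-1 codes.
// `random` is injectable so the sketch can be exercised deterministically.
function languageCode(
  variant: LanguageCodeVariant = '639-1',
  random: () => number = Math.random
): string {
  const codes = LANGUAGE_CODES[variant];
  return codes[Math.floor(random() * codes.length)];
}
```

Called without arguments this returns one of the two-character codes; `languageCode('639-2')` returns one of the three-character codes.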

Alternative

No response

Additional context

No response

github-actions[bot] commented 9 months ago

Thank you for your feature proposal.

We marked it as "waiting for user interest" for now to gather some feedback from our community:

matthewmayer commented 9 months ago

Compare #1548 (requesting full language names).

ST-DDT commented 9 months ago

I would probably put it in the location module.

Also we should probably close #1548 (as superseded by this), as this has a better description.

matthewmayer commented 9 months ago

The codes like "en" are not locale dependent, so they should be in fakerbase.

The language names are locale dependent, however. So it would probably be two methods?

faker.location.languageCode() and faker.location.language()

ST-DDT commented 9 months ago

I thought about using something like [{ name: 'English', alpha2: 'EN', alpha3: 'ENG' }, { name: 'German', alpha2: 'DE', alpha3: 'DEU' }].

As for the base, I thought about using the language's own name as the name there, but I'm not sure about that. [{ name: 'English', alpha2: 'EN', alpha3: 'ENG' }, { name: 'Deutsch', alpha2: 'DE', alpha3: 'DEU' }]

Reason being, that some language selection dropdowns show the name, but internally use the code in their data.
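
The dropdown use case can be sketched as follows, assuming the record shape from the comment above. `LanguageEntry`, `LANGUAGES`, `pick`, and `languageOption` are hypothetical names for illustration, and the codes are written in lowercase here, matching the "en" example earlier in the thread.

```typescript
// Hypothetical shape, following the record sketched in the comment above.
interface LanguageEntry {
  name: string; // display name, e.g. for a dropdown label
  alpha2: string; // two-character code
  alpha3: string; // three-character code
}

// Two sample entries only; not a complete data set.
const LANGUAGES: readonly LanguageEntry[] = [
  { name: 'English', alpha2: 'en', alpha3: 'eng' },
  { name: 'German', alpha2: 'de', alpha3: 'deu' },
];

// Picks one element; `random` is injectable for deterministic checks.
function pick<T>(items: readonly T[], random: () => number = Math.random): T {
  return items[Math.floor(random() * items.length)];
}

// One record feeds both the visible label and the internal value.
function languageOption(random?: () => number): { label: string; value: string } {
  const entry = pick(LANGUAGES, random);
  return { label: entry.name, value: entry.alpha2 };
}
```

Keeping name and codes in one record means the label and the value always belong to the same language, which two independently random calls would not guarantee.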

matthewmayer commented 9 months ago

I'd still make it two methods, for consistency with .country() and .countryCode().

Anyway, I agree with closing #1548 to keep discussion and upvotes in one place.

ST-DDT commented 9 months ago

I added this to our meeting notes for discussion:

matthewmayer commented 9 months ago

Note that only a small number of languages have a 2 character code. Many more have a three character code.
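
One way to account for this, sketched under the assumption that the data is stored as records with an optional two-character code: the two-character variant simply skips entries that lack one. `LanguageRecord`, `RECORDS`, and `codesFor` are hypothetical names, and the two entries are only a sample (Hawaiian has the three-character code "haw" but no two-character code).

```typescript
// Hypothetical record shape: every language has a three-character code,
// but only some have a two-character one.
interface LanguageRecord {
  name: string;
  alpha3: string;
  alpha2?: string; // absent when no two-character code exists
}

const RECORDS: readonly LanguageRecord[] = [
  { name: 'English', alpha2: 'en', alpha3: 'eng' },
  { name: 'Hawaiian', alpha3: 'haw' },
];

// Lists the codes available for a variant, dropping records without an alpha-2 code.
function codesFor(variant: 'alpha-2' | 'alpha-3'): string[] {
  if (variant === 'alpha-3') {
    return RECORDS.map((r) => r.alpha3);
  }
  return RECORDS.map((r) => r.alpha2).filter(
    (code): code is string => code !== undefined
  );
}
```

With this layout the two-character variant draws from a smaller pool than the three-character one, which matches the observation above.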

ST-DDT commented 9 months ago

Preliminary Team Decision