faker-js / faker

Generate massive amounts of fake data in the browser and node.js
https://fakerjs.dev

Weight genders in `faker.person.gender()` #1730

Open matthewmayer opened 1 year ago

matthewmayer commented 1 year ago

Clear and concise description of the problem

faker.person.gender() pulls from a list of genders at https://github.com/faker-js/faker/blob/next/src/locales/en/person/gender.ts

This is a very inclusive list. However, I feel that returning all values with equal probability makes the function less realistic. In reality, even in fairly LGBTQ-friendly countries, the percentage of people who identify as non-cisgender is fairly low: https://www.bbc.com/news/uk-64184736

Suggested solution

Now that we have faker.helpers.weightedArrayElement, I think we could change this so that, for example, it returns "Man" 45% of the time, "Woman" 45% of the time, and one of the other genders 10% of the time (divided equally among the other 73 options).
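To make the proposed split concrete, here is a minimal, self-contained sketch. It does not use Faker itself: `buildGenderWeights` and `pickWeighted` are hypothetical names, and the short gender list in the usage lines is a stand-in for the locale data. The `{ weight, value }` entry shape mirrors what `faker.helpers.weightedArrayElement` accepts, so in practice that helper could replace `pickWeighted`.

```typescript
// Sketch only: hypothetical helpers, not part of Faker.
type Weighted<T> = { weight: number; value: T };

// Gives "Man" and "Woman" 45 each and splits the remaining 10 equally
// among every other entry in the supplied list.
function buildGenderWeights(genders: readonly string[]): Weighted<string>[] {
  const binary = ["Man", "Woman"];
  const others = genders.filter((g) => !binary.includes(g));
  const otherWeight = others.length > 0 ? 10 / others.length : 0;
  return [
    { weight: 45, value: "Man" },
    { weight: 45, value: "Woman" },
    ...others.map((value) => ({ weight: otherWeight, value })),
  ];
}

// Plain weighted selection; `random` is injectable so the result can be
// made deterministic in tests. It must return a number in [0, 1).
function pickWeighted<T>(
  entries: readonly Weighted<T>[],
  random: () => number = Math.random,
): T {
  if (entries.length === 0) {
    throw new Error("Cannot pick from an empty list");
  }
  const total = entries.reduce((sum, e) => sum + e.weight, 0);
  let threshold = random() * total;
  for (const entry of entries) {
    threshold -= entry.weight;
    if (threshold < 0) {
      return entry.value;
    }
  }
  // Fallback for floating-point rounding at the upper edge.
  return entries[entries.length - 1].value;
}

// Usage with a stand-in list (the real locale list is much longer):
const weights = buildGenderWeights(["Man", "Woman", "Agender", "Bigender"]);
const gender = pickWeighted(weights);
```

Whether the weights would live in the locale data or in the method is a separate design question; this sketch computes them in code.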

Alternative

- Keep current behavior (unrealistic)
- Only return Man or Woman (non-inclusive)

Additional context

No response

Shinigami92 commented 1 year ago

-> Keep current behavior (unrealistic)

When someone calls the method, they just want some random data, e.g. for testing an input such as a free-text gender field on Facebook or any other platform.

It is not meant to represent realistic data, and IMO Faker is not meant to be a real-world database for everything.

Adding weights to everything in Faker is a non-goal for me.

matthewmayer commented 1 year ago

I guess it depends what you mean by "but realistic" in the tagline:

Generate massive amounts of fake (but realistic) data for testing and development.

I think in some cases, like the name patterns, adding weights makes the data more realistic.

ST-DDT commented 1 year ago

I agree that we should somewhat balance the genders. However, I would keep it simple and just do 90% any binary gender and 10% non-binary/any gender, hardcoded in the method and not in the data.

ST-DDT commented 1 year ago

Workaround:

faker.datatype.boolean({ probability }) ? faker.person.sex() : faker.person.gender()

If you are interested in this feature, please upvote it.
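The workaround above leaves `probability` undefined. A minimal sketch of it as a reusable helper follows, assuming the 90% figure suggested earlier in the thread as the default. `weightedGender` and `FakerLike` are hypothetical names; `FakerLike` describes only the three calls the workaround uses, so the helper can be tried with a stub, and the real `faker` instance is expected (not verified here) to satisfy it.

```typescript
// Sketch only: the minimal surface of Faker that the workaround relies on.
interface FakerLike {
  datatype: { boolean(options: { probability: number }): boolean };
  person: { sex(): string; gender(): string };
}

// Returns a binary sex value with probability `binaryProbability`,
// otherwise any entry from the full gender list.
function weightedGender(faker: FakerLike, binaryProbability = 0.9): string {
  return faker.datatype.boolean({ probability: binaryProbability })
    ? faker.person.sex()
    : faker.person.gender();
}

// Assumed usage with the real library:
//   import { faker } from "@faker-js/faker";
//   const value = weightedGender(faker);
```

Note that `faker.person.sex()` and `faker.person.gender()` draw from different lists, so the two branches may return differently styled values (for example "female" versus "Woman").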

github-actions[bot] commented 1 year ago

Thank you for your feature proposal.

We marked it as "waiting for user interest" for now to gather some feedback from our community: