Open maxime4000 opened 1 year ago
Do you refer to something like this?
function exponentialDistributionNumber(start = 1, stepScale = 2, stepProbability = 0.5, limit = Number.MAX_SAFE_INTEGER) {
  let max = start;
  while (faker.datatype.boolean(stepProbability) && max < limit) {
    max *= stepScale;
  }
  return faker.number.int({ min: 0, max: Math.min(max, limit) });
}
exponentialDistributionNumber(1, 2, 0.5, 100)
Would something like this suffice or do you need more/something else?
Interesting! Yes, something like this would suffice. It would be nice if it were implemented as an API function.
Something similar could also be achieved with a variant of faker.helpers.arrayElement where each element of the array has a fixed, independent probability of being included in the return values.
Like helpers.weightedArrayElement? Well, not really, but close when used for the length.
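To make the "independent probability per element" idea concrete, here is a minimal sketch. The helper name `independentSubset` and its signature are hypothetical and not part of the faker API, and `Math.random` is only a stand-in for faker's seeded generator.

```typescript
// Hypothetical helper, not part of the faker API.
// `rng` must return a float in [0, 1); Math.random stands in for faker's generator.
function independentSubset<T>(
  items: readonly T[],
  probability: number,
  rng: () => number = Math.random
): T[] {
  if (probability < 0 || probability > 1) {
    throw new RangeError('probability must be between 0 and 1');
  }
  // Each element is kept or dropped on its own coin flip.
  return items.filter(() => rng() < probability);
}

const tags = ['a', 'b', 'c', 'd', 'e'];
console.log(independentSubset(tags, 0.2));
```

With this approach the result length follows a binomial distribution (on average `items.length * probability` elements), so it skews toward short arrays for small probabilities but is not an exponential distribution.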
Team decision
There is an existing workaround for this problem. We are currently unsure about implementation details regarding the distribution.
If you want/need this feature please upvote this issue.
Thank you for your feature proposal.
We marked it as "waiting for user interest" for now to gather some feedback from our community:
If you would like to see this feature be implemented, please react to the description with an up-vote (:+1:).
If you have a suggestion or want to point out some special cases that need to be considered, please leave a comment, so we are aware of them.
We would also like to hear about other community members' use cases for the feature to give us a better understanding of their potential implicit or explicit requirements.
We will start the implementation based on:
the number of votes (:+1:) and comments
the relevance for the ecosystem
availability of alternatives and workarounds
and the complexity of the requested feature
We do this because:
There are plenty of languages/countries out there and we would like to ensure that every method can cover all or almost all of them.
Every feature we add to faker has "costs" associated to it:
Here is an improved version of the function:
import { faker, FakerError } from '@faker-js/faker';

/**
 * Generates a random number between min and max using an exponential distribution.
 * The lower bound is inclusive, but the upper bound is exclusive.
 *
 * @param options The options for generating the number.
 * @param options.min The minimum value to generate. Defaults to `0`.
 * @param options.max The maximum value to generate. Defaults to `1`.
 * @param options.bias The bias of the distribution. Must be greater than 0. Defaults to `1`.
 * The lower the bias, the more likely the number will be closer to the min (0-1@0.1 -> avg: ~0.025).
 * A bias of 1 will generate the default exponential distribution (0-1@1 -> avg: ~0.202).
 * The higher the bias, the more likely the number will be closer to the max (0-1@10 -> avg: ~0.691).
 *
 * @throws If bias is less than or equal to 0.
 * @throws If max is less than min.
 */
function exponentialDistributionNumber(
  options:
    | number
    | {
        /**
         * The minimum value to generate.
         *
         * @default 0
         */
        min?: number;
        /**
         * The maximum value to generate.
         *
         * @default 1
         */
        max?: number;
        /**
         * The bias of the distribution. Must be greater than 0.
         *
         * The lower the bias, the more likely the number will be closer to the min (0-1@0.1 -> avg ~0.025).
         * A bias of 1 will generate the default exponential distribution (0-1@1 -> avg ~0.202).
         * The higher the bias, the more likely the number will be closer to the max (0-1@10 -> avg ~0.691).
         *
         * @default 1
         */
        bias?: number;
      } = {} // default so that calling without arguments uses the documented defaults
) {
  // A bare number is shorthand for { max: number }.
  if (typeof options === 'number') {
    options = { max: options };
  }
  const { min = 0, max = 1, bias = 1 } = options;
  if (bias <= 0) {
    throw new FakerError('Bias must be greater than 0');
  }
  if (max === min) {
    return min;
  }
  if (max < min) {
    throw new FakerError(`Max ${max} should be greater than min ${min}.`);
  }
  const random = faker.number.float(); // [0,1)
  const exponent = random ** (1 / bias); // [0,1)
  const range = max - min + 1; // +1 to account for x ** 0 = 1
  return min + range ** exponent - 1; // -1 to account for x ** 0 = 1
}
Generating 100kk values between 0-100:
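A run like this can be reproduced with a small harness. The sketch below restates the transform without the faker dependency (using `Math.random` in place of `faker.number.float()`, which is an assumption made only to keep it self-contained) and counts samples per bucket. The names `skewedNumber` and `sampleHistogram` are hypothetical, and the sample count is reduced to keep the run short.

```typescript
// Stand-alone restatement of the transform, without the faker dependency.
// `rng` must return a float in [0, 1); Math.random stands in for faker.number.float().
function skewedNumber(
  min: number,
  max: number,
  bias = 1,
  rng: () => number = Math.random
): number {
  const exponent = rng() ** (1 / bias); // [0,1)
  const range = max - min + 1;
  return min + range ** exponent - 1;
}

// Hypothetical harness: counts samples per equal-width bucket and tracks the mean.
function sampleHistogram(
  samples: number,
  min: number,
  max: number,
  bias: number,
  buckets = 10
): { counts: number[]; mean: number } {
  const counts = new Array<number>(buckets).fill(0);
  let sum = 0;
  for (let i = 0; i < samples; i++) {
    const value = skewedNumber(min, max, bias);
    sum += value;
    // Clamp so that a value rounded up to `max` still lands in the last bucket.
    const index = Math.min(
      buckets - 1,
      Math.floor(((value - min) / (max - min)) * buckets)
    );
    counts[index]++;
  }
  return { counts, mean: sum / samples };
}

const result = sampleHistogram(200000, 0, 100, 1);
console.log(result.counts, result.mean);
```

Changing the `bias` argument and comparing the bucket counts gives a quick feel for how strongly the values lean toward `min` or `max`.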
Clear and concise description of the problem
So I'm seeding a database with faker. I have a field that allows an array of some type. I want to generate multiple arrays of different sizes: some where the array is empty, some where it has one element, and some where it has multiple elements.
Most cases will have one element in the array, but I also want to test edge cases, so having a way to generate randomly distributed data would be nice.
Let's say that I'm faking an array of values and I want some lengths to be more common than others. It's common to have an array of length 1 to 3, but it's very rare to have an array of length 100. I would like to have a random probability distribution function for this.
Suggested solution
In my case, I'm looking for a random exponential distribution.
The function would accept an argument like this:
It would generate a number using the named distribution. I would expect to call faker.random.exponentialDistribution({min: 0, max: 100, curveSettings: {...}}) and the generated number would be more likely to be close to 0 than close to 100. Out of 1000 generated values, we would see only a few close to 100.
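For illustration, one way such a function could work is inverse transform sampling of an exponential distribution truncated to the requested range. This is only a sketch under assumptions: the name `truncatedExponential`, the options shape, and the `rate` parameter are hypothetical (they stand in for the unspecified curve settings), and `Math.random` replaces faker's seeded generator.

```typescript
// Hypothetical sketch; the name, options shape and `rate` are assumptions, not faker API.
interface TruncatedExponentialOptions {
  min: number;
  max: number;
  /** Decay rate (lambda); larger values concentrate results near `min`. */
  rate: number;
}

function truncatedExponential(
  { min, max, rate }: TruncatedExponentialOptions,
  rng: () => number = Math.random // must return a float in [0, 1)
): number {
  if (rate <= 0) throw new RangeError('rate must be greater than 0');
  if (max < min) throw new RangeError('max must not be less than min');
  const width = max - min;
  // Probability mass of an untruncated exponential that falls inside the range.
  const scale = 1 - Math.exp(-rate * width);
  // Inverse CDF of the truncated distribution, shifted to start at `min`.
  const value = min - Math.log(1 - rng() * scale) / rate;
  // Guard against floating-point rounding pushing the value past `max`.
  return Math.min(max, value);
}

// Example: array lengths between 0 and 100 that are mostly small.
const lengths = Array.from({ length: 1000 }, () =>
  Math.floor(truncatedExponential({ min: 0, max: 100, rate: 0.5 }))
);
console.log(
  lengths.filter((n) => n <= 3).length,
  lengths.filter((n) => n >= 50).length
);
```

A single `rate` keeps the options small; a real API might prefer a mean or a bias value instead, which is one of the open design questions in this issue.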
I wouldn't limit the feature to the exponential distribution; I would also add the Gaussian distribution, Rayleigh distribution, gamma distribution, etc.
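As an example of what another distribution could look like, a Gaussian sample can be derived from two uniform samples with the Box-Muller transform. The helper name `gaussian` and its parameters are hypothetical, and `Math.random` again stands in for faker's generator.

```typescript
// Hypothetical helper, not part of the faker API.
// Box-Muller transform: turns two uniform samples into one normally distributed sample.
function gaussian(
  mean = 0,
  standardDeviation = 1,
  rng: () => number = Math.random // must return a float in [0, 1)
): number {
  // 1 - rng() lies in (0, 1], which keeps Math.log away from 0.
  const u1 = 1 - rng();
  const u2 = rng();
  const standardNormal =
    Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return mean + standardDeviation * standardNormal;
}

console.log(gaussian(50, 10));
```

Unlike the bounded examples in this thread, the result is not limited to a min/max range, so a bounded variant would need clamping or resampling.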
Alternative
No response
Additional context
I'm not sure if what I'm asking is out of scope for faker, but at the same time, faker generates data from random values. Why couldn't faker generate numbers based on the probability of each number being generated?
Btw, I'm no mathematician, so I might be incorrect in what I explain, but I still think faker could add some random probability distribution functions.