RFW0114: Recreate the benchmark dataset, ensuring a more balanced distribution across all departments
Summary
Recreate the benchmark dataset with a more even distribution within each department, specifically across gender, age, and education qualification.
Key Concepts
Benchmark dataset: The dataset used as the reference point for performance evaluation.
Context
The current benchmark dataset is unbalanced: some demographic groups within departments are underrepresented. This can lead to biased evaluations and inaccurate performance assessments.
Rebalancing the dataset across gender, age, and education qualification should make evaluations fairer and more reliable.
Outputs
A new benchmark dataset with a significantly more even distribution of data points across departments and, within each department, across gender, age, and education qualification.
The target is ~10k samples in total, split equally across the 5 departments (2k examples per department). Within each department, the examples should also be evenly distributed among the demographic categories.
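One possible way to reach these targets is stratified sampling: split each department's 2k quota evenly over its demographic cells (gender x age x education) and draw that many records from each cell. The sketch below is only an illustration under assumptions that are not part of this brief: the field names (`department`, `gender`, `age_band`, `education`), the category values, and the helper `balanced_sample` are all hypothetical, and the demo records are synthetic.

```python
import random
from collections import Counter, defaultdict
from itertools import product


def balanced_sample(records, departments, strata_keys, per_department, seed=0):
    """Draw up to `per_department` records per department, spread evenly over demographic cells."""
    rng = random.Random(seed)

    # Group records by (department, cell); a cell is a tuple of demographic values.
    cells = defaultdict(list)
    for rec in records:
        cell = tuple(rec[k] for k in strata_keys)
        cells[(rec["department"], cell)].append(rec)

    sample = []
    for dept in departments:
        dept_cells = sorted(c for (d, c) in cells if d == dept)
        if not dept_cells:
            continue
        # Split the department quota evenly; the first `remainder` cells get one extra.
        quota, remainder = divmod(per_department, len(dept_cells))
        for i, cell in enumerate(dept_cells):
            want = quota + (1 if i < remainder else 0)
            pool = cells[(dept, cell)]
            # A cell with fewer records than its quota contributes everything it has.
            sample.extend(rng.sample(pool, min(want, len(pool))))
    return sample


# Synthetic demo data (illustrative only): 5 departments x 12 cells x 200 records each.
departments = [f"dept_{i}" for i in range(1, 6)]   # placeholder department names
genders = ["female", "male"]                        # placeholder categories
age_bands = ["18-34", "35-49", "50+"]               # placeholder categories
educations = ["bachelor", "postgraduate"]           # placeholder categories
records = [
    {"department": d, "gender": g, "age_band": a, "education": e}
    for d, g, a, e in product(departments, genders, age_bands, educations)
    for _ in range(200)
]

sample = balanced_sample(
    records, departments, ["gender", "age_band", "education"], per_department=2000
)
per_dept = Counter(rec["department"] for rec in sample)
```

In this sketch a sparsely populated cell leaves its department short of 2k rather than being oversampled. Whether to top up from other cells, oversample, or collect new data for such cells is a decision the brief leaves open.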
Inputs
Existing benchmark dataset.
Information on the desired distribution percentages and the demographic breakdown within each department.
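The desired percentages can also be used to audit a dataset, either the existing benchmark or the rebuilt one. The following is a minimal sketch, assuming records are dictionaries and the targets are given as fractions per category. The function name `share_gaps`, the `gender` field, and the example values are hypothetical and not taken from the actual data.

```python
from collections import Counter


def share_gaps(records, key, target_shares):
    """Return observed share minus target share for each category of `key`."""
    counts = Counter(rec[key] for rec in records)
    total = sum(counts.values())
    gaps = {}
    for category, target in target_shares.items():
        observed = counts.get(category, 0) / total if total else 0.0
        gaps[category] = observed - target
    return gaps


# Illustrative records for one department: 3 of 4 share the same (placeholder) gender value.
example_records = [{"gender": "F"}, {"gender": "F"}, {"gender": "F"}, {"gender": "M"}]
gaps = share_gaps(example_records, "gender", {"F": 0.5, "M": 0.5})
# gaps == {"F": 0.25, "M": -0.25}: "F" is over target by 25 points, "M" under by 25.
```

Running the same check per department and per attribute (gender, age, education qualification) gives a quick view of where the largest imbalances sit.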
Timeline
Specify the expected delivery date for the project.
References
Include any relevant links or resources for additional context or information.