Open Linchenpal opened 4 weeks ago
Imbalance Analysis: The imbalance analysis for all datasets is complete and available in my branch feature/lina_texts_playground. It includes class distribution summaries and visualizations for each dataset, showing how imbalanced the classes are.
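For anyone who wants to reproduce a class distribution summary without checking out the branch, here is a minimal sketch. It is not the code in the branch: the function name `class_distribution` and the example labels are hypothetical, and it assumes the labels are available as a plain Python list.

```python
from collections import Counter


def class_distribution(labels):
    """Return ({label: (count, fraction)}, majority/minority count ratio)."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    summary = {label: (n, n / total) for label, n in counts.items()}
    # Ratio of the largest class to the smallest; 1.0 means perfectly balanced.
    imbalance_ratio = max(counts.values()) / min(counts.values())
    return summary, imbalance_ratio


# Hypothetical labels, for illustration only (not taken from any dataset here).
labels = ["neg"] * 90 + ["pos"] * 10
summary, ratio = class_distribution(labels)
for label, (n, frac) in sorted(summary.items()):
    print(f"{label}: {n} ({frac:.1%})")
print(f"imbalance ratio: {ratio:.1f}")
```

The ratio is only one possible way to summarize imbalance; per-class counts are usually the more informative part for multi-class datasets.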
[x] Perform Imbalance Analysis: Analyze and report class distributions for each dataset.
[ ] Handle Missing Data: Identify and handle any missing values using appropriate imputation methods.
[ ] Normalize/Standardize Features: Apply normalization or standardization to ensure consistent feature scaling.
[ ] Encode Categorical Variables: Convert categorical variables into numerical values using encoding methods like one-hot or label encoding.
[ ] Feature Selection/Extraction: Select or extract relevant features that contribute to the classification task, reducing dimensionality if needed.
[ ] Data Splitting: Split the dataset into training, validation, and test sets to prepare for modeling.
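For the Data Splitting item, one option worth considering given the imbalance is a stratified split, which keeps each class's share roughly the same in every subset. The sketch below is an assumption about how this could be done, not a decided approach: the function name `stratified_split`, the 70/15/15 fractions, and the example labels are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists, splitting each class separately."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical labels, for illustration only.
labels = ["neg"] * 80 + ["pos"] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Very small classes can end up with no samples in the validation or test set because of rounding, so the per-class counts from the imbalance analysis are worth checking before fixing the fractions.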