Objective
Develop code to evaluate the BERT-based topic model on the custom dataset. Iteratively improve the model so that its output aligns with our predefined topics, enhancing its accuracy and effectiveness.
Description
Evaluation Goals:
Assess the model's ability to correctly identify the predefined topics.
Measure performance in distinguishing overlapping topics.
Identify areas where the model deviates from expected results.
Optimization Goals:
Fine-tune model parameters and preprocessing steps.
Improve topic coherence and accuracy.
Reduce misclassifications and enhance model robustness.
Steps to Follow
Prepare the Evaluation Environment:
Ensure the custom dataset is properly formatted and accessible.
Set up the necessary libraries and dependencies for running the BERT model, including any extra dependencies required by parameters or components you introduce (a setup sketch follows).
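A minimal sketch of loading the dataset, assuming a CSV with "message" and "topic" columns; the file name and column names are illustrative and should be replaced with the real ones:

```python
# Sketch only: assumes the custom dataset is a CSV with "message" and
# "topic" columns; adjust the path and column names to the actual files.
import pandas as pd

df = pd.read_csv("custom_dataset.csv")   # hypothetical path
documents = df["message"].tolist()       # raw text messages
true_topics = df["topic"].tolist()       # predefined topic labels
```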
Initial Model Evaluation:
Run the BERT topic modeling model on the custom dataset.
Collect the model's output, including the topic assigned to each message (see the sketch below).
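If the BERTopic library is what backs the model (an assumption; the same idea applies to any BERT-based topic model), the initial run and output collection could look like this:

```python
from bertopic import BERTopic

# Fit the topic model on the raw messages. nr_topics="auto" lets the model
# pick the number of clusters; this can be tuned in the optimization step.
topic_model = BERTopic(nr_topics="auto", calculate_probabilities=True)
pred_topic_ids, probs = topic_model.fit_transform(documents)

# Inspect what the model found: topic sizes and top keywords per topic.
print(topic_model.get_topic_info())
for topic_id in set(pred_topic_ids):
    print(topic_id, topic_model.get_topic(topic_id)[:5])
```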
Calculate Simple Accuracy:
Compare Predictions:
For each message, note whether the predicted topic matches the actual topic; since the model returns numeric topic IDs rather than our topic names, first map each ID to the predefined topic it best represents (see the sketch after the example below).
Compute Accuracy: Accuracy = (Number of Correct Assignments / Total Number of Messages) × 100%
Example: If 13 out of 15 messages are correctly assigned:
Accuracy = (13 / 15) × 100% ≈ 86.7%
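A sketch of this calculation, assuming each discovered topic ID is mapped to the predefined label most common among its messages (the majority-vote mapping is an assumption; a hand-curated mapping works just as well):

```python
from collections import Counter, defaultdict

# Map each discovered topic ID to the predefined label that occurs most
# often among the messages assigned to it (majority vote).
label_counts = defaultdict(Counter)
for topic_id, true_label in zip(pred_topic_ids, true_topics):
    label_counts[topic_id][true_label] += 1
topic_to_label = {tid: counts.most_common(1)[0][0]
                  for tid, counts in label_counts.items()}

# Simple accuracy: correct assignments / total messages.
predicted_labels = [topic_to_label[tid] for tid in pred_topic_ids]
correct = sum(p == t for p, t in zip(predicted_labels, true_topics))
accuracy = correct / len(true_topics) * 100
print(f"Accuracy: {accuracy:.1f}%")   # e.g., 13/15 ≈ 86.7%
```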
Assess Topic Coherence Visually:
Review Grouped Messages:
For each topic, read through the assigned messages.
Determine if the messages share a common theme or concept.
Identify Issues:
Note any topics where messages seem out of place or unrelated; the sketch below prints a sample of messages per topic to support this review.
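A minimal way to produce that per-topic view, assuming the variables from the earlier sketches:

```python
from collections import defaultdict

# Print up to three sample messages for each discovered topic so a reviewer
# can judge whether they share a common theme.
by_topic = defaultdict(list)
for doc, topic_id in zip(documents, pred_topic_ids):
    by_topic[topic_id].append(doc)

for topic_id, docs in sorted(by_topic.items()):
    print(f"\nTopic {topic_id} ({len(docs)} messages):")
    for doc in docs[:3]:
        print("  -", doc[:120])
```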
Analyze Results:
Identify patterns in misclassified messages.
Evaluate how well the model handles overlapping topics.
Determine whether certain topics are consistently confused (a confusion-matrix sketch follows).
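To spot which predefined topics are mixed up, a confusion matrix over the mapped labels can help; this sketch uses scikit-learn and assumes the `true_topics` and `predicted_labels` variables defined above:

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

labels = sorted(set(true_topics))
cm = confusion_matrix(true_topics, predicted_labels, labels=labels)

# Rows are the actual topics, columns the predicted ones; off-diagonal
# cells show which topic pairs the model confuses (e.g., overlapping topics).
print(pd.DataFrame(cm, index=labels, columns=labels))
```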
Model Optimization:
Adjust Preprocessing:
Experiment with different text preprocessing techniques (e.g., stemming vs. lemmatization).
Update stop-word lists or add domain-specific stop words.
Tune Model Parameters:
Modify the number of topics.
Adjust hyperparameters such as the learning rate, batch size, or embedding dimensions (a tuning sketch follows this list).
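If BERTopic is used, the preprocessing and main model parameters can be adjusted through the vectorizer and constructor arguments; the specific values below are illustrative, not recommendations:

```python
from bertopic import BERTopic
from sklearn.feature_extraction import text
from sklearn.feature_extraction.text import CountVectorizer

# Extend the built-in English stop-word list with domain-specific terms
# (the extra words here are purely illustrative).
stop_words = list(text.ENGLISH_STOP_WORDS) + ["please", "thanks"]
vectorizer = CountVectorizer(stop_words=stop_words)

topic_model = BERTopic(
    vectorizer_model=vectorizer,  # controls keyword extraction / preprocessing
    nr_topics=15,                 # illustrative: match the predefined topic count
    min_topic_size=5,             # illustrative: allow smaller clusters
)
pred_topic_ids, probs = topic_model.fit_transform(documents)
```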
Iterative Testing:
Rerun the model after each adjustment.
Document changes and their impact on the evaluation metrics; a simple logging sketch follows this step.
Continue iterating until performance meets predefined thresholds.
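A lightweight way to keep that record, assuming the accuracy value computed above (the file name and helper are hypothetical):

```python
import json

# Illustrative experiment log: one entry per configuration tried.
results = []

def log_run(config: dict, accuracy: float, notes: str = "") -> None:
    """Record a configuration and its evaluation metrics for the report."""
    results.append({"config": config, "accuracy": accuracy, "notes": notes})
    with open("experiment_log.json", "w") as f:
        json.dump(results, f, indent=2)

log_run({"nr_topics": 15, "min_topic_size": 5}, accuracy, "custom stop words")
```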
Document Findings and Recommendations:
Summarize the optimization process and results in your PR.
Highlight key improvements and remaining challenges.
Provide recommendations for further enhancements (e.g., whether we need to improve the existing test dataset or create a new one).
Validation Criteria
Improved Accuracy: Achieve a target accuracy (e.g., above 85%) in topic classification.
High Topic Coherence: Coherence scores should indicate meaningful and distinct topics (a scoring sketch follows this list).
Effective Overlap Handling: The model correctly assigns messages with overlapping topics.
Performance Metrics Documented: All evaluation metrics are recorded and analyzed.
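Coherence can be quantified as well as judged visually; one common option (an assumption, not a project requirement) is the c_v coherence score from gensim computed over the model's top keywords:

```python
from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

# Very simple tokenization; substitute whatever preprocessing the pipeline
# already uses.
tokenized = [doc.lower().split() for doc in documents]
dictionary = Dictionary(tokenized)

# Top keywords per discovered topic (skip -1, BERTopic's outlier topic),
# keeping only words present in the dictionary to avoid lookup errors.
topics_words = [
    [w for w, _ in topic_model.get_topic(tid) if w in dictionary.token2id]
    for tid in set(pred_topic_ids) if tid != -1
]

coherence = CoherenceModel(
    topics=topics_words, texts=tokenized,
    dictionary=dictionary, coherence="c_v",
).get_coherence()
print(f"c_v coherence: {coherence:.3f}")
```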
Expected Deliverables
Codebase:
Scripts for running the model, evaluation, and optimization.
Well-documented code with comments explaining each step.
Evaluation Report:
Detailed analysis of initial and final model performance.
Tables and graphs illustrating improvements.
Explanation of the impact of each optimization step.