Build a WORKSFORME model (#3760)

Implementation ideas

To implement the WORKSFORME model, follow these steps:

Define the features to be used by the model based on the ticket description:
- Time since the last comment
- Open needinfo on the reporter
- Severity
- Status flags
Extract the relevant features from the bug data:
- Use bugbug/bug_features.py to define new features if needed.
- Modify existing features or create new classes inheriting from SingleBugFeature to extract the required information.
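
As an illustration of the feature-class step above, here is a minimal sketch of two of the proposed features. The class names `TimeSinceLastComment` and `OpenNeedinfo`, the `now` parameter, and the stand-in `SingleBugFeature` base are assumptions made for this example. The field names (`creation_time` on comments; `name`, `status` and `requestee` on flags; `creator` on the bug) follow the Bugzilla REST API, but should be checked against the bug data bugbug actually stores.

```python
from datetime import datetime, timezone


class SingleBugFeature:
    """Stand-in for bugbug.bug_features.SingleBugFeature so the sketch runs alone."""


def parse_time(value):
    # Assumes Bugzilla-style UTC timestamps such as "2023-12-01T10:00:00Z".
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


class TimeSinceLastComment(SingleBugFeature):
    """Days between the latest comment and a reference time (hypothetical feature)."""

    def __init__(self, now=None):
        # Hypothetical parameter: a fixed reference time; defaults to the current time.
        self.now = now

    def __call__(self, bug, **kwargs):
        now = self.now or datetime.now(timezone.utc)
        times = [parse_time(c["creation_time"]) for c in bug.get("comments", [])]
        if not times:
            return None  # No comments: leave the value missing.
        return (now - max(times)).total_seconds() / 86400


class OpenNeedinfo(SingleBugFeature):
    """True if a needinfo request on the reporter is still pending (hypothetical feature)."""

    def __call__(self, bug, **kwargs):
        return any(
            flag.get("name") == "needinfo"
            and flag.get("status") == "?"
            and flag.get("requestee") == bug.get("creator")
            for flag in bug.get("flags", [])
        )
```

When training on bugs that are already closed, `now` would presumably need to be set to a time just before the bug was resolved; otherwise the feature would encode information that is not available for open bugs at prediction time.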
Prepare the dataset:
- Use bugbug/bugzilla.py to fetch resolved bugs: those closed as WORKSFORME serve as positive examples, and those closed with any other resolution serve as negative examples.
- Extract the features for each bug using the defined feature classes.
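
A classifier needs both positive and negative examples, so the dataset step can be sketched as a labelling pass over resolved bugs. The function name `label_bugs` is hypothetical, and the input is assumed to be an iterable of Bugzilla-style bug dicts with `id` and `resolution` keys (in bugbug it would presumably come from the helpers in bugbug/bugzilla.py).

```python
def label_bugs(bugs):
    """Map bug id -> 1 (closed as WORKSFORME) or 0 (closed with another resolution)."""
    labels = {}
    for bug in bugs:
        resolution = bug.get("resolution")
        if not resolution:
            # Open bugs have an empty resolution, so they cannot be labelled yet.
            continue
        labels[bug["id"]] = 1 if resolution == "WORKSFORME" else 0
    return labels
```

Open bugs are skipped rather than counted as negatives, since some of them may later be closed as WORKSFORME.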
Train the model:
- Use bugbug/models/fixtime.py as a reference to create a new model class, e.g., WORKSFORMEModel, inheriting from BugModel.
- Define a pipeline using sklearn.pipeline.Pipeline with a ColumnTransformer and DictVectorizer to handle categorical features.
- Choose a suitable machine learning algorithm (e.g., xgboost.XGBClassifier) and add it to the pipeline.
- Split the dataset into training and testing sets.
- Train the model on the training set and evaluate its performance on the testing set.
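
To make the "handle categorical features" part of this step concrete, the sketch below re-implements in plain Python the basic idea behind DictVectorizer: string values become one indicator column per (feature, value) pair, while numbers and booleans are kept as numeric columns. The function `vectorize` and the feature names in the usage example are illustrative only; in the actual pipeline sklearn's DictVectorizer would do this work.

```python
from numbers import Number


def vectorize(rows):
    """Turn a list of feature dicts into (sorted column names, dense numeric rows)."""

    def encode(row):
        encoded = {}
        for name, value in row.items():
            if isinstance(value, str):
                # Categorical value: one indicator column per (feature, value) pair.
                encoded[f"{name}={value}"] = 1.0
            elif isinstance(value, Number):
                # Numbers (and booleans) are used as numeric values directly.
                encoded[name] = float(value)
        return encoded

    encoded_rows = [encode(row) for row in rows]
    columns = sorted({name for row in encoded_rows for name in row})
    matrix = [[row.get(col, 0.0) for col in columns] for row in encoded_rows]
    return columns, matrix
```

For example, two bugs with severities `S2` and `S4` end up with separate `Severity=S2` and `Severity=S4` columns, so the classifier never treats the severity label as an ordered number by accident.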
Implement the prediction functionality:
- Add a method to the WORKSFORMEModel class to predict the probability of a bug being WORKSFORME.
- Use the trained model to predict on new or existing bugs.
Create a page or sheet for each team:
- Use ui/changes/src/common.js to create a new UI component or page that lists bugs with their probability of being WORKSFORME.
- Sort the bugs by their probability score.
- Ensure that the page can filter bugs by team.
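
Wherever the listing ends up living (the UI code or a script that produces a sheet), the grouping and sorting logic is roughly the following. This is a sketch only: `rank_by_team` and the `team_of` callable are hypothetical names, and how a bug maps to a team (for example via its component) is an assumption left to the caller.

```python
from collections import defaultdict


def rank_by_team(bugs, probabilities, team_of):
    """Group bugs by team and sort each group by descending WORKSFORME probability."""
    pages = defaultdict(list)
    for bug, probability in zip(bugs, probabilities):
        # team_of is a caller-supplied function mapping a bug dict to a team name.
        pages[team_of(bug)].append((bug["id"], probability))
    for rows in pages.values():
        rows.sort(key=lambda row: row[1], reverse=True)
    return dict(pages)
```

Each value in the returned dict is the content of one team's page: a list of `(bug id, probability)` pairs with the most likely WORKSFORME candidates first.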
Integrate the model into the workflow:
- Automate the process of running the model periodically or upon certain triggers (e.g., when a bug is updated).
- Provide teams with access to the generated page or sheet.

Here is a high-level pseudo-code outline for the WORKSFORMEModel:

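import pandas as pd
import xgboost
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

from bugbug import bug_features
from bugbug.model import BugModel


# Simplified outline: bugbug's existing models (see fixtime.py) plug into the
# training loop of the base class instead of overriding train() and predict().
class WORKSFORMEModel(BugModel):
    def __init__(self):
        super().__init__()

        # Define the features to be used by the model.
        # TimeSinceLastComment, OpenNeedinfo and StatusFlags are placeholders for
        # classes that still have to be written in bugbug/bug_features.py; check
        # the exact name of the existing severity feature there as well.
        self.features = [
            bug_features.TimeSinceLastComment(),
            bug_features.OpenNeedinfo(),
            bug_features.Severity(),
            bug_features.StatusFlags(),
            # Add other relevant features
        ]

        # Define the pipeline: DictVectorizer encodes the per-bug feature dicts
        # stored in the 'data' column, then the classifier is fit on the result.
        self.pipeline = Pipeline([
            ('transformer', ColumnTransformer([
                ('vectorizer', DictVectorizer(), 'data'),
            ])),
            ('classifier', xgboost.XGBClassifier()),
        ])

    def extract_features(self, bugs):
        # Build one dict per bug (feature name -> value), skipping missing values
        rows = []
        for bug in bugs:
            values = {type(f).__name__: f(bug) for f in self.features}
            rows.append({k: v for k, v in values.items() if v is not None})
        return pd.DataFrame({'data': rows})

    def train(self, bugs):
        # Prepare the dataset; open bugs have no resolution yet, so skip them
        bugs = [bug for bug in bugs if bug['resolution']]
        X = self.extract_features(bugs)
        y = [1 if bug['resolution'] == 'WORKSFORME' else 0 for bug in bugs]

        # Split the dataset, keeping the class balance in both parts
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, stratify=y, random_state=0
        )

        # Train the model
        self.pipeline.fit(X_train, y_train)

        # Evaluate the model
        score = self.pipeline.score(X_test, y_test)
        print(f'Model accuracy: {score}')

    def predict(self, bugs):
        # Predict the probability of being WORKSFORME (column 1 is label 1)
        X = self.extract_features(bugs)
        probabilities = self.pipeline.predict_proba(X)[:, 1]
        return probabilities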

Remember to handle data preprocessing, feature selection, model evaluation, and hyperparameter tuning to improve the model's accuracy.
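
On the evaluation point: WORKSFORME bugs are likely to be a minority of resolved bugs, in which case accuracy alone can look good even for a model that never predicts WORKSFORME. A minimal sketch of computing precision and recall from predicted probabilities follows; the function name and the default threshold of 0.5 are assumptions for illustration.

```python
def precision_recall(y_true, probabilities, threshold=0.5):
    """Precision and recall for the positive (WORKSFORME) class at a given threshold."""
    predicted = [p >= threshold for p in probabilities]
    tp = sum(1 for t, p in zip(y_true, predicted) if t == 1 and p)
    fp = sum(1 for t, p in zip(y_true, predicted) if t == 0 and p)
    fn = sum(1 for t, p in zip(y_true, predicted) if t == 1 and not p)
    # Define both metrics as 0.0 when their denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Raising the threshold generally trades recall for precision, which may suit a triage page where each false positive costs a reviewer's time.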

Code snippets to check

bugbug → models → fixtime.py

1. `Lines 1 - 21` This snippet is from a model file that could be adapted to predict the WORKSFORME status. It shows the structure of a model which could be a starting point for the new model. https://github.com/Mayil-AI/bugbug-21dec23/blob/0acd00da46afbb37bd047c0bce06ce7cfad21568/bugbug/models/fixtime.py#L1-L21

bugbug → bug_features.py

1. `Lines 299 - 684` This snippet contains features that are used in models to predict bug statuses. These features could be relevant for the new WORKSFORME model. https://github.com/Mayil-AI/bugbug-21dec23/blob/0acd00da46afbb37bd047c0bce06ce7cfad21568/bugbug/bug_features.py#L299-L684

bugbug → bugzilla.py

1. `Lines 439 - 531` This snippet shows how data is fetched and processed from Bugzilla, which could be useful for gathering the data needed to train the WORKSFORME model. https://github.com/Mayil-AI/bugbug-21dec23/blob/0acd00da46afbb37bd047c0bce06ce7cfad21568/bugbug/bugzilla.py#L439-L531

