mozilla / bugbug

Platform for Machine Learning projects on Software Engineering
Mozilla Public License 2.0

Evaluate generating multiple component models #224

Open marco-c opened 5 years ago

marco-c commented 5 years ago

Train three models: one to choose the product, one to choose the "conflated component", and one to choose the component.

This might help increase accuracy (e.g. it's easier to tell that a bug belongs to Core::DOM than to Core::DOM: Quota Manager).
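As a rough illustration of the three targets, here is a minimal sketch that derives a product, a conflated component, and a full component from one `Product::Component` label. The helper name `split_levels` is hypothetical, and the rule "the group is everything before the first colon in the component name" is an assumption based on the Core::DOM example above; bugbug's actual conflation rules may differ.

```python
def split_levels(full_name):
    """Split 'Product::Component' into (product, conflated component, component).

    Assumption: the conflated component is the component name up to its
    first ':' (so 'DOM: Quota Manager' is grouped under 'DOM').
    """
    product, _, component = full_name.partition("::")
    group = component.split(":", 1)[0].strip()
    return product, f"{product}::{group}", full_name


# Each bug label yields one training target per model.
for label in ["Core::DOM: Quota Manager", "Core::DOM", "Firefox::Address Bar"]:
    print(split_levels(label))
```

A component without a sub-component (e.g. Core::DOM) maps to itself at levels 2 and 3, so the third model only has real work to do inside groups that contain several components.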

yixinsun commented 5 years ago

I am thinking of something like "hierarchical classification". I did a bit of research and found two ideas:

1) Stacking: as I understand it, this stacks classifiers at different levels of the hierarchy (paper).
2) Hierarchical neural network: the model is learned from the maximum posterior (paper).

Are you thinking about something closer to the stacking model?

marco-c commented 5 years ago

Yes.

We have:

Level 1: Product (e.g. Core, Firefox, Toolkit, etc.)

Level 2: Component Group (e.g. Core::DOM, Core::JavaScript, etc.)

Level 3: Component
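One way to read the three levels is as a top-down cascade: a root model picks the product, then a model specific to that product picks the component group, and so on. The sketch below is only an illustration under assumptions, not bugbug's implementation: the class name `TopDownClassifier`, the TF-IDF plus logistic regression pipeline, and the toy bug summaries and labels are all made up for the example.

```python
from collections import defaultdict

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


class TopDownClassifier:
    """Hypothetical sketch: one model at the root, plus one per parent label."""

    def fit(self, texts, paths):
        # paths are (product, component group, component) tuples.
        by_parent = defaultdict(lambda: ([], []))
        for text, path in zip(texts, paths):
            for depth, label in enumerate(path):
                X, y = by_parent[path[:depth]]
                X.append(text)
                y.append(label)

        self.models_ = {}
        for parent, (X, y) in by_parent.items():
            if len(set(y)) == 1:
                # Only one possible child: store the label, no model needed.
                self.models_[parent] = y[0]
            else:
                pipeline = make_pipeline(
                    TfidfVectorizer(), LogisticRegression(max_iter=1000)
                )
                self.models_[parent] = pipeline.fit(X, y)
        return self

    def predict_one(self, text):
        # Walk down the hierarchy, asking only the model for the current node.
        path = ()
        while path in self.models_:
            model = self.models_[path]
            label = model if isinstance(model, str) else model.predict([text])[0]
            path += (str(label),)
        return path


# Toy data, for illustration only.
texts = [
    "crash when dispatching DOM events",
    "quota manager reports storage limit exceeded",
    "JavaScript JIT miscompiles a loop",
    "bookmarks toolbar icon missing",
    "address bar autocomplete is slow",
]
paths = [
    ("Core", "Core::DOM", "Core::DOM: Events"),
    ("Core", "Core::DOM", "Core::DOM: Quota Manager"),
    ("Core", "Core::JavaScript", "Core::JavaScript: JIT"),
    ("Firefox", "Firefox::Bookmarks", "Firefox::Bookmarks"),
    ("Firefox", "Firefox::Address Bar", "Firefox::Address Bar"),
]

clf = TopDownClassifier().fit(texts, paths)
prediction = clf.predict_one("storage quota exceeded in quota manager")
print(prediction)
```

Because each model is trained only on the children of its parent, the predicted path is always consistent with the hierarchy. The trade-off is that a mistake at level 1 cannot be recovered at the lower levels.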

marco-c commented 5 years ago

We could also try scikit-learn's multioutput classification (https://scikit-learn.org/stable/modules/multiclass.html#multioutput-classification) or classifier chains (https://scikit-learn.org/stable/modules/multiclass.html#classifier-chain).
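For comparison, a minimal sketch of the multioutput option with `MultiOutputClassifier`, which fits one independent classifier per target column. The toy summaries, the labels, and the choice of TF-IDF with logistic regression are assumptions made for the example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import make_pipeline

# Toy data, for illustration only.
summaries = [
    "crash when dispatching DOM events",
    "quota manager reports storage limit exceeded",
    "JavaScript JIT miscompiles a loop",
    "bookmarks toolbar icon missing",
    "address bar autocomplete is slow",
]
# One column per level: product, then component.
targets = np.array([
    ["Core", "DOM: Events"],
    ["Core", "DOM: Quota Manager"],
    ["Core", "JavaScript: JIT"],
    ["Firefox", "Bookmarks"],
    ["Firefox", "Address Bar"],
])

multi_model = make_pipeline(
    TfidfVectorizer(),
    MultiOutputClassifier(LogisticRegression(max_iter=1000)),
)
multi_model.fit(summaries, targets)

predicted = multi_model.predict(["quota manager storage error"])
print(predicted)  # one row per bug, one column per level
```

Since the columns are predicted independently, nothing stops this setup from returning a product and a component that do not belong together. A classifier chain addresses that by feeding earlier predictions to later models; scikit-learn documents `ClassifierChain` for multilabel targets, so whether it fits multiclass levels like these without extra encoding would need checking.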

suhaibmujahid commented 4 months ago

See https://github.com/mozilla/bugbug/issues/4172

This approach appeared to work better when we implemented it for Fenix.