gabrielpreda / Support-Tickets-Classification

This case study shows how to build a text-analysis and classification model and deploy it as a web service on Azure to automatically classify support tickets. This project is a proof of concept built by Microsoft's Commercial Software Engineering team in collaboration with Endava (http://endava.com/en).
MIT License

Table of contents

  1. Project description
  2. Results and learnings
    2.1. Main challenge and initial assumptions
    2.2. Dataset
    2.3. Training and evaluation results
    2.4. Model deployment and usage
  3. Run the example
    3.1. Prerequisites
    3.2. Train and evaluate the model
    3.3. Deploy web service
  4. Code highlights


1. Project description

[back to the top]

This case study shows how to build a text-analysis and classification model and deploy it as a web service on Azure to automatically classify support tickets.
This project is a proof of concept built by Microsoft's Commercial Software Engineering team in collaboration with Endava.
Our combined team tried three different approaches to tackle this challenge, using:

In this repository we will focus only on AML Workbench and Python scripts used to solve this challenge.

What will you find inside:

The team:


2. Results and learnings

[back to the top]

Disclaimer: This POC and all the learnings you can find below are the outcome of close cooperation between Microsoft and Endava. Our combined team spent a total of three days solving the challenge of automatic support-ticket classification.

2.1. Main challenge and initial assumptions

[back to the top]


2.2. Dataset

[back to the top]


2.3. Training and evaluation results

[back to the top]

To train our models, we used AML Workbench and Azure Machine Learning Services to run training jobs with different parameters, then compared the results and picked the model with the best values:
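The AML Workbench runs themselves are not included in this repository, but the run-with-different-parameters-and-pick-the-best loop described above can be sketched locally with scikit-learn's GridSearchCV. The ticket texts, labels, and parameter grid below are made-up placeholders, not the project's real data or settings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Tiny invented corpus standing in for the real support tickets.
texts = [
    "cannot log in to my account", "password reset email never arrived",
    "invoice total is wrong", "charged twice on my credit card",
    "app crashes on startup", "error 500 when opening the dashboard",
] * 5
labels = ["account", "account", "billing", "billing", "technical", "technical"] * 5

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),
])

# Try each parameter combination with cross-validation and keep the best one.
grid = GridSearchCV(
    pipeline,
    param_grid={
        "tfidf__ngram_range": [(1, 1), (1, 2)],
        "clf__alpha": [0.1, 1.0],
    },
    cv=3,
)
grid.fit(texts, labels)
print(grid.best_params_, round(grid.best_score_, 3))
```

In the actual project this comparison ran as separate AML jobs rather than a single in-process grid search, but the selection criterion — best cross-validated score across parameter combinations — is the same idea.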

To train the models we tested two different algorithms: SVM and Naive Bayes. In both cases the results were quite similar, but for some of the models Naive Bayes performed much better (especially after hyperparameter tuning), so at some point we decided to work with Naive Bayes only.
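The SVM-versus-Naive-Bayes comparison can be reproduced in miniature with scikit-learn. This is a sketch on invented placeholder tickets, not the project's dataset, and it uses LinearSVC as the SVM variant (an assumption; the repo may use a different kernel or implementation):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented placeholder data for three ticket categories.
texts = [
    "cannot log in to my account", "password reset link is broken",
    "invoice shows the wrong amount", "I was charged twice this month",
    "application crashes on startup", "getting error 500 on the dashboard",
] * 5
labels = ["account", "account", "billing", "billing", "technical", "technical"] * 5

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit both algorithms on the same TF-IDF features and compare held-out accuracy.
scores = {}
for name, clf in [("SVM", LinearSVC()), ("NaiveBayes", MultinomialNB())]:
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

print(scores)
```

On real, noisier ticket data the gap between the two classifiers is what drove the team's choice; on a toy corpus like this both typically score about the same.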

Below you can find some of the results of models we trained to predict different properties:


2.4. Model deployment and usage

[back to the top]

The final model is meant to run as a web service on Azure, so we prepared a sample RESTful web service written in Python using the Flask module. The web service loads our trained model and exposes an API that accepts an email body (text) and returns the predicted properties.
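The deployed service's code lives in the webservice folder; as a rough sketch of the described API, a minimal Flask app that accepts an email body and returns predicted properties might look like the following. The endpoint path, JSON field names, and the stubbed-out predictor are all assumptions, not the repo's actual interface:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_properties(text: str) -> dict:
    # Stub: the real service would load the serialized vectorizer and
    # classifier from disk and run them on the incoming text.
    return {"ticket_type": "issue", "category": "software"}

@app.route("/api/classify", methods=["POST"])
def classify():
    payload = request.get_json(force=True)
    body = payload.get("body", "")
    if not body:
        return jsonify({"error": "missing 'body' field"}), 400
    return jsonify(predict_properties(body))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A client (or Postman) would then POST JSON such as `{"body": "my app crashes on startup"}` to `/api/classify` and receive the predicted properties as a JSON object.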

You can find a running web service hosted on Azure Web Apps here: https://endavaclassifiertest1.azurewebsites.net/.
The project our service is based on, with code and all the deployment scripts, can be found here: karolzak/CNTK-Python-Web-Service-on-Azure.

Sample request and response in Postman: Demo


3. Run the example

3.1. Prerequisites

[back to the top]

3.2. Train and evaluate the model

[back to the top]

To train the model, run the 2_train_and_eval_model.py script. There are some parameters you can experiment with - check out the code highlights section for more info.

3.3. Deploy web service

[back to the top]

Inside the webservice folder you can find scripts to set up a Python-based RESTful web service (built with the Flask module).

That folder also contains the download_models.py script, which downloads pre-trained models for the web service app to use.
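The contents of download_models.py are not shown in this README; a hedged sketch of what such a helper might do is below. The file names and URLs are invented placeholders - the real script ships its own download locations:

```python
import urllib.request
from pathlib import Path

# Hypothetical model artifacts; the real download_models.py defines its own.
MODEL_URLS = {
    "ticket_type.pkl": "https://example.com/models/ticket_type.pkl",
    "category.pkl": "https://example.com/models/category.pkl",
}

def download_models(url_map: dict, target_dir: str = "models") -> list:
    """Fetch each model file into target_dir, skipping files already present."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for filename, url in url_map.items():
        dest = target / filename
        if not dest.exists():
            urllib.request.urlretrieve(url, dest)
        saved.append(dest)
    return saved
```

Skipping files that already exist keeps repeated deployments cheap, since trained model files tend to be large and rarely change.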

To deploy it to an environment like Azure App Service, you can check this GitHub repo for some inspiration.


4. Code highlights

[back to the top]