Colin-Codes / IntentClassifier-ML-Project

Python, Keras, SciKit-Learn, Matplotlib: Machine learning research project on classifying the intent behind tech support emails in order to enable automatic follow-up.

Classification and labelling #23

Closed Colin-Codes closed 5 years ago

Colin-Codes commented 5 years ago

Having anonymised the data, convert to CSV for labelling
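A minimal sketch of the CSV step, assuming the anonymised emails are available as dictionaries. The field names (`id`, `subject`, `body`) and the sample rows are hypothetical, not taken from the project's data; the only idea shown is adding an empty `label` column to fill in by hand.

```python
import csv
import io

# Hypothetical anonymised emails; field names and contents are assumptions.
emails = [
    {"id": 1, "subject": "Delivery date", "body": "When will order [ORDER_ID] arrive?"},
    {"id": 2, "subject": "Logo", "body": "Please add our logo to OSS."},
]


def write_labelling_csv(rows, handle):
    """Write one row per email, with an empty 'label' column for manual labelling."""
    writer = csv.DictWriter(handle, fieldnames=["id", "subject", "body", "label"])
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "label": ""})


# An in-memory buffer keeps the sketch self-contained; to write a real file,
# use open("emails_to_label.csv", "w", newline="") instead.
buffer = io.StringIO()
write_labelling_csv(emails, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```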

Colin-Codes commented 5 years ago

Availability: delivery date query
Delivery: amend the delivery date
Pricing: price check / breakdown
Weight: frame / glass weight queries
Gables: enable the gables tab; if no customer ID, remind that gables needs enabling
Template: OSS template changes
Authorisation: wants more privileges in EasiAdmin
Error: reporting an error in any system
Documents: can't download documents in OSS
EqualGlass: equal glass selected not resulting in equal glass sizes
Account: setting up or modifying accounts (maybe don't respond if there is more than one email or customer ID). Check for logo, ordering and accounts. Check for delete / remove. Sometimes it is a request for access.
Project: issues accessing projects on OSS
Feedback: suggestions for future improvement of OSS
Colour: new colours to be added on OSS
Callback: BDM wants a call for support; HAL to ignore
Report: request to generate a report
Reminder: request for updates
Status: requests to amend or investigate order status
Logo: request to add a logo to OSS
Leaver: a user has left the company
Access: EA password reset
Information: requests for clarification
Forward: FYI requiring no further action

Added two more labels: Admin for small data changes, and Action for issue-worthy concerns.

Note that Action, Feedback, Error and Information are all quite similar really.

Colin-Codes commented 5 years ago

Review individual cases of Error and Miscellaneous for sub-divisions and split them out if necessary.

Check numbers of smaller groups, merge if necessary.
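A rough sketch of that check, assuming the labels are held in a plain Python list. The sample labels, the `min_count` threshold and the helper name `merge_small_classes` are illustrative assumptions; "Miscellaneous" is used as the catch-all because it is already one of the classes mentioned above.

```python
from collections import Counter


def merge_small_classes(labels, min_count, fallback="Miscellaneous"):
    """Count each label and fold any label with fewer than min_count examples into fallback."""
    counts = Counter(labels)
    merged = [label if counts[label] >= min_count else fallback for label in labels]
    return merged, counts


# Hypothetical label column, for illustration only.
labels = ["Error"] * 5 + ["Pricing"] * 4 + ["Logo"] + ["Leaver"] * 2

merged, counts = merge_small_classes(labels, min_count=3)

# Inspect the class sizes before and after merging.
print(counts.most_common())
print(Counter(merged).most_common())
```

The threshold is a judgment call: too low and the tiny classes stay unlearnable, too high and distinct intents disappear into the catch-all.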

Colin-Codes commented 5 years ago

Complex classifications can prove difficult. In practice, however, these would be caught by inviting a response if any required actions were still outstanding.

Is there an argument for removing poor examples, particularly ones that are highly atypical? But how do you quantify this? Maybe it matters less with larger amounts of data.

Consider two new classes: request for improvement, and limitations.

Colin-Codes commented 5 years ago

Could have used K-means clustering to group requests...
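For reference, a minimal sketch of what that might have looked like with scikit-learn: TF-IDF features fed into `KMeans`. The example emails and the choice of three clusters are assumptions made up for illustration, and this was not actually run as part of the project.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical support emails, for illustration only.
emails = [
    "When will my order be delivered?",
    "Can you confirm the delivery date for my order?",
    "Please reset my password",
    "I forgot my password and need a reset",
    "Please add our company logo",
    "Can you upload our new logo?",
]

# Turn each email into a sparse TF-IDF vector, ignoring common English stop words.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(emails)

# n_clusters is a guess; in practice it would need tuning against the real data.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Print emails grouped by cluster so the groups can be reviewed by eye.
for cluster_id, text in sorted(zip(cluster_ids, emails)):
    print(cluster_id, text)
```

The cluster numbers carry no meaning by themselves; each group would still need reading and naming by hand, so this would only have been a starting point for the label list.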

Colin-Codes commented 4 years ago

Must remove small classifications to improve cross validation results
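A sketch of the filtering step, assuming texts and labels are held in parallel lists. The threshold is tied to the number of folds because stratified k-fold splitting cannot place a class in every fold when it has fewer examples than folds (scikit-learn's `StratifiedKFold` warns in that situation). The helper name `drop_small_classes`, the sample data and `n_splits = 5` are all illustrative assumptions.

```python
from collections import Counter


def drop_small_classes(texts, labels, min_count):
    """Keep only the examples whose class has at least min_count members."""
    counts = Counter(labels)
    kept = [(text, label) for text, label in zip(texts, labels) if counts[label] >= min_count]
    kept_texts = [text for text, _ in kept]
    kept_labels = [label for _, label in kept]
    return kept_texts, kept_labels


n_splits = 5  # assumed number of cross-validation folds

# Hypothetical data, for illustration only.
texts = [f"email {i}" for i in range(13)]
labels = ["Error"] * 6 + ["Pricing"] * 5 + ["Leaver"] * 2

kept_texts, kept_labels = drop_small_classes(texts, labels, min_count=n_splits)
print(Counter(kept_labels))
```

Dropping rather than merging means the model never sees those emails at all, so any scores reported afterwards only describe the classes that were kept.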