Currently, the classification of non-agri queries is too lenient: many non-agri queries are classified as agri, and the bot starts responding to them. We need to train our BERT classifier on a large number of non-agri queries so that it identifies non-agri queries more reliably.
Micro tasks to be done
[ ] Create a synthetic dataset of 700-800 non-agri queries (50-100 of which contain agricultural terms)
[ ] Evaluate & benchmark the current classifier (without training)
[ ] Train the classifier on the dataset
[ ] Re-test the same queries and measure the accuracy (post training)
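The evaluate and re-test steps above could share one small benchmarking helper, so the numbers before and after training are directly comparable. The following is a minimal sketch, not the project's actual code: the `benchmark` function, the `"agri"` / `"non-agri"` label strings, the `non_agri_leak_rate` metric name, and the `keyword_stub` classifier (a stand-in for the BERT model) are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple


def benchmark(classify: Callable[[str], str],
              labelled: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Score a classifier on (query, expected_label) pairs."""
    total = correct = non_agri_total = non_agri_as_agri = 0
    for query, expected in labelled:
        predicted = classify(query)
        total += 1
        correct += predicted == expected
        if expected == "non-agri":
            non_agri_total += 1
            # The failure described in the objective: non-agri accepted as agri.
            non_agri_as_agri += predicted == "agri"
    return {
        "accuracy": correct / total if total else 0.0,
        "non_agri_leak_rate": (non_agri_as_agri / non_agri_total
                               if non_agri_total else 0.0),
    }


def keyword_stub(query: str) -> str:
    """Hypothetical stand-in for the BERT classifier (assumption)."""
    agri_terms = ("crop", "soil", "harvest", "fertilizer")
    return "agri" if any(t in query.lower() for t in agri_terms) else "non-agri"


# Illustrative queries only; the last two are non-agri queries with agri terms.
sample = [
    ("Which fertilizer suits paddy in sandy soil?", "agri"),
    ("What is the capital of France?", "non-agri"),
    ("Crop the image to 200 pixels", "non-agri"),
    ("How do I harvest data from a website?", "non-agri"),
]

result = benchmark(keyword_stub, sample)
print(result)
```

Running the same `benchmark` call with the untrained and the trained classifier on an identical query list keeps the comparison fair. Tracking the leak rate separately from overall accuracy helps because the 50-100 non-agri queries that contain agricultural terms are the ones most likely to be misclassified.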