skfarhad / hazard_reporting_system

BSD 3-Clause "New" or "Revised" License

Issue#40 - Add SMS parser with autocorrection feature. #51

Closed hafiz-bs23 closed 2 months ago

hafiz-bs23 commented 2 months ago

Issue link: issue 40

Summary: Created a new service that parses an SMS string. It takes input in the form district#thana#area#message and returns JSON with the auto-corrected district name and thana name, plus separate area and message keys. pharser.pharse_full(message) parses the whole message; pharser.pharse_location(message) parses only the location.
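As a rough illustration of the input format only, here is a minimal sketch of the splitting step. The function name parse_sms and the output key names are assumptions made for this example, not the PR's actual code, and no auto-correction is applied here.

```python
import json


def parse_sms(message: str) -> dict:
    """Split 'district#thana#area#message' into a dict of stripped fields.

    Hypothetical helper: the name and the key names are assumptions.
    """
    # maxsplit=3 keeps any further '#' characters inside the message body.
    parts = [part.strip() for part in message.split("#", 3)]
    if len(parts) != 4:
        raise ValueError("expected input in the form district#thana#area#message")
    district, thana, area, text = parts
    return {"district": district, "thana": thana, "area": area, "message": text}


if __name__ == "__main__":
    print(json.dumps(parse_sms("Dhaka#Dhanmondi#Road 27#Water is rising")))
```

Limiting the split to three separators is a design choice in this sketch so that a '#' typed inside the free-text message does not break parsing.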

Prerequisites: spello needs to be added to requirements.txt.

Implementation brief: The spello library is used for auto-correction. It takes a list of keywords as input and builds a model that can be loaded and reused from memory. A district model corrects only the district name. Each district has its own model, which corrects the thana names under that district.
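The two-level lookup described above (correct the district first, then correct the thana against that district's own vocabulary) can be sketched as follows. To keep the example dependency-free, it uses difflib.get_close_matches from the standard library as a stand-in for the spello models, so it is not the PR's implementation. The sample vocabulary, the function names, and the 0.6 cutoff are assumptions.

```python
from difflib import get_close_matches

# Illustrative sample only; the PR ships its own district and thana data.
THANAS_BY_DISTRICT = {
    "dhaka": ["dhanmondi", "mirpur", "gulshan"],
    "sylhet": ["beanibazar", "golapganj", "kanaighat"],
}


def closest(word, vocabulary, cutoff=0.6):
    """Return the vocabulary entry most similar to word, or None if none is close."""
    matches = get_close_matches(word.strip().lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def correct_location(district, thana):
    """Correct the district first, then the thana within that district only."""
    fixed_district = closest(district, list(THANAS_BY_DISTRICT))
    if fixed_district is None:
        # Without a known district there is no thana vocabulary to search.
        return None, None
    fixed_thana = closest(thana, THANAS_BY_DISTRICT[fixed_district])
    return fixed_district, fixed_thana


if __name__ == "__main__":
    print(correct_location("Dhka", "Danmondi"))
```

Restricting the thana search to the corrected district keeps each vocabulary small, which reduces the chance of matching a similarly spelled thana in another district.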

The data used to train the models was collected from the BD Govt. Disaster Management site and is also included in this PR. In the future, any data source can be used.

Training a model is easy. Trained models are also included in the PR for plug and play. The data can be modified and the models retrained at any time.

Some tests are added as a proof of concept.

See the diagram below for a better understanding. [image]