Use Levenshtein edit distance, as implemented in the python-Levenshtein library, to map a value to the closest domain term when no exact match is found. The Levenshtein edit distance between two strings is the minimum number of single-character insertions, deletions, or substitutions (i.e., edit operations) needed to convert one string into the other. As an example, consider the inform utterance “I want a restaurant serving Spenish food”, in which Spenish is a misspelling of the food preference Spanish. Ignoring case, Spenish is close to Danish (substitute S with D, delete p, substitute e with a: distance 3), Polish (also distance 3), Swedish (distance 2), and Spanish (distance 1). Choosing the match with the smallest distance yields the correct value, Spanish, as the food preference. If two or more words share the smallest distance, you can choose among them at random. If no word is close (that is, at distance 3 or less), the keyword matching should fail and the system should re-ask the preference with a suitable error message. Note that there are several alternatives to the python-Levenshtein library. You may use these libraries (or copy a reference implementation directly, provided that you cite the source of the code in a comment), as long as they implement the default Levenshtein edit distance.
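The matching procedure described above could be sketched as follows. This is a minimal illustration, not a required implementation: the function names `levenshtein` and `closest_term` are hypothetical, the distance is computed with the textbook Wagner-Fischer dynamic-programming algorithm instead of a library call, and lowercasing both strings before comparison is an assumption. The default threshold of 3 follows the description above.

```python
import random


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the textbook Wagner-Fischer dynamic programme.

    Written for illustration; a library function implementing the default
    Levenshtein distance could be used instead.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def closest_term(word, domain_terms, max_distance=3):
    """Return the closest domain term, or None if none is close enough.

    Assumption: comparison is case-insensitive. Ties are broken at random.
    """
    if not domain_terms:
        return None
    word = word.lower()
    distances = {term: levenshtein(word, term.lower()) for term in domain_terms}
    best = min(distances.values())
    if best > max_distance:
        return None  # caller should re-ask the preference
    candidates = [term for term, dist in distances.items() if dist == best]
    return random.choice(candidates)


food_types = ["danish", "polish", "swedish", "spanish"]
print(closest_term("Spenish", food_types))
print(closest_term("xyzxyzxyz", food_types))
```

With the food types from the example, the first call selects "spanish" (distance 1), and the second returns `None` because every candidate is further than 3 edits away, which is the case where the system should re-ask the preference.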
System utterances can be generated from templates: either a simple string or a pattern into which variables can be inserted.
Examples:
String: "askpricerange": "Which price range do you prefer?"
Pattern: "confirmfoodtype": "I did not recognize {givenfoodtype}. Did you mean {correctedfoodtype}?"
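One possible way to realise such templates is a dictionary of format strings, filled in with `str.format`. The sketch below reuses the two example templates; the names `TEMPLATES` and `generate_utterance` are hypothetical and chosen only for illustration.

```python
# Template name -> utterance text. Plain strings have no placeholders;
# patterns contain {slot} placeholders that are filled in at generation time.
TEMPLATES = {
    "askpricerange": "Which price range do you prefer?",
    "confirmfoodtype": (
        "I did not recognize {givenfoodtype}. "
        "Did you mean {correctedfoodtype}?"
    ),
}


def generate_utterance(template_name, **slots):
    """Look up a template and substitute the given slot values."""
    return TEMPLATES[template_name].format(**slots)


print(generate_utterance("askpricerange"))
print(generate_utterance(
    "confirmfoodtype",
    givenfoodtype="Spenish",
    correctedfoodtype="Spanish",
))
```

Keeping the texts in one dictionary separates wording from dialog logic, so an utterance can be rephrased without touching the code that decides when to say it.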