An issue with scored lookups is that, instead of leveraging the hash map structure of the dictionary for constant-time entry retrieval, Terry now needs to search many (if not all) entry keys, score each by edit distance, and return the entry with the best score. If Terry has to scan every key in the hash map and compute an edit distance on each typo, instruction parsing will slow down considerably, though this scan could be parallelized.
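A rough sketch of what that scan could look like; the names `ScoredLookup` and `scoredLookup` are hypothetical, and `EditDistance.editDistance` refers to the Wagner-Fischer sketch shown further down:

```java
import java.util.Comparator;
import java.util.Map;

public class ScoredLookup {

    /**
     * Linear scan over every dictionary key, scoring each candidate by
     * edit distance to the (possibly misspelled) token and returning the
     * closest key. This trades the O(1) exact-match hash lookup for one
     * distance computation per key.
     */
    public static String scoredLookup(String token, Map<String, ?> dictionary) {
        String bestKey = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String key : dictionary.keySet()) {
            int distance = EditDistance.editDistance(token, key);
            if (distance < bestDistance) {
                bestDistance = distance;
                bestKey = key;
            }
        }
        return bestKey;
    }

    /** The scan is embarrassingly parallel, e.g. with a parallel stream. */
    public static String scoredLookupParallel(String token, Map<String, ?> dictionary) {
        return dictionary.keySet().parallelStream()
            .min(Comparator.comparingInt(
                (String key) -> EditDistance.editDistance(token, key)))
            .orElse(null);
    }
}
```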
I finished a first draft of an editDistance() method, currently used for dictionary lookups of initial language mappings during the instruction classification step.
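The actual draft isn't shown here, but a minimal sketch of such a method using the Wagner-Fischer algorithm (the algorithm named in the plan below) might look like this:

```java
public class EditDistance {

    /**
     * Levenshtein edit distance via Wagner-Fischer dynamic programming:
     * dp[i][j] is the minimum number of single-character insertions,
     * deletions, and substitutions turning the first i characters of a
     * into the first j characters of b.
     */
    public static int editDistance(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];

        // Base cases: distance to and from the empty prefix.
        for (int i = 0; i <= a.length(); i++) dp[i][0] = i;
        for (int j = 0; j <= b.length(); j++) dp[0][j] = j;

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int substitutionCost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                dp[i][j] = Math.min(
                    dp[i - 1][j - 1] + substitutionCost, // substitute or match
                    Math.min(dp[i - 1][j] + 1,           // delete from a
                             dp[i][j - 1] + 1));         // insert into a
            }
        }
        return dp[a.length()][b.length()];
    }
}
```

For example, `editDistance("teri", "terry")` is 2 (one insertion, one substitution), so that transcription typo would still score close to the intended word.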
The problem I see here is that resolving subsequent tokens against possible followers in a given set of mappings currently requires an exact match. In the future I should extend the use of edit distance to resolving a token against followers as well.
Use of edit distance for follower token resolution is now implemented.
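A sketch of what that resolution might look like, assuming each mapping exposes a set of allowed follower tokens; `resolveFollower` and the `maxDistance` cutoff are hypothetical, not the actual implementation:

```java
import java.util.Set;

public class FollowerResolution {

    /**
     * Resolve a transcribed token against the followers allowed by the
     * current mapping, picking the closest candidate by edit distance
     * instead of requiring an exact match. maxDistance caps how far a
     * token may be from every candidate before resolution fails.
     */
    public static String resolveFollower(String token, Set<String> followers, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String follower : followers) {
            int distance = EditDistance.editDistance(token, follower);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = follower;
            }
        }
        return best; // null when no follower is within maxDistance
    }
}
```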
Unfortunately, the trained deepspeech model I'm using for the scribe is not as accurate as I'd like, so when the instruction classifier parses tokens and tries to match them to dictionary entries, typos in the transcription make a perfect match unlikely. I think this is largely because the model was trained on casual phone conversation, which has a much larger and different vocabulary than what Terry would expect to hear in user instructions. As a result, very slight variations in enunciation can be transcribed as very different words, given the vast range of the output set.
To fix this issue of a generic and vast transcription vocabulary, I can either:

1. retrain or replace the speech model with one whose vocabulary is restricted to the instructions Terry expects, or
2. accept the faulty transcriptions and make dictionary lookups typo-tolerant by scoring entries on word edit distance.
I plan to take the second approach: working with the faulty transcriptions and creating scored dictionary lookups based on word edit distance. I'll use the Wagner-Fischer algorithm to compute the distances.
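To illustrate the intended behavior end to end, here's a hypothetical example built on the sketches above; the mini-vocabulary is invented, and real entries would map keys to language mappings rather than plain descriptions:

```java
import java.util.Map;

public class ScoredLookupDemo {
    public static void main(String[] args) {
        // Invented command vocabulary for illustration only.
        Map<String, String> dictionary = Map.of(
            "open", "open a target",
            "close", "close a target",
            "type", "type literal text");

        // A mis-transcription like "clothes" has edit distance 3 to
        // "close" but is farther from "open" and "type", so the scored
        // lookup still recovers the intended entry.
        System.out.println(ScoredLookup.scoredLookup("clothes", dictionary)); // close
    }
}
```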