This PR addresses issue #40: searching the models by the label associated with a concept ID rather than by the concept ID itself.
To allow for fast lookups over a fairly large table (~1GB, with 39,446,070 rows), the concept labels and IDs are inserted into a pygtrie `CharTrie`, which is then exposed as a global so that the `/autocomplete` endpoint can use it, much like it currently uses the vocabulary trie. Building this trie can take a long time, so the trie-loading code first looks for an existing pickled version of the trie at `data_folder / concept_trie.pkl` and, if it can't find it, generates and then pickles the trie to that location. On my M1 Mac Pro, generating the trie takes ~45 minutes, whereas loading it takes ~12 minutes.
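The build-or-load-from-pickle pattern described above can be sketched as follows. This is illustrative only: `data_folder`, `load_concept_rows()`, and the sample rows are hypothetical stand-ins, and a plain dict stands in for the pygtrie `CharTrie` so the sketch has no third-party dependencies.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical data folder; in the real code this is the app's data_folder.
data_folder = Path(tempfile.gettempdir()) / "concept_trie_demo"


def load_concept_rows():
    # Stand-in for reading (label, concept_id) rows from the ~39M-row table.
    return [("aspirin", "CHEBI:15365"), ("asparagine", "CHEBI:17196")]


def get_concept_trie(path=None):
    """Return the label->concept lookup, building and pickling it on a cache miss."""
    path = path or (data_folder / "concept_trie.pkl")
    if path.exists():
        # Cache hit: loading the pickle is much faster than rebuilding.
        with path.open("rb") as f:
            return pickle.load(f)
    # Cache miss: build the lookup. The real code builds a pygtrie.CharTrie,
    # which additionally supports efficient prefix queries for autocomplete.
    trie = {label.lower(): concept_id for label, concept_id in load_concept_rows()}
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(trie, f)
    return trie
```

Exposing the result of `get_concept_trie()` as a module-level global means the expensive build/load happens once at startup rather than per request.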
Since the /autocomplete endpoint now returns two types of results, vocabulary entries and concept map entries, the returned format has been changed from a list of matching terms to a dict with the following form:
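The exact response shape isn't reproduced here; purely as an illustration of the two-bucket idea, a response of the new form might look something like the following (all key names and entry shapes are hypothetical, not taken from the PR):

```python
# Hypothetical sketch of a /autocomplete response that separates the two
# result types; the real field names come from the PR, not this example.
response = {
    "vocab": [            # matches from the vocabulary trie (plain terms)
        "aspirin",
        "aspartame",
    ],
    "concept": [          # matches from the new concept-label trie
        {"concept_id": "CHEBI:15365", "label": "aspirin"},
    ],
}
```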
This PR depends on https://github.com/greenelab/word-lapse-models/pull/6.
Closes #40.