Open leezu opened 5 years ago
Another place where the script can improve: the ngram index should be maintained by the vocab too. Currently it is kept in a separate dictionary: https://github.com/dmlc/gluon-nlp/commit/f48e12cb42de447768daa6f8feec1c9a06995e62#diff-4cb70aead1f1e807229a07fa8bb17a6eR107
As discussed by @szhengac in https://github.com/dmlc/gluon-nlp/pull/529#discussion_r255815817, the classification script does not follow the paper: no word-ngram hashing is used.
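
For reference, a rough sketch of what word-ngram hashing could look like. This is not the script's code or any gluon-nlp API: `word_ngrams`, `hashed_ngram_ids`, `vocab_size` and `num_buckets` are hypothetical names, and CRC32 is only a stand-in for whatever hash function the paper's reference implementation uses. The idea is that each word n-gram is hashed into a fixed number of buckets, and the bucket ids are offset past the word ids so both can share one embedding table.

```python
import zlib


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams of length n as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hashed_ngram_ids(tokens, vocab_size, num_buckets, max_n=2):
    """Map word n-grams (n = 2..max_n) to embedding row indices.

    Assumption: rows [0, vocab_size) hold word embeddings and rows
    [vocab_size, vocab_size + num_buckets) hold hashed n-gram embeddings.
    """
    ids = []
    for n in range(2, max_n + 1):
        for gram in word_ngrams(tokens, n):
            key = " ".join(gram).encode("utf-8")
            # CRC32 is deterministic across runs, unlike Python's built-in
            # hash() on strings; it is a stand-in, not the paper's hash.
            bucket = zlib.crc32(key) % num_buckets
            ids.append(vocab_size + bucket)
    return ids


tokens = ["the", "movie", "was", "great"]
ngram_ids = hashed_ngram_ids(tokens, vocab_size=10000, num_buckets=2000000)
```

Different n-grams can collide in the same bucket; that is the usual trade-off of the hashing trick, since it bounds memory without having to store every n-gram explicitly.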