Archive the vocab size for use when creating the embedding layer. An intermittent issue can occur when the maximum vocab index isn't seen in the training data set: the embedding matrix is then sized from a smaller index, so the embedding vectorizer ends up with a larger vocab than the embedding matrix.
Current state
The embedding matrix derives its vocab size from the largest vocab index seen in the training set.
Future state
The embedding matrix takes its vocab size from the transformation pipeline, or from another location that the transformation pipeline sets explicitly.
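The difference between the two states can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the project's actual code: the toy `vocab`, the `train_sequences`, the JSON archive format, and the `build_embedding_matrix` helper are all hypothetical, and plain Python lists stand in for a real embedding layer.

```python
import json
import random

# Hypothetical vocabulary produced by the transformation pipeline.
# Index 0 is reserved for padding in this sketch.
vocab = {"<pad>": 0, "the": 1, "cat": 2, "sat": 3, "zebra": 4}

# Training sequences that happen never to contain "zebra" (index 4).
train_sequences = [[1, 2, 3], [1, 3, 0]]

# Current state: vocab size inferred from the largest index seen in training.
inferred_size = max(max(seq) for seq in train_sequences) + 1

# Future state: the pipeline archives its vocab size explicitly
# (a JSON string stands in for whatever archive format is chosen).
archived = json.dumps({"vocab_size": len(vocab)})
archived_size = json.loads(archived)["vocab_size"]


def build_embedding_matrix(num_rows, dim=4, seed=0):
    """Return a num_rows x dim matrix of small random floats."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(num_rows)]


too_small = build_embedding_matrix(inferred_size)
correct = build_embedding_matrix(archived_size)

# The vectorizer can still emit index 4 at inference time.
zebra_index = vocab["zebra"]
print(zebra_index < len(too_small))  # False: row 4 does not exist
print(zebra_index < len(correct))    # True: every vocab index has a row
```

With the inferred size, the matrix has four rows and index 4 has no row to look up. With the archived size, the matrix always matches the vectorizer regardless of which tokens appear in the training set.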