```python
bert = AlbertModel(albert_base_configuration)
bert = MultiTaskModel(bert,
                      num_vocab=max([m['token'] for m in token_maps.values()]),
                      num_tokens=config['model_params']['vocab_size'],
                      hidden_size=config['model_params']['hidden_size'])
```
When you take the maximum embedding id from the list [m['token'] for m in token_maps.values()], you probably need to add 1 to it.

Let me explain why I think so. Imagine we had 3 embedding ids, [0, 1, 2]: the length of this list is 3, but its maximum is 2. As far as I can tell, num_vocab should be 3, not 2, so len(list_of_embedding_ids), or equivalently max(list_of_embedding_ids) + 1, should be preferred over max(list_of_embedding_ids).
see: https://github.com/yl4579/PL-BERT/blob/main/train.ipynb
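Here is a minimal sketch of the off-by-one issue, using a hypothetical token_maps with three entries (the real one comes from the notebook above); the 'token' field is assumed to hold the embedding id:

```python
import torch.nn as nn

# Hypothetical token map with three embedding ids: 0, 1, and 2.
token_maps = {'a': {'token': 0}, 'b': {'token': 1}, 'c': {'token': 2}}
token_ids = [m['token'] for m in token_maps.values()]

# max(token_ids) == 2, but the embedding table needs 3 rows to index 0..2.
num_vocab = 1 + max(token_ids)   # == len(token_ids) when ids are contiguous from 0

embedding = nn.Embedding(num_vocab, 768)  # 3 rows; id 2 now indexes the last row
```

With num_vocab = max(token_ids) alone, looking up the largest id would fall outside the embedding table and raise an index-out-of-range error.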