Closed KevinEloff closed 3 years ago
We should have a test at the beginning of the function to be sure that all sequences have the same length. If the sequences have different lengths, the function should raise an error.
Checks added to the `compute_embeddings` function. Now, when `pool_mode` is `full`, all items in `sequence_list` must be the same length.
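For anyone reading along, a check of this kind could look roughly like the sketch below. This is not the code merged in this PR: `check_equal_lengths` is a hypothetical helper name, and it assumes `pool_mode` is a list of mode names and `sequence_list` is a list of strings.

```python
def check_equal_lengths(sequence_list, pool_mode):
    """Raise ValueError if "full" pooling is requested for unequal-length sequences.

    Hypothetical helper for illustration; not the actual check in compute_embeddings.
    """
    if "full" not in pool_mode:
        # Reduced modes (mean, cls, max, ...) yield one vector per sequence,
        # so differing lengths are not a problem there.
        return
    lengths = {len(seq) for seq in sequence_list}
    if len(lengths) > 1:
        raise ValueError(
            f'pool_mode "full" requires sequences of equal length, '
            f"got lengths {sorted(lengths)}"
        )
```

Calling this at the top of the function means a length mismatch fails fast with a readable message, before any embedding is computed.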
Add option to select "full" as a `pool_mode` for ProtBert embeddings. The "full" option returns the full sequence of embeddings, rather than a reduced version using mean, cls, max, etc. The returned shape when using `pool_mode=["full"]` is `(num_seqs, seq_size, emb_size)`.
Note: currently, when using `full`, all sequences must be the same length.
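To illustrate the shape difference between "full" and the reduced modes, here is a minimal pure-Python sketch on toy data. It is not the ProtBert code path: `pool` is a hypothetical function, the embeddings are made-up numbers, and it assumes the CLS token sits at position 0. Keeping every token vector only gives a rectangular `(num_seqs, seq_size, emb_size)` result when each sequence has the same `seq_size`, which is presumably why the equal-length restriction exists.

```python
def pool(token_embeddings, mode):
    """Pool one sequence's per-token embeddings, given as (seq_size, emb_size) nested lists."""
    if mode == "full":
        return token_embeddings  # keep every token vector: (seq_size, emb_size)
    if mode == "cls":
        return token_embeddings[0]  # assumption: CLS token is the first position
    columns = list(zip(*token_embeddings))  # transpose to (emb_size, seq_size)
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pool mode: {mode}")


# Toy batch: 2 sequences, 3 tokens each, 2 embedding dimensions.
batch = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 1.0], [2.0, 3.0], [4.0, 8.0]],
]

full = [pool(seq, "full") for seq in batch]  # (num_seqs, seq_size, emb_size) = (2, 3, 2)
mean = [pool(seq, "mean") for seq in batch]  # (num_seqs, emb_size) = (2, 2)
```

With sequences of different lengths, `full` would become a ragged list that cannot be stacked into a single array, whereas `mean` would still be `(num_seqs, emb_size)`.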