Open · Famara72 opened 7 months ago
Preferably this would also support different sequence lengths, i.e. evaluating sequences of the form x_1, y_1, ..., x_{i-1}, y_{i-1}, x_i* for all 0 < i < n at the same time.
This seems like it would be greatly helped by KV caching, which the models imported from Hugging Face's transformers library should already provide (e.g. via `past_key_values`).
To speed up some of the evaluation code, it would be ideal if we were able to evaluate identical prefixes of the in-context examples simultaneously. That is, if a context model is given multiple sequences of the form x_1, y_1, ..., x_n, y_n, x_{n+1,j}, it should only need to process the shared initial data once.
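A minimal sketch of the idea, using a toy stand-in for the context model (the `step`/`eval_*` names are hypothetical, not the repo's actual API): the shared prefix is processed once and its state is cached, playing the role that `past_key_values` would play for a transformer.

```python
def step(state, token):
    # Stand-in for one (expensive) forward step of the context model:
    # fold a token into the running state.
    return (state * 31 + token) % 1_000_003

def eval_naive(prefix, queries):
    # Baseline: re-processes the shared prefix for every query x_{n+1,j}.
    outs = []
    for q in queries:
        s = 0
        for t in prefix:
            s = step(s, t)
        outs.append(step(s, q))
    return outs

def eval_cached(prefix, queries):
    # Proposed behavior: process the shared prefix x_1, y_1, ..., x_n, y_n
    # once (the KV-cache analogue), then evaluate each query from the
    # cached state.
    s = 0
    for t in prefix:
        s = step(s, t)
    return [step(s, q) for q in queries]
```

With a prefix of length 2n and m queries, the cached version does 2n + m model steps instead of m(2n + 1), while producing identical outputs; for a real transformer the same saving would come from reusing the prefix's attention keys/values across queries.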