Closed FlyingPumba closed 1 week ago
There are some checks failing (Compatibility) in the PR, but they seem unrelated to these changes.
The failures do appear to be related to the changes unfortunately. Specifically, it has something to do with your typing. I would check your use of the pipe (|
) on different versions of python. On python 3.8, the support is basically non-existent. For 3.9, it's not quite as dramatic, but there are still certain uses that were not supported. We need to at least support 3.9, but I do not mind dropping support for 3.8. See if you can get it to run on 3.9, and I will take a look into dropping support for 3.8.
This was indeed my mistake. It didn't occur to me that the union typing using pipes (e.g., A | B
) would have limited support in Python 3.9. I have trivially changed them to use Union
(as the rest of the project does). The compatibility checks should now work for both Python 3.8 and 3.9
Great! A very important question for you before this gets merged. I see you are changing the typing for the shape of the tensor returned from accumulated_bias
. What was the reason for that? If people are writing code expecting the current shape returned from that function, then changing it could break their code, thus requiring a major release. I see that you didn't actually change the internals of the function. Is this a case of that return type being incorrect before you revised the test?
The typing was indeed incorrect for the accumulated_bias
method. You can see here that the initial shape for the returning tensor is d_model
, and then the method proceeds to just sum the biases, which are all d_model
size. Also, the "Returns" in the docstring for the method says that the output is a Tensor of shape [d_model]
. I found this incorrect type when I used accumulated_bias
in one of the new tests.
Description
I've added more tests for ActivationCache, increasing coverage from 64% to 95%. In the process I also fixed some minor bugs arising in border cases.
Type of change
Please delete options that are not relevant.
Screenshots
Checklist: