-
Could you provide some training and inference examples for RetNet?
-
https://github.com/Jamie-Stirling/RetNet/blob/2acf026fc8435635051149d9bef793cae7f3d7af/src/retention.py#L45
Q and K are put onto any device because they are model parameters, while D is created in …
-
Thanks for the well-written package! The official RetNet implementation has had several updates; see the changelog at https://github.com/microsoft/unilm/blob/master/retnet/README.md#changelog.
-
https://github.com/fkodom/yet-another-retnet/blob/ee3979c7535b9f79a3020cb098d6b97f143bcd22/yet_another_retnet/retention.py#L16
I think this line should be `F.silu` rather than `F.relu`.
Thanks for r…
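For context on why the choice matters, here is a minimal pure-Python sketch of the two activations. The function names `relu` and `silu` are illustrative stand-ins for `F.relu` and `F.silu`, not code from the linked repository, and the sketch takes no position on which one the linked line should use. It only shows that the two agree closely for large positive inputs but differ for negative ones, where ReLU returns zero and SiLU returns a small negative value.

```python
import math


def relu(x: float) -> float:
    """ReLU: max(0, x)."""
    return max(0.0, x)


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x) = x / (1 + exp(-x))."""
    return x / (1.0 + math.exp(-x))


# Compare the two on a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):+.4f}  silu={silu(x):+.4f}")
```

Because the outputs differ for every negative input, swapping one for the other changes the model's outputs, so a checkpoint trained with one activation would not reproduce its results with the other.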
-
The implementation of the chunkwise retention paradigm on the [chunkwise-real](/Jamie-Stirling/RetNet/tree/chunkwise-real) branch gives different outputs from the other two paradigms.
It appears there ma…
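One way to narrow down a mismatch like this is to compare the three paradigms on a stripped-down reference. The following is a minimal NumPy sketch, not the code from the branch above: it assumes a single head with a scalar decay `gamma`, and it omits the rotation (xPos), scaling, and group normalization steps, so it only checks the core decay and state bookkeeping. All function names are hypothetical.

```python
import numpy as np


def parallel_retention(q, k, v, gamma):
    """O = (Q K^T * D) V with D[n, m] = gamma^(n-m) for n >= m, else 0."""
    idx = np.arange(q.shape[0])
    diff = idx[:, None] - idx[None, :]
    decay = np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)
    return (q @ k.T * decay) @ v


def recurrent_retention(q, k, v, gamma):
    """S_n = gamma * S_{n-1} + K_n^T V_n, then O_n = Q_n S_n."""
    state = np.zeros((k.shape[1], v.shape[1]))
    outputs = []
    for qn, kn, vn in zip(q, k, v):
        state = gamma * state + np.outer(kn, vn)
        outputs.append(qn @ state)
    return np.stack(outputs)


def chunkwise_retention(q, k, v, gamma, chunk_size):
    """Parallel retention inside each chunk plus a decayed cross-chunk state."""
    state = np.zeros((k.shape[1], v.shape[1]))
    outputs = []
    for start in range(0, q.shape[0], chunk_size):
        qc = q[start:start + chunk_size]
        kc = k[start:start + chunk_size]
        vc = v[start:start + chunk_size]
        b = qc.shape[0]  # the last chunk may be shorter than chunk_size
        pos = np.arange(b)
        inner = parallel_retention(qc, kc, vc, gamma)
        # Position j in the chunk sees the previous state decayed by gamma^(j+1).
        cross = (qc @ state) * (gamma ** (pos + 1))[:, None]
        outputs.append(inner + cross)
        # Carry the state to the end of this chunk.
        state = gamma ** b * state + kc.T @ (vc * (gamma ** (b - 1 - pos))[:, None])
    return np.concatenate(outputs)


rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((7, 4)) for _ in range(3))
reference = parallel_retention(q, k, v, gamma=0.9)
assert np.allclose(reference, recurrent_retention(q, k, v, gamma=0.9))
assert np.allclose(reference, chunkwise_retention(q, k, v, gamma=0.9, chunk_size=3))
```

If a reference like this agrees across all three forms while the full implementation does not, the discrepancy more likely sits in the parts left out here, such as the per-chunk position offsets or the normalization, than in the decay exponents themselves.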
-
In the code, when `is_first_step` is `True`, `activate_recurrent` is set to `False` here:
https://github.com/microsoft/torchscale/blob/main/torchscale/architecture/retnet.py#L362
I was wonderin…
-
Hi there~,
Thanks for your great work on RetNet. I ran into a problem when trying to define "incremental_state".
Could you provide a usage example or explain it in more detail?
Thanks,
Best regards.
-
Hey,
Thank you for this great work!
An error occurred when I used the model to generate text.
-
Hi @fkodom ,
Thank you so much for sharing this work with the research community.
I have one question: I measured inference throughput, and it seems that the parallel method has …
-
I trained a model using `train.py` and got the checkpoint folder. How do I load this model for inference?