PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
Bug Fixes:
Fixed an issue with handling variable-length sampled sequences during inference in the MoDInfiniTransformer class. The forward and forward_ methods now properly pad sequences to the longest length in the batch and apply a padding mask to zero out padded positions.
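For reference, a minimal sketch of the padding-and-masking idea; the helper name, tensor shapes, and example values below are illustrative assumptions, not the repository's actual implementation.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_and_mask(sampled_seqs):
    """Pad a list of variable-length (len_i, dim) tensors to the longest
    length in the batch and build a mask that zeroes out padded positions.

    Illustrative helper only -- names and shapes are assumptions, not the
    repository's actual API.
    """
    lengths = torch.tensor([s.size(0) for s in sampled_seqs])
    # (batch, max_len, dim); padded positions are filled with zeros
    padded = pad_sequence(sampled_seqs, batch_first=True)
    # (batch, max_len) boolean mask: True where a real token is present
    mask = torch.arange(padded.size(1)).unsqueeze(0) < lengths.unsqueeze(1)
    # Explicitly zero out padded positions (e.g. after a projection)
    padded = padded * mask.unsqueeze(-1)
    return padded, mask

# Example: three sequences of lengths 5, 3, and 7 with hidden size 16
seqs = [torch.randn(5, 16), torch.randn(3, 16), torch.randn(7, 16)]
padded, mask = pad_and_mask(seqs)
print(padded.shape, mask.shape)  # torch.Size([3, 7, 16]) torch.Size([3, 7])
```

Zeroing the padded positions keeps them from contributing to any downstream sums or attention scores computed over the batch.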
Enhancements:
Restructured the project directory to follow a more standard Python package layout (see the __init__.py sketch after this list).
Added __init__.py files to make the infini_transformer, examples, and tests folders importable as packages.
Updated import statements to match the new project structure.
Improved documentation in the README.md file.
Added support for additional activation functions in InfiniTransformer and MoDInfiniTransformer (see the sketch after this list).
Implemented padding handling for variable-length sequences in MoDInfiniTransformer.
Created requirements.txt to list project dependencies.
Added MANIFEST.in to include necessary files in the package distribution.
Fixed test discovery issues by enforcing standard test naming conventions (see the example after this list).
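As a rough illustration of the package layout change, an __init__.py can re-export the public classes so they are importable directly from the package; the module name transformer below is an assumption and may not match the repository's actual layout.

```python
# infini_transformer/__init__.py -- illustrative sketch only; the actual
# module names and re-exports in the repository may differ.
from .transformer import InfiniTransformer, MoDInfiniTransformer

__all__ = ["InfiniTransformer", "MoDInfiniTransformer"]
```

With a re-export like this in place, user code can write `from infini_transformer import InfiniTransformer` instead of reaching into individual modules.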
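A hedged sketch of how configurable activation support is commonly wired up; the mapping, function name, and the set of supported activations below are assumptions rather than the repository's actual API.

```python
import torch.nn as nn

# Hypothetical mapping from string names to activation modules; the actual
# set of activations supported by InfiniTransformer may differ.
ACTIVATIONS = {
    "relu": nn.ReLU,
    "gelu": nn.GELU,
    "silu": nn.SiLU,
    "elu": nn.ELU,
}

def build_activation(name: str) -> nn.Module:
    """Return an activation module by name; raise on unsupported names."""
    try:
        return ACTIVATIONS[name.lower()]()
    except KeyError:
        raise ValueError(f"Unsupported activation: {name!r}") from None

# Usage sketch: an MLP block parameterized by an activation name
mlp = nn.Sequential(nn.Linear(512, 2048), build_activation("silu"), nn.Linear(2048, 512))
```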
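On the test discovery fix: both unittest discovery and pytest pick up files named test_*.py containing classes and functions prefixed with test by default. A minimal illustrative example (the file name and assertion are placeholders, not tests from the repository):

```python
# tests/test_shapes.py -- placeholder name following the test_*.py convention
import unittest

import torch


class TestPadding(unittest.TestCase):
    def test_mask_shape(self):
        mask = torch.ones(2, 7, dtype=torch.bool)
        self.assertEqual(tuple(mask.shape), (2, 7))


if __name__ == "__main__":
    unittest.main()
```

Such files are found automatically by `python -m unittest discover tests` or by running `pytest` from the project root.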
Please review the changes and provide any feedback or suggestions. I've tested the modifications locally, but it would be great to have additional eyes on the code before merging.
Thanks, Mudit B.