PyxLSTM


PyxLSTM is a Python library that provides an efficient and extensible implementation of the Extended Long Short-Term Memory (xLSTM) architecture based on the research paper "xLSTM: Extended Long Short-Term Memory" by Beck et al. (2024). xLSTM enhances the traditional LSTM by introducing exponential gating, memory mixing, and a matrix memory structure, enabling improved performance and scalability for sequence modeling tasks.
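For intuition, here is a minimal sketch of the two core mechanisms the paper introduces: stabilized exponential gating (used in the sLSTM cell) and the matrix memory update (used in the mLSTM cell). This is an illustrative re-derivation of the recurrences from the paper, not PyxLSTM's internal code; all function and variable names are made up for the example.

import torch

# Stabilized exponential gating (sLSTM): gates use exp() instead of sigmoid,
# with a running max state m so the exponentials never overflow.
def stabilized_exp_gates(i_tilde, f_tilde, m_prev):
    m = torch.maximum(f_tilde + m_prev, i_tilde)   # new stabilizer state
    i_gate = torch.exp(i_tilde - m)                # stabilized input gate
    f_gate = torch.exp(f_tilde + m_prev - m)      # stabilized forget gate
    return i_gate, f_gate, m

# Matrix memory update (mLSTM): the cell state is a d x d matrix C that
# accumulates key-value outer products and is queried at retrieval time.
def matrix_memory_step(C_prev, n_prev, q, k, v, i_gate, f_gate):
    C = f_gate * C_prev + i_gate * torch.outer(v, k)       # C_t = f*C + i*(v k^T)
    n = f_gate * n_prev + i_gate * k                       # normalizer state
    h = (C @ q) / torch.clamp(torch.abs(n @ q), min=1.0)   # normalized retrieval
    return h, C, n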

Table of Contents

Installation
Usage
Code Directory Structure
Running and Testing the Codebase
Documentation
Citation
Contributing
License
Acknowledgements
Contact

Installation

To install PyxLSTM, you can use pip:

pip install PyxLSTM

Development Installation

For a development installation that includes the testing dependencies (the quotes prevent shell globbing in some shells, such as zsh):

pip install "PyxLSTM[dev]"

Alternatively, you can clone the repository and install it manually:

git clone https://github.com/muditbhargava66/PyxLSTM.git
cd PyxLSTM
pip install -r requirements.txt
pip install -e .
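After installing, a quick smoke test confirms the package imports. The module path follows the usage example below; treat it as an assumption about the installed version:

# Should print the model class without raising ImportError
from xLSTM.model import xLSTM
print(xLSTM)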

Usage

Here's a basic example of how to use PyxLSTM for language modeling:

import torch
from xLSTM.model import xLSTM
from xLSTM.data import LanguageModelingDataset, Tokenizer
from xLSTM.utils import load_config, set_seed, get_device
from xLSTM.training import train  # assumes a train() helper in the training module

# Load configuration
config = load_config("path/to/config.yaml")
set_seed(config.seed)
device = get_device()

# Initialize tokenizer and dataset
tokenizer = Tokenizer(config.vocab_file)
train_dataset = LanguageModelingDataset(config.train_data, tokenizer, config.max_length)

# Create xLSTM model
model = xLSTM(len(tokenizer), config.embedding_size, config.hidden_size,
              config.num_layers, config.num_blocks, config.dropout,
              config.bidirectional, config.lstm_type)
model.to(device)

# Train the model
optimizer = torch.optim.Adam(model.parameters(), lr=config.learning_rate)
criterion = torch.nn.CrossEntropyLoss(ignore_index=tokenizer.pad_token_id)
train(model, train_dataset, optimizer, criterion, config, device)
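The exact schema of config.yaml is defined by your project; a minimal file covering every field the snippet above reads might look like this (all values are illustrative):

seed: 42
vocab_file: data/vocab.txt
train_data: data/train.txt
max_length: 128
embedding_size: 128
hidden_size: 256
num_layers: 2
num_blocks: 4
dropout: 0.1
bidirectional: false
lstm_type: slstm   # or mlstm
learning_rate: 0.001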

For more detailed usage instructions and examples, please refer to the documentation at https://pyxlstm.readthedocs.io/.
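As a quick sanity check before launching a full training run, you can verify output shapes on a dummy batch, in the spirit of examples/xLSTM_shape_verification.py. Sizes here are illustrative, and the positional constructor arguments mirror the usage example above:

import torch
from xLSTM.model import xLSTM

vocab_size, batch, seq_len = 1000, 4, 32
# Arguments follow the usage example: vocab size, embedding size, hidden size,
# num layers, num blocks, dropout, bidirectional, lstm type.
model = xLSTM(vocab_size, 64, 128, 1, 2, 0.0, False, "slstm")
tokens = torch.randint(0, vocab_size, (batch, seq_len))
output = model(tokens)
# For language modeling we expect per-token logits over the vocabulary,
# e.g. torch.Size([4, 32, 1000]); the model may also return extra state.
print(getattr(output, "shape", type(output)))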

Code Directory Structure

PyxLSTM/
│
├── xLSTM/
│   ├── __init__.py
│   ├── slstm.py
│   ├── mlstm.py
│   ├── block.py
│   └── model.py
│
├── utils/
│   ├── config.py
│   ├── logging.py
│   └── utils.py
│
├── tests/
│   ├── test_slstm.py  
│   ├── test_mlstm.py
│   ├── test_block.py
│   └── test_model.py
│
├── docs/
│   ├── slstm.md
│   ├── mlstm.md
│   └── training.md
│
├── examples/
│   ├── language_modeling.py
│   └── xLSTM_shape_verification.py
│
├── .gitignore
├── pyproject.toml
├── MANIFEST.in
├── requirements.txt
├── README.md
└── LICENSE

Running and Testing the Codebase

To run and test the PyxLSTM codebase, follow these steps:

  1. Clone the PyxLSTM repository:

    git clone https://github.com/muditbhargava66/PyxLSTM.git
  2. Navigate to the cloned directory:

    cd PyxLSTM
  3. Install the required dependencies:

    pip install -r requirements.txt
  4. Run the unit tests:

    python -m unittest discover tests

    This command will run all the unit tests located in the tests directory. It will execute the test files test_slstm.py, test_mlstm.py, test_block.py, and test_model.py.
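If you installed the development extras and pytest is available, it also collects unittest-style test cases, so the suite can alternatively be run with:

    pytest tests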

If you encounter any issues or have further questions, please refer to the PyxLSTM documentation or reach out to the maintainers for assistance.

Documentation

The documentation for PyxLSTM can be found in the docs directory. It provides detailed information about the library's components, usage guidelines, and examples.

Citation

If you use PyxLSTM in your research or projects, please cite the original xLSTM paper:

@article{Beck2024xLSTM,
  title={xLSTM: Extended Long Short-Term Memory},
  author={Beck, Maximilian and Pöppel, Korbinian and Spanring, Markus and Auer, Andreas and Prudnikova, Oleksandra and Kopp, Michael and Klambauer, Günter and Brandstetter, Johannes and Hochreiter, Sepp},
  journal={arXiv preprint arXiv:2405.04517},
  year={2024}
}

Paper link: https://arxiv.org/abs/2405.04517

Contributing

Contributions to PyxLSTM are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.

License

PyxLSTM is released under the MIT License. See the LICENSE file for more information.

Acknowledgements

We would like to acknowledge the original authors of the xLSTM architecture for their valuable research and contributions to the field of sequence modeling.

Contact

For any questions or inquiries, please contact the project maintainer through the GitHub repository.

We hope you find PyxLSTM useful for your sequence modeling projects!

