learning-at-home / lean_transformer

Memory-efficient transformer. Work in progress.
MIT License

Multi-head attention variant that loops over the batch dimension to reduce memory usage #10

Open krunt opened 2 years ago

krunt commented 2 years ago

The batch_step input parameter controls how many batch elements each loop iteration processes (issue #7).
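
For readers unfamiliar with the idea, here is a minimal NumPy sketch of attention computed in chunks along the batch dimension. It is an illustration only, not the code in this PR: the function names `attention` and `attention_batch_loop` and the `[batch, heads, seq, head_dim]` tensor layout are assumptions, and only the `batch_step` parameter name comes from the description above. The point is that the attention score matrix is materialized for at most `batch_step` batch elements at a time rather than for the whole batch.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # q, k, v: [batch, heads, seq, head_dim] (assumed layout).
    # scores has shape [batch, heads, seq_q, seq_k]; this is the large buffer.
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


def attention_batch_loop(q, k, v, batch_step=1):
    # Hypothetical helper: same result as attention(q, k, v), but the scores
    # buffer only ever covers `batch_step` batch elements at once.
    batch, heads, seq_q, _ = q.shape
    out = np.empty((batch, heads, seq_q, v.shape[-1]), dtype=q.dtype)
    for start in range(0, batch, batch_step):
        chunk = slice(start, start + batch_step)  # last chunk may be shorter
        out[chunk] = attention(q[chunk], k[chunk], v[chunk])
    return out
```

With a batch of size B, the score buffer shrinks from B x heads x seq x seq elements to batch_step x heads x seq x seq, at the cost of running the loop sequentially. Setting batch_step equal to the batch size recovers the unchunked computation.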

krunt commented 2 years ago

@justheuristic, please take a look, updated per review

justheuristic commented 2 years ago

[will merge today, sorry for the delay]