davmacario / MDI-LLM

Implementation of Model-Distributed Inference for Large Language Models, built on top of LitGPT
MIT License

Minor updates #14

Closed davmacario closed 7 months ago

davmacario commented 7 months ago

Just syncing the branches. Changes:

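Added weight tying to the starter node model, since it now contains both the token embedding and the final linear layer. The two layers share one weight matrix and perform the same mapping in opposite directions: the embedding goes from token IDs to vectors, and the final linear layer goes from vectors back to vocabulary logits.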

Added a README disclaimer: this repo is a work in progress.
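
For reference, here is a minimal, framework-free sketch of what weight tying means for the starter node. It is not the code from this PR: `TiedStarter`, `embed`, and `lm_head` are hypothetical names, and the tiny matrix is made up for illustration. In PyTorch the same effect is usually obtained by making the final linear layer's weight the same parameter object as the embedding's weight.

```python
# Hypothetical illustration of weight tying, using plain Python lists.
# Names and values are assumptions, not taken from the MDI-LLM code base.
class TiedStarter:
    def __init__(self, weight):
        # A single (vocab_size x n_embd) matrix shared by both layers.
        self.weight = weight

    def embed(self, token_id):
        # Token embedding: token ID -> vector (row lookup).
        return self.weight[token_id]

    def lm_head(self, hidden):
        # Final linear layer without bias: vector -> one logit per vocabulary entry.
        return [sum(w * h for w, h in zip(row, hidden)) for row in self.weight]


weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # vocab_size = 3, n_embd = 2
model = TiedStarter(weight)

vec = model.embed(2)         # row 2 of the shared matrix
logits = model.lm_head(vec)  # dot product of vec with every row

# Because the matrix is shared, one update is visible in both directions.
model.weight[0][0] = 2.0
updated_vec = model.embed(0)
updated_logits = model.lm_head([1.0, 1.0])
```

Since both layers read the same matrix, the starter node stores the vocabulary-sized weights only once, and any update to them affects the embedding and the output projection together.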