cli99 / llm-analysis

Latency and Memory Analysis of Transformer Models for Training and Inference
Apache License 2.0
343 stars 40 forks

Fix memory estimation #25

Closed cli99 closed 4 months ago

cli99 commented 4 months ago

This PR improves memory estimation: it accounts for the memory used by unsharded weights and by prefetching, and fixes several bugs, among other changes.
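For readers unfamiliar with the terms, here is a minimal sketch of how unsharded-weight and prefetch memory can add to a per-device weight estimate when weights are sharded across devices. It is not the code in this PR: the function name, its parameters, and the assumption that exactly one layer is gathered for compute while `prefetch_layers` more are gathered ahead of time are all hypothetical and chosen only for illustration.

```python
def estimate_weight_memory_bytes(
    num_layers: int,
    params_per_layer: int,
    bytes_per_param: int,
    shard_degree: int,
    prefetch_layers: int = 1,
) -> dict:
    """Rough per-device weight memory under weight sharding (hypothetical helper)."""
    layer_bytes = params_per_layer * bytes_per_param

    # Persistent part: each device keeps 1/shard_degree of every layer.
    sharded = num_layers * layer_bytes // shard_degree

    if shard_degree == 1:
        # Nothing is sharded, so nothing has to be gathered or prefetched.
        return {"sharded": sharded, "unsharded": 0, "prefetch": 0, "total": sharded}

    # Assumption: the layer being computed is gathered into a full-size buffer.
    unsharded = layer_bytes

    # Assumption: up to `prefetch_layers` further layers are gathered in advance,
    # bounded by the number of other layers that exist.
    prefetch = min(prefetch_layers, num_layers - 1) * layer_bytes

    total = sharded + unsharded + prefetch
    return {
        "sharded": sharded,
        "unsharded": unsharded,
        "prefetch": prefetch,
        "total": total,
    }


if __name__ == "__main__":
    # Toy numbers only; they do not describe any real model.
    print(estimate_weight_memory_bytes(4, 1000, 2, shard_degree=4))
```

The point of the sketch is that an estimate counting only the sharded term can undercount peak memory, since the gathered and prefetched layers are resident at the same time as the local shards.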