rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
https://docs.rs/llm/latest/llm/
Apache License 2.0
6.08k stars, 362 forks

Update gpt2 to use wte if no lm_head #362

Closed by steventrouble 1 year ago

steventrouble commented 1 year ago

Closes #338

Based on https://github.com/rustformers/llm/pull/343

Fixes the segfault mentioned in that bug and adds a check that could help catch those errors sooner.
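For readers skimming the thread, here is a minimal sketch of the fallback idea, not the actual diff. GPT-2 checkpoints commonly tie the output projection to the token embedding, so a file may ship without a separate `lm_head` tensor, and `wte` can stand in for it. Everything named below (`Tensor`, `LoadError`, `resolve_lm_head`, the `HashMap` of tensors) is hypothetical and not the `llm` crate's API. The shape check is only an assumption about what kind of check could surface a bad tensor at load time instead of as a crash later.

```rust
use std::collections::HashMap;

/// Toy 2-D tensor (hypothetical; stands in for the real tensor type).
#[derive(Debug, Clone, PartialEq)]
pub struct Tensor {
    pub rows: usize,
    pub cols: usize,
    pub data: Vec<f32>,
}

/// Hypothetical load-time errors.
#[derive(Debug, PartialEq)]
pub enum LoadError {
    MissingTensor(&'static str),
    ShapeMismatch {
        name: &'static str,
        expected: (usize, usize),
        found: (usize, usize),
    },
}

/// Pick the output projection: use `lm_head` if the file has one,
/// otherwise reuse the token embedding `wte` (tied weights).
pub fn resolve_lm_head(
    tensors: &HashMap<String, Tensor>,
    n_vocab: usize,
    n_embd: usize,
) -> Result<&Tensor, LoadError> {
    // `wte` is required either way, so fail early if it is absent.
    let wte = tensors.get("wte").ok_or(LoadError::MissingTensor("wte"))?;

    // `lm_head` is optional; fall back to `wte` when it is missing.
    let (name, head) = match tensors.get("lm_head") {
        Some(t) => ("lm_head", t),
        None => ("wte", wte),
    };

    // Assumed sanity check: report a wrong shape as an error here
    // rather than letting it cause an out-of-bounds access later.
    let expected = (n_vocab, n_embd);
    let found = (head.rows, head.cols);
    if found != expected {
        return Err(LoadError::ShapeMismatch { name, expected, found });
    }
    Ok(head)
}
```

With only `wte` present, `resolve_lm_head(&tensors, n_vocab, n_embd)` returns a reference to `wte`; a mismatched shape comes back as `LoadError::ShapeMismatch` instead of a crash.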

philpax commented 1 year ago

Looks good! Can you add the comments from the original PR about why the tensor's optional and why we substitute with wte?

steventrouble commented 1 year ago

👍 done, thanks!

philpax commented 1 year ago

Brilliant, thanks 🚀