jquesnelle / yarn

YaRN: Efficient Context Window Extension of Large Language Models
MIT License

Can it be debugged with DeepSpeed + Trainer? #26

Closed: cableyang closed this issue 8 months ago

cableyang commented 9 months ago

1. accelerate needs more configuration than DeepSpeed with the Trainer. Can training be run in DeepSpeed mode instead?
2. Ecosystem: more LLM learners use FastChat. Can this be reproduced in https://github.com/lm-sys/FastChat?
3. This position embedding method needs more open-source developers to investigate it further.

cableyang commented 8 months ago

No answer received, so closing.