jquesnelle / yarn

YaRN: Efficient Context Window Extension of Large Language Models
MIT License
1.32k stars, 115 forks

Best datasets to use for finetuning? #5

Open StrangeTcy opened 1 year ago

StrangeTcy commented 1 year ago

I suppose the best choice of dataset would depend on the specific RoPE variant being used, but have you run any experiments comparing finetuning datasets?