opening-up-chatgpt / opening-up-chatgpt.github.io

Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.
https://opening-up-chatgpt.github.io/

Add Orca if it ever releases #47

timjzee opened this issue 11 months ago

timjzee commented 11 months ago

Whitepaper: https://arxiv.org/pdf/2306.02707.pdf

Will be released here: https://aka.ms/orca-lm

Summary: https://www.youtube.com/watch?v=Dt_UNg7Mchg

mdingemanse commented 11 months ago

Interesting find. And more evidence of the growing importance of synthetic instruction-tuning data.
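
For context, the Orca paper describes this data as ⟨system message, user query, teacher response⟩ triples, where the teacher model (GPT-4) is prompted to produce step-by-step explanations ("explanation tuning"). A minimal sketch of what one such record might look like; the field names are illustrative, not from any released dataset:

```python
# Illustrative sketch of an Orca-style "explanation tuning" record
# (cf. arXiv:2306.02707). Field names are our own; no Orca dataset
# has been released that would fix an actual schema.
record = {
    # System message steering the teacher toward showing its reasoning.
    "system_message": (
        "You are a helpful assistant. Think step by step and justify "
        "your answer."
    ),
    # User query sampled from an existing instruction collection (the
    # paper draws on the FLAN-v2 collection).
    "user_query": (
        "If a train travels 60 km in 45 minutes, what is its average "
        "speed in km/h?"
    ),
    # Teacher (GPT-4) response, including the reasoning trace that the
    # smaller student model is then tuned to imitate.
    "teacher_response": (
        "45 minutes is 0.75 hours. Average speed = distance / time = "
        "60 km / 0.75 h = 80 km/h."
    ),
}
```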

mdingemanse commented 11 months ago

Looks like there may be some version of it here: https://huggingface.co/yhyhy3/med-orca-instruct-33b-GPTQ

timjzee commented 11 months ago

I think most of the "Orca" models on Hugging Face are projects that used an approach similar to the one described in the Microsoft paper. AFAIK they are not actual Orca releases.

mdingemanse commented 7 months ago

Nah. There is a new preprint that says:

> We open-source Orca 2 to encourage further research on the development, evaluation, and alignment of smaller LMs.

But nothing is actually open-sourced: this is a Llama 2 finetune where only the instruction-tuned (or, in their terms, explanation-tuned) model weights are made available; none of the instruction/explanation datasets and none of the source code are released.
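
Concretely, the weights are the only artifact with a public entry point. A minimal sketch, assuming the checkpoint id `microsoft/Orca-2-13b` on the Hugging Face hub; there is no analogous handle for the explanation-tuning data or the training code:

```python
# Load the released Orca 2 weights via transformers. This is the only
# part of the pipeline that is publicly obtainable; the dataset and
# training code have no equivalent. The checkpoint id is an assumption
# based on the Hugging Face hub listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Orca-2-13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```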

Thanks, Meta, for thoroughly diluting the term "open source", and thanks, Microsoft, for contributing further to that dilution.