premAI-io / state-of-open-source-ai

Clarity in the current fast-paced mess of Open Source innovation
https://book.premai.io/state-of-open-source-ai

state-of-open-source-ai/models/ #94

Open utterances-bot opened 11 months ago

utterances-bot commented 11 months ago

Models — State of Open Source AI Book

https://book.premai.io/state-of-open-source-ai/models/

flaxsearch commented 11 months ago

Llama 2 is not open source: the Meta licence isn't OSI-approved. Sadly, Meta keep saying it is open source and people keep believing them.

biswaroop1547 commented 11 months ago

@flaxsearch True, thanks for pointing that out! Hence we also mentioned:

All model variants under LLaMA-2 are released under the LLaMA-2 License, which permits commercial usage unless the entity has more than 700 million monthly active users, in which case it must obtain a license from Meta.

casperdcl commented 11 months ago

See also Meaning of "Open". I agree it's deliberately confusing. Open-source weights don't have to mean open-source training data or permissive/OSI-approved licence terms.

flaxsearch commented 11 months ago

Perhaps you should retitle the section 'Open Source Models' as 'Open Models' and then link to the section on Meaning of "Open" just below the title? I agree it's confusing; I wrote https://opensourceconnections.com/blog/2023/07/19/is-llama-2-open-source-no-and-perhaps-we-need-a-new-definition-of-open/ in an attempt to help clarify the situation.

casperdcl commented 11 months ago

Good idea; added a link to Meaning of "Open" in #97

Also note that OSI's "open source definition" (OSD) is mentioned in the link above, but I completely disagree with it. OSD states that "open source" in their opinion should also imply "open licence", and it focuses almost exclusively on licences rather than source code. This is wrong. Source code and licences are two independent, well-defined concepts and do not at all need to imply each other. I believe OSD is the biggest contributor to confusion, and I would strongly argue that OSD should be renamed "open licence definition".

For me a more interesting point is "can you really call a model open source if only the weights but not the training data are available?"