nichtdax / awesome-totally-open-chatgpt

A list of totally open alternatives to ChatGPT
Creative Commons Zero v1.0 Universal

add clue-ai/ChatYuan #18

Closed alreadydone closed 1 year ago

alreadydone commented 1 year ago

https://github.com/clue-ai/ChatYuan

I opened an issue there but got no response. Maybe someone here will volunteer to add it.

nichtdax commented 1 year ago

Can you provide the link to some kind of Readme in English?

alreadydone commented 1 year ago

The official readme contains English:

ChatYuan large v2 is an open-source large language model for dialogue, supports both Chinese and English languages, and in ChatGPT style. Based on the original functions of Chatyuan-large-v1, we optimized the model as follows:

- Added the ability to speak in both Chinese and English.
- Added the ability to refuse to answer. Learn to refuse to answer some dangerous and harmful questions.
- Added code generation functionality. Basic code generation has been optimized to a certain extent.
- Enhanced basic capabilities. The original contextual Q&A and creative writing skills have significantly improved.
- Added a table generation function. Make the generated table content and format more appropriate.
- Enhanced basic mathematical computing capabilities.
- The maximum number of length tokens has been expanded to 4096.
- Enhanced ability to simulate scenarios

But there are some parts of the Chinese readme that aren't translated:

It uses the same technical approach as the v1 version and has been optimized in aspects such as fine-tuning data, RLHF, and chain-of-thoughts.

ChatYuan-large-v2 is a lightweight model in the ChatYuan series that achieves high quality. Inference can be performed on consumer-level graphics cards, PCs, or even mobile phones (only 400M for INT4).

online Demo(Huggingface) | online Demo(ModelScope, in Chinese) | try online in Colab

The base model is PromptCLUE (770M parameters, download), which was pre-trained on a 100B-token Chinese corpus and then underwent prompt-based training on hundreds of tasks. It was then trained on 100M functional dialogue / multi-turn dialogue data to obtain the ChatYuan model. ClueAI's online service is based on proprietary models with tens of billions of parameters, and they plan to open-source larger models, according to this article (in Chinese).
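
As a rough sanity check on the "400M for INT4" figure, here is a minimal sketch of the arithmetic. It assumes ChatYuan-large-v2 has the same 770M parameters as the PromptCLUE base model and counts only the raw weights, ignoring any quantization overhead. The helper name `weight_size_mb` is hypothetical and not part of any ChatYuan tooling.

```python
def weight_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


# Assumption: same parameter count as the PromptCLUE base model.
PARAMS = 770_000_000

# Print the estimated weight size at a few common precisions.
for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_size_mb(PARAMS, bits):.0f} MB")
```

Under these assumptions INT4 comes to about 385 MB, which is in line with the roughly 400M quoted in the readme.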

nichtdax commented 1 year ago

But there are some parts of the Chinese readme that aren't translated:

I would prefer a fully English readme with examples. But do you think we should just add your comment to Related Links (if this project is added to the list)?

alreadydone commented 1 year ago

Indeed, they seem to be focused mainly on Chinese users (their company is centered around the CLUE benchmark, as you can see from its name ClueAI, where C stands for Chinese), so all the examples provided are in Chinese. Maybe ChatYuan is not friendly enough to English users, and maybe it's not a good fit for this repo for that reason, but if you decide to add it, feel free to link to my comment above.

nichtdax commented 1 year ago

Resolved by 50ee7e28fc5fc309eabd3d9b156effb11459cb9d