johnsmith0031 / alpaca_lora_4bit


July #132

Closed ghost closed 1 year ago

ghost commented 1 year ago

Hello @johnsmith0031

You've become a legend making this repo. I was wondering what you're up to these days, now that PanQiWei's AutoGPTQ and Taprosoft's llm_finetuning have taken over the spotlight, and this repo has seen livelier days that will never be forgotten. But what's happening now, John? How are you? I know I and the other fans would love to know. Thank you!

johnsmith0031 commented 1 year ago

Thank you for your kind words! Currently I'm exploring potential applications of LLMs in various domains (LangChain or something else like that), and also trying to optimize the inference speed and VRAM usage of the 4-bit model (still in progress). I may have some updates on the repo if any progress is made. Hope the open source LLMs become more and more powerful.
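For anyone curious about that direction, here is a minimal sketch of wiring a local quantized model into LangChain. It is only an illustration: it uses the Hugging Face transformers bitsandbytes 4-bit loader rather than this repo's GPTQ path, and the model id is a placeholder.

```python
# Minimal sketch: expose a locally loaded 4-bit model to LangChain via a transformers pipeline.
# Assumptions: bitsandbytes 4-bit loading stands in for the repo's GPTQ loader; model_id is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
from langchain.llms import HuggingFacePipeline

model_id = "your-org/your-llama-checkpoint"  # hypothetical model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit weights to cut VRAM
    device_map="auto",
)

# Wrap the model in a text-generation pipeline, then hand it to LangChain.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=128)
llm = HuggingFacePipeline(pipeline=pipe)

print(llm("Explain LoRA fine-tuning in one sentence:"))
```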

ghost commented 1 year ago

Oh wow, all these text-chaining agents and fine-tuning the same LLM on its good outputs, creating entire sets of contextual token-processing tools, just fascinates me; it's truly awesome. Have you found anything in particular really interesting?

As for the inference and VRAM optimisations for GPTQ, have you talked to turboderp or looked at his work? He considered adding training to exllama a while back but decided against it soon after; anyway, he just seems to be right up that alley. Using this repo, I feel like the speeds are perfectly fine and that in reality it's the multi-GPU scaling that matters, since anything that goes above consumer GPU VRAM is a trade between running half as slow or paying the price hike for server-grade cards. I have to give it to them though, very strategic. I'm not very smart at this though, what do you think?

It's a dream to get FOSS SOTA, but I can't wait to see the direction the SOTA FOSS models go compared to the tech giants' SOTA in terms of AGI. Will FOSS catch up? Maybe slowly, as it piggybacks off larger models? Or maybe the data moats are too high for the eye to see. It's strange to feel that we are here, right now, while the world is deciding, between those that escaped trauma loving or greedy, whether humanity will become a relic before the post-species of divine AI, or coexist on the biggest bond we have: that we're both here in this universe, clueless, looking for answers. It's one hell of a roller coaster ride ahead, but I'm very grateful at how far the powers have let us get with open source. Do you think these charity models like Falcon and OpenLLaMA, and arguably LLaMA, will pick up more hype from the people to continue what they're doing, keeping in mind that as their models get better, they become more of an asset?

johnsmith0031 commented 1 year ago

Not yet. Recently I found that for some complex tasks, even GPT-3.5 fails to achieve the goal, let alone the open-source models. And thanks for your information on exllama! I'll look into it in the next several days. Also, the progress made in the open-source community is commendable, and I am optimistic about future developments in more powerful FOSS models, because the capabilities we're seeing emerge from models now are just the tip of the iceberg.

ghost commented 1 year ago

Right on John. Thank you :)