imoneoi opened this issue 1 year ago
Looking forward to it.
I am also getting a bit apprehensive about that; my fear is that it has been cancelled. Even this repo makes no mention of it, not even a "coming soon!", and Meta has been silent about it since the paper came out. They seem happy with how the 13B (the mainstream budget option and still a pretty good model) and the 70B (which is inaccessible to most people) are being received.
Which is a shame: 34B would be the best option for AI researchers with single 24GB cards, and for small businesses that just want a "smart enough" model but cannot afford several industrial-grade >40GB cards (see the sketch below for the memory math). If the LLaMA 1 fine-tunes are anything to go by, a fine-tuned 34B would be nearly as good as the 70B for most reasoning-related tasks at half the parameter count, especially considering how good the current 13B models are getting.
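To put rough numbers on the 24GB point: at fp16, 34B parameters are about 68 GB of weights alone, but at 4-bit they come to roughly 17 GB, which fits on a single 24GB card. A minimal sketch, assuming the Hugging Face transformers + bitsandbytes stack; the model id is hypothetical, since the 34B base was never published.

```python
# Back-of-the-envelope memory math, plus what a 4-bit load would look like.
# NOTE: the model id below is hypothetical -- the 34B base was never released.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

params = 34e9
print(f"fp16 weights:  ~{params * 2 / 1e9:.0f} GB")    # ~68 GB -> needs multiple cards
print(f"4-bit weights: ~{params * 0.5 / 1e9:.0f} GB")  # ~17 GB -> fits in 24 GB

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-34b-hf",  # hypothetical id
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
```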
@Subarasheese I agree, but I guess their red team is working hard. As you know, their violation percentage is high, so I guess they are working on it.
What I find weird is that the chart refers to 34B-chat, which is just a fine-tune of the base 34B model, and they did not even release the base model... If the base model itself is the problem, and not the dataset they used to fine-tune it into a chat model, I honestly doubt they will bother to retrain the whole thing. And if the fine-tuning dataset is the problem, it makes me wonder why it is causing them so much trouble and taking this long (considering the models were not released when they wrote the paper)...
Perhaps they are looking into fixing some of LLaMA 2's repetition issues (as seen here: https://www.reddit.com/r/LocalLLaMA/comments/155vy0k/llama_2_too_repetitive/).
Perhaps that's why it's taking a bit longer. But I'm willing to wait for quality models, and I'm thankful for the great work Meta has done.
Do they realize that their violation % being high is a good thing?
I don't think so; an SFW model has its own use cases. And they've tested LLaMA-2 34B Chat.
The foundational LLaMA-2 models aren't aligned, so you can fine-tune your own alignment for them (see the sketch below). I wonder what the issue is: their fine-tune being mediocre for this model, or the model being hard to fine-tune. I'm inclined to believe it's the latter, because otherwise Meta could just release the LLaMA-2 34B foundational model without the Chat fine-tune.
And the "violation rate" of the chat model is on par with ChatGPT, which is a successful commercial service... So yeah, I don't think that's the issue itself, just a symptom. And they are working on some fix.
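For what it's worth, "fine-tune your own alignment" on a base model is pretty routine. A minimal LoRA sketch, assuming the Hugging Face transformers + peft stack; the 13B base id stands in for the unreleased 34B, and the dataset and training loop are elided.

```python
# Minimal LoRA fine-tuning setup on a base (non-chat) LLaMA-2 checkpoint.
# Sketch only: dataset preparation and the training loop are elided.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-13b-hf"  # base model; a 34B id would drop in here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Low-rank adapters on the attention projections -- the usual targets
# for LLaMA-family models.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a tiny fraction of the weights train
```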
Hopefully. It would be a shame to skip 34B; it's a VERY useful size.
Considering Code Llama 34B was published, could we get an update on the foundational model?
> @Subarasheese I agree, but I guess their red team is working hard. As you know, their violation percentage is high, so I guess they are working on it.
What does this have to do with the base model?
Any update?
@macarran We're wondering whether you have any plans to release 34B anytime soon. It would be very helpful if you could share a rough timeline as well.
Any update? ^_^
I'll sign up for Threads if you release this.
Any update? :(
If there were an update, I'm sure we would have heard.
Frankly, at this point I wouldn't be surprised if any efforts to release LLaMA 2 34B have been redirected to working on LLaMA 3.
The government could ban open-source models by then; we can't wait until after 2024 begins!
@teknium1 what do you mean? I'm shocked! Is it true?
Yes, a status update or estimate would be great.