-
What is the `fsdp_transformer_layer_cls_to_wrap` for bloom?
When I tried to fine-tune with bloomz-7b1, the training got stuck at 0%. As you said in the readme, it's most likely because I don't set the r…
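In case it helps while this is open: for BLOOM-family models the decoder layer class in Hugging Face `transformers` is `BloomBlock`, so a plausible FSDP setting looks like the sketch below (assuming the usual `Trainer` flow; all other arguments are placeholders).

```python
# A minimal sketch, assuming fine-tuning via the Hugging Face Trainer.
# BLOOM's decoder layer class in `transformers` is `BloomBlock`, so FSDP
# should auto-wrap at that granularity. Paths and batch size are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bloomz-7b1-ft",                       # placeholder path
    per_device_train_batch_size=1,
    fsdp="full_shard auto_wrap",                      # enable FSDP auto-wrapping
    fsdp_transformer_layer_cls_to_wrap="BloomBlock",  # BLOOM's layer class
)
```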
-
Hello, your work is great :+1:
I wrapped your binary under my bot/API project https://github.com/laurentperez/ava#what-models-or-apis-does-it-support-
I'm mostly interested in code (Python) gen…
-
https://arxiv.org/pdf/2212.09535.pdf
I was reading this paper and am really interested in trying this myself, but I can't find the model weights (bloom-3b) anywhere. Can you link them? Would be great…
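For what it's worth, the base checkpoint is published on the Hugging Face Hub as `bigscience/bloom-3b`; whether the paper's fine-tuned weights are released separately, I can't say. A minimal loading sketch:

```python
# A minimal sketch: loading the base bloom-3b checkpoint from the Hub.
# This is the base model only; the paper-specific weights may be distinct.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-3b")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-3b")
```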
-
![98DDB13F-60AE-4F7D-8979-9B287A2A4CC1](https://user-images.githubusercontent.com/39515647/233412075-f68a9c2b-24c8-426c-80d3-6f2c0e48b1ca.png)
-
Thanks for your project. I have a few wishes. The most important is that the models cannot translate more than one sentence (in most cases, nothing after the first period gets translated), and the answers are c…
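A possible client-side workaround until this is fixed: split the input into sentences and translate them one at a time. The sketch below uses a hypothetical `translate_sentence` as a stand-in for whatever single-sentence call the project exposes.

```python
# A workaround sketch, not the project's API: split text into sentences
# and translate each one separately, then rejoin the results.
import re

def translate_sentence(sentence: str) -> str:
    # Hypothetical stand-in for the project's single-sentence translate call.
    raise NotImplementedError("wire this to the actual translation API")

def translate_text(text: str) -> str:
    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(translate_sentence(s) for s in sentences if s)
```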
-
**LocalAI version:**
commit [3829aba](https://github.com/go-skynet/LocalAI/commit/3829aba869f8925dde7a1c9f280a4718dda3a18c)
**Environment, CPU architecture, OS, and Version:**
Darwin macmini …
-
I collected some Chinese data about "中国云南" (Yunnan, China), like this:
![0417-2](https://user-images.githubusercontent.com/52442277/232364095-2bf77e7b-f850-46ba-ae5f-5d9777404b1c.png)
And trained following the readme base …
-
Is it possible to use Petals for inference/prompt tuning without sharing my GPU?
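As far as I understand, yes: running the client only borrows the swarm's GPUs, and contributing your own requires launching a separate server process. A client-only sketch (assuming the Petals API around the BLOOM release; the model id is the public swarm's):

```python
# A minimal client-only sketch, assuming the petals API circa BLOOM.
# Running a client never serves your GPU; sharing one is a separate
# process (roughly: `python -m petals.cli.run_server ...`).
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL = "bigscience/bloom-petals"  # public swarm model id (assumption)

tokenizer = BloomTokenizerFast.from_pretrained(MODEL)
model = DistributedBloomForCausalLM.from_pretrained(MODEL)

inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))
```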
-
Hi, when I ran PPO with bloomz-7b1-mt and bloom-560m (prompt_len = answer_len = 256) with ZeRO stage 3 (8×A100-40G), generation seems too slow (averaging about 72 s). When I set ZeRO s…
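For context on why this tends to happen: under ZeRO stage 3 the weights are sharded across ranks, so every decoding step of `generate()` triggers parameter all-gathers, which dominates the PPO rollout phase; stage 2 keeps full weights on each GPU and is typically much faster for generation. A hedged sketch of the config difference (plain DeepSpeed config dicts, placeholder values):

```python
# Under ZeRO stage 3, params/grads/optimizer state are sharded, so each
# token generated must all-gather the weights; prefetch can hide some cost.
ds_config_stage3 = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        "stage3_prefetch_bucket_size": 5e8,  # larger prefetch, fewer stalls
    },
}

# Stage 2 keeps full parameters on every GPU, trading memory for much
# faster autoregressive generation during rollouts.
ds_config_stage2 = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {"stage": 2},
}
```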