-
Perform abstractive summarisation using baseline models (encoder-decoder and decoder-only) from Hugging Face, then note the learnings and behaviour.
Different techniques (a zero-shot sketch follows the list):
1. Zero-shot learning…
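As a starting point, a minimal zero-shot sketch with Hugging Face pipelines; the model names and sample input are illustrative, not from the notes:

```python
# Zero-shot summarisation with two baseline styles; model names and the
# input text are illustrative placeholders.
from transformers import pipeline

ARTICLE = ("The committee met on Tuesday to review the quarterly results, "
           "which showed steady growth across all regions despite supply "
           "chain delays and rising costs.")

# Encoder-decoder baseline: BART fine-tuned for summarisation.
enc_dec = pipeline("summarization", model="facebook/bart-large-cnn")
print(enc_dec(ARTICLE, max_length=40, min_length=5)[0]["summary_text"])

# Decoder-only baseline: prompt a causal LM and read off its continuation.
dec_only = pipeline("text-generation", model="gpt2")
out = dec_only(ARTICLE + "\nTL;DR:", max_new_tokens=40,
               return_full_text=False)
print(out[0]["generated_text"])
```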
-
@OH-ThatGuy @gylertaydos Part of your milestone for Monday:
Let's look up some resources:
* passages of dragon (with or without English translation)
* resources on language models that train for la…
-
Hello authors,
Your experiment results on harmfulness classification (`https://github.com/andyzoujm/representation-engineering/blob/main/examples/harmless_harmful/harmless_llama2.ipynb`) show that Lla…
-
I ran this script
`deepspeed --num_gpus 1 bloom-inference-scripts/bloom-ds-inference.py --name bigscience/bloomz-7b1 --batch_size 8`
and it gets stuck, just like in the picture.
Log:
```
(ba…
```
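For context, a simplified sketch of the kernel-injection init that `bloom-ds-inference.py` performs (not the script's exact loading path; if the process stalls at startup, it is worth bisecting between model loading and this call):

```python
# Simplified sketch of the core of bloom-ds-inference.py; the real script's
# checkpoint loading differs, so treat this only as an orientation aid.
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1", torch_dtype=torch.float16
)
# Kernel injection replaces the transformer blocks with fused CUDA kernels.
model = deepspeed.init_inference(
    model, mp_size=1, dtype=torch.float16, replace_with_kernel_inject=True
)
```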
-
What is the `fsdp_transformer_layer_cls_to_wrap` for BLOOM?
When I tried to fine-tune bloomz-7b1, training got stuck at 0%. As you said in the readme, it's most likely because I don't set the r…
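For reference, the decoder layer class in the Hugging Face implementation of BLOOM is `BloomBlock` (see `modeling_bloom.py`), so a plausible setting looks like this sketch (the other arguments are placeholders):

```python
# Hedged sketch: point FSDP's auto-wrap at BLOOM's decoder layer class.
# Only fsdp_transformer_layer_cls_to_wrap is the point here; the rest
# are placeholder values.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",
    fsdp_transformer_layer_cls_to_wrap="BloomBlock",
)
```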
-
### 🐛 Describe the bug
GPU memory is unstable during training and keeps growing slowly.
Training bloomz-1b1 on a Tesla V100 eventually exceeds the 32 GB of GPU memory: it runs at first, but fails after running for a while.
### Environment
_No response_
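One way to confirm the growth is to log allocator statistics every step; a minimal sketch (the surrounding training loop is hypothetical):

```python
# Hedged diagnostic sketch: print CUDA memory per step to confirm the
# slow growth; call it after optimizer.step() in the (hypothetical) loop.
import torch

def log_cuda_memory(step: int) -> None:
    alloc = torch.cuda.memory_allocated() / 2**30     # live tensors, GiB
    reserved = torch.cuda.memory_reserved() / 2**30   # allocator cache, GiB
    print(f"step {step}: allocated={alloc:.2f} GiB, reserved={reserved:.2f} GiB")
```

If `allocated` climbs steadily, something (for example a loss tensor kept with its autograd graph) is likely being retained across steps.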
-
Just a curious question, I suppose!
GPTQ 4-bit - https://github.com/qwopqwop200/GPTQ-for-LLaMa
Suppose someone eventually fine-tunes the 175B OPT model, with LoRAs or regular fine-tuning. Or perhaps the BLOO…
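For the LoRA case, the adapters would first have to be merged back into a dense checkpoint, which GPTQ could then quantise as an ordinary post-training step; a hedged sketch (the model id and paths are hypothetical placeholders):

```python
# Hedged sketch: fold LoRA adapters into the base weights so a standard
# post-training quantizer such as GPTQ sees one dense checkpoint.
# Model id and adapter path are hypothetical placeholders.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("facebook/opt-175b")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = model.merge_and_unload()           # apply the low-rank deltas
merged.save_pretrained("opt-175b-merged")   # run GPTQ on this checkpoint
```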
-
I tried to start a large version of the model using Docker:
`docker run -p 10249:80 -e RUST_BACKTRACE=full -e FLASH_ATTENTION=1 -e CUDA_VISIBLE_DEVICES=4,7 --privileged --security-opt="seccomp=unconf…
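If the container does come up, a quick sanity check is to hit the server's `/generate` endpoint through the mapped port; a minimal sketch (the prompt and parameters are placeholders):

```python
# Hedged sketch: query the text-generation-inference server once it is up.
# Port 10249 matches the -p 10249:80 mapping in the command above.
import requests

resp = requests.post(
    "http://localhost:10249/generate",
    json={"inputs": "What is deep learning?",
          "parameters": {"max_new_tokens": 50}},
)
print(resp.json())
```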
-
Can MeZO be used on NLG tasks? I integrated the `_inner_training_loop` part of the code and the methods it relies on into the NLG training code, and performed fine-tuning on BLOOM (bloomz-…
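In principle the estimator is task-agnostic: MeZO only needs a scalar loss from two forward passes, so nothing in it is classification-specific. A minimal sketch of the core step, following the paper's seed-resampling trick (the `model` and `loss_fn` names are hypothetical placeholders):

```python
# Hedged sketch of MeZO's two-forward-pass update; model/loss_fn are
# hypothetical. Regenerating z from a stored seed avoids keeping it in memory.
import torch

def mezo_step(model, loss_fn, lr=1e-6, eps=1e-3):
    seed = torch.randint(0, 2**31, (1,)).item()

    def perturb(scale):
        # Replay the same z per parameter from the shared seed.
        gen = torch.Generator().manual_seed(seed)
        for p in model.parameters():
            z = torch.randn(p.shape, generator=gen).to(p.device)
            p.data.add_(scale * eps * z)

    with torch.no_grad():
        perturb(+1)
        loss_plus = loss_fn(model)    # f(theta + eps * z)
        perturb(-2)
        loss_minus = loss_fn(model)   # f(theta - eps * z)
        perturb(+1)                   # restore the original theta
        grad = (loss_plus - loss_minus) / (2 * eps)  # projected gradient
        gen = torch.Generator().manual_seed(seed)
        for p in model.parameters():
            z = torch.randn(p.shape, generator=gen).to(p.device)
            p.data.add_(-lr * grad * z)  # SGD step along z
```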
-
Hello, your work is great :+1:
I wrapped your binary in my bot/API project https://github.com/laurentperez/ava#what-models-or-apis-does-it-support-
I'm mostly interested in code (Python) gen…