NouamaneTazi / bloomz.cpp

C++ implementation for BLOOM
MIT License

several questions #26

Open wesleysanjose opened 1 year ago

wesleysanjose commented 1 year ago

I have only 16 GB of RAM, so I tried to use the local-memory parameter. The model loaded and I saw conversion start, but at the end it still says "Killed". I do see a 20 GB model file generated — is that considered a success?

I was also trying to convert a fine-tuned BLOOM model (https://huggingface.co/BelleGroup/BELLE-7B-2M/tree/main). It was fine-tuned from the 7B model, but the weights appear to be fp32 instead of fp16, so the checkpoint is double the size. Do I need to supply any additional parameter when converting it to ggml? I ask because after conversion, the output is nonsense and weird characters.
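For context, a quick back-of-the-envelope size check explains both the "double sized" checkpoint and the OOM on a 16 GB machine. This is a sketch, not output from the converter; the ~7.07B parameter count is an assumption based on BLOOM-7B1.

```python
# Rough on-disk size estimate for a ~7B-parameter checkpoint.
# N_PARAMS is an assumed approximation for BLOOM-7B1, not an exact figure.
N_PARAMS = 7.07e9

def checkpoint_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate raw-weight size in GiB (ignores headers/metadata)."""
    return n_params * bytes_per_param / 2**30

fp32_gib = checkpoint_gib(N_PARAMS, 4)  # float32: 4 bytes per parameter
fp16_gib = checkpoint_gib(N_PARAMS, 2)  # float16: 2 bytes per parameter

print(f"fp32: ~{fp32_gib:.1f} GiB, fp16: ~{fp16_gib:.1f} GiB")
# An fp32 checkpoint is exactly 2x the fp16 size, matching the
# "double sized" observation; either way, a ~7B fp32 model does not
# fit in 16 GB of RAM, which is consistent with the OOM "Killed".
```

The takeaway: loading all fp32 weights of a 7B model needs well over 16 GiB, so the kernel's OOM killer terminating the converter ("Killed") is expected unless the script streams tensors instead of loading them all at once.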

Or should I convert their GPTQ 8-bit quantized model instead?

wesleysanjose commented 1 year ago

```
transformer.h.21.input_layernorm.bias -> layers.21.attention_norm.bias
layers.21.attention_norm.bias 1 (4096,)
transformer.h.21.self_attention.query_key_value.weight -> layers.21.attention.query_key_value.weight
Killed
```