Closed by bil-ash 1 year ago
Hi 👋 The link you sent seems to be invalid (I assume you meant https://huggingface.co/bigscience/bloomz-7b1 ?). Also, could you send the syntax error you are getting?
At the moment, Bloomz is not supported, both because it would be impossible to run in the browser and because I don't think quantization will produce a model below 2GB (see here for more information). Perhaps this will be fixed once they release the 64-bit version, but it currently isn't possible in the 32-bit version.
That said, you might be able to get an unquantized model working. The corresponding command should look something like:
python -m scripts.convert --model_id bigscience/bloomz-7b1 --from_hub --task causal-lm-with-past
Thanks, that worked. However, it seems that models in ONNX format are much larger than those converted using https://github.com/OpenNMT/CTranslate2, so for the time being I will use CTranslate2 for inference instead of transformers.js.
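For anyone comparing the two routes, a CTranslate2 conversion with int8 quantization can be run roughly like this (a sketch; the output directory name is arbitrary and you may need to adjust it for your setup):

```shell
# Install CTranslate2 and the Transformers-based converter's dependencies
pip install ctranslate2 transformers

# Convert the Hugging Face checkpoint to the CTranslate2 format.
# int8 quantization typically shrinks the weights to roughly a quarter
# of their fp32 size, which is where the size difference comes from.
ct2-transformers-converter --model bigscience/bloomz-7b1 \
    --quantization int8 --output_dir bloomz-7b1-ct2
```

The resulting directory can then be loaded with ctranslate2's generator API for inference.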
While converting the bloomz model, I am getting an 'invalid syntax' error. Is conversion limited only to predefined model types? If not, please provide the syntax for converting the above model with quantization.
(I will run the inference in Node.js, not in the browser, so memory will not be an issue during inference.)