Closed: Mahran-xo closed this issue 8 months ago
Hi @Mahran-xo, thanks for the bug report. I've been trying to reproduce this on a similar machine, but no luck so far. Is there any more output that gets printed after the segfault? I'm looking for the hex values in registers and some backtrace information that we print out in the case of a segmentation fault. How much RAM do you have available on this machine?
Hello, sorry for the late reply. I tried another model (zoo:mpt-7b-mpt_chat_mpt_pretrain-base_quantized) and it downloaded, but this time there is a different error. It says the following:
2023-10-29 13:22:42 deepsparse.utils.onnx INFO Overwriting in-place the input shapes of the transformer model at /mnt/d/DMS_NLP/LangChain/LLAMA/local-model/deployment/model.onnx
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.6.0.20231020 COMMUNITY | (9eb1e5d9) (release) (optimized) (system=avx2, binary=avx2)
2023-10-29 13:22:42.443931000 [E:onnxruntime:, inference_session.cc:1693 operator()] Exception during initialization: /home/centos/build/nyann/external/onnx-runtime/onnxruntime/core/optimizer/initializer.cc:43 onnxruntime::Initializer::Initializer(const onnx::TensorProto&, const onnxruntime::Path&) [ONNXRuntimeError] : 1 : FAIL : GetFileLength for /mnt/d/DMS_NLP/LangChain/LLAMA/local-model/deployment/model.data failed:Invalid fd was supplied: -1
[nm_ort 7f90fb961440 >ERROR< init src/libdeepsparse/ort_engine/ort_engine.cpp:538] std exception Exception during initialization: /home/centos/build/nyann/external/onnx-runtime/onnxruntime/core/optimizer/initializer.cc:43 onnxruntime::Initializer::Initializer(const onnx::TensorProto&, const onnxruntime::Path&) [ONNXRuntimeError] : 1 : FAIL : GetFileLength for /mnt/d/DMS_NLP/LangChain/LLAMA/local-model/deployment/model.data failed:Invalid fd was supplied: -1
Traceback (most recent call last):
File "/mnt/d/DMS_NLP/LangChain/LLAMA/sparse.py", line 5, in <module>
pipeline = TextGeneration(model=model_path)
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/pipeline.py", line 814, in text_generation_pipeline
return Pipeline.create("text_generation", *args, **kwargs)
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/base_pipeline.py", line 210, in create
return pipeline_constructor(**kwargs)
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/transformers/pipelines/text_generation.py", line 273, in __init__
self.engine, self.multitoken_engine = self.initialize_engines()
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/transformers/pipelines/text_generation.py", line 353, in initialize_engines
multitoken_engine = NLDecoderEngine(
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/transformers/engines/nl_decoder_engine.py", line 82, in __init__
self.engine = create_engine(
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/pipeline.py", line 759, in create_engine
return Engine(onnx_file_path, **engine_args)
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/engine.py", line 327, in __init__
self._eng_net = LIB.deepsparse_engine(
RuntimeError: NM: error: Exception during initialization: /home/centos/build/nyann/external/onnx-runtime/onnxruntime/core/optimizer/initializer.cc:43 onnxruntime::Initializer::Initializer(const onnx::TensorProto&, const onnxruntime::Path&) [ONNXRuntimeError] : 1 : FAIL : GetFileLength for /mnt/d/DMS_NLP/LangChain/LLAMA/local-model/deployment/model.data failed:Invalid fd was supplied: -1
The code I used to load this model:
from deepsparse import TextGeneration
# construct a pipeline
model_path = "./local-model/deployment"
pipeline = TextGeneration(model=model_path)
# generate text
prompt = "Below is an instruction that describes a task? ### Response:"
output = pipeline(prompt=prompt)
print(output.generations[0].text)
@Mahran-xo, regarding the segfault you ran into, are you on WSL1? If so, I think that should be resolved in the latest nightly, 1.6.0.20231031.
The second error likely points to a missing model.data file -- it needs to be in the deployment directory alongside model.onnx.
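For large ONNX exports the weights are stored externally: model.onnx holds the graph, and its initializers reference a sibling model.data file, which is why a missing model.data surfaces as the GetFileLength / invalid-fd failure above. A minimal sketch (the helper name and default file list are assumptions, not deepsparse API) that verifies the deployment directory is complete before constructing the pipeline:

```python
from pathlib import Path

def check_deployment(deployment_dir: str) -> list:
    """Return the names of required files missing from a deployment directory.

    model.onnx stores the graph; model.data holds the externally saved
    weights that the ONNX initializers point at. Both must be present.
    """
    required = ["model.onnx", "model.data"]
    root = Path(deployment_dir)
    return [name for name in required if not (root / name).is_file()]

# Example with a hypothetical path:
# missing = check_deployment("./local-model/deployment")
# if missing:
#     raise FileNotFoundError(f"deployment dir is incomplete, missing: {missing}")
```

Running a check like this up front turns the opaque engine-initialization error into a plain "file not found" before the engine is ever constructed.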
Thanks for the reply! I followed your instructions and the error disappeared, but this time I am getting this error:
(linx) mahran@ali-tar:/mnt/d/DMS_NLP/LangChain/LLAMA$ /home/mahran/anaconda3/envs/linx/bin/python /mnt/d/DMS_NLP/LangChain/LLAMA/sparse.py
2023-10-31 23:59:23 deepsparse.transformers.pipelines.text_generation WARNING This ONNX graph does not support processing the promptwith processing length > 1
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.6.0.20231031 COMMUNITY | (74098695) (release) (optimized) (system=avx2, binary=avx2)
[7f16d8570640 >WARN< operator() ./src/include/wand/utility/warnings.hpp:14] Generating emulated code for quantized (INT8) operations since no VNNI instructions were detected. Set NM_FAST_VNNI_EMULATION=1 to increase performance at the expense of accuracy.
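The VNNI warning above is incidental: on AVX2-only hardware DeepSparse emulates INT8 operations, and the log itself names an opt-in switch for faster emulation at some accuracy cost. A sketch, assuming the variable is read during engine initialization, so it must be set before the pipeline is constructed:

```python
import os

# Must be set before the DeepSparse engine is created, since the engine
# reads it during initialization. Per the warning, this trades accuracy
# for faster emulated INT8 execution on non-VNNI CPUs.
os.environ["NM_FAST_VNNI_EMULATION"] = "1"

# ...then construct the pipeline as before, e.g.:
# from deepsparse import TextGeneration
# pipeline = TextGeneration(model="./local-model/deployment")
```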
Traceback (most recent call last):
File "/mnt/d/DMS_NLP/LangChain/LLAMA/sparse.py", line 9, in <module>
output = pipeline(prompt=prompt)
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/pipeline.py", line 238, in __call__
engine_inputs = self.process_inputs(pipeline_inputs)
File "/home/mahran/anaconda3/envs/linx/lib/python3.9/site-packages/deepsparse/transformers/pipelines/text_generation.py", line 472, in process_inputs
if not self.cache_support_enabled and generation_config.max_length > 1:
TypeError: '>' not supported between instances of 'NoneType' and 'int'
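The TypeError above is a plain None-vs-int comparison: on this code path generation_config.max_length is unset (None), and Python 3 does not allow ordered comparisons between None and an int. A minimal reproduction, plus the defensive pattern (the default value shown is hypothetical, not what deepsparse uses):

```python
max_length = None  # what generation_config.max_length held on this code path

# Python 3 refuses ordered comparisons between None and int:
try:
    max_length > 1
except TypeError as exc:
    print(exc)  # '>' not supported between instances of 'NoneType' and 'int'

# Defensive pattern: substitute a concrete default before comparing.
DEFAULT_MAX_LENGTH = 1024  # hypothetical default
effective = max_length if max_length is not None else DEFAULT_MAX_LENGTH
assert effective > 1
```

On the caller side, passing an explicit generation length to the pipeline call may avoid the None path entirely; treat the exact parameter name (max_length vs. max_new_tokens) as an assumption and verify it against the installed deepsparse version's documentation.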
Hello @Mahran-xo, could you try with our latest nightly to see if you can reproduce the new error you are having? Thank you for sharing! Jeannie / Neural Magic
Hello @Mahran-xo Happy New Year! As it's been some time without a response, we are going to go ahead and close out this issue. Please let us know if you have further details on this specific topic and re-open the thread; we're happy to help! Thank you!
Jeannie / Neural Magic
I am trying to run
but it gives an error saying:
2023-10-23 01:51:05 deepsparse.transformers.pipelines.text_generation INFO Compiling an auxiliary engine to process a prompt with a larger processing length. This improves performance, but may result in additional memory consumption.
2023-10-23 01:51:05 deepsparse.utils.onnx INFO Overwriting in-place the input shapes of the transformer model at /mnt/d/DMS_NLP/LangChain/LLAMA/mpt-7b-dolly_mpt_pretrain-pruned50_quantized/deployment/model.onnx
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.6.0.20231020 COMMUNITY | (9eb1e5d9) (release) (optimized) (system=avx2, binary=avx2)
Segmentation fault (core dumped)
Environment
pip install -U deepsparse-nightly[llm]