ch1ld3r1c0 closed this issue 7 months ago
The issue you are facing does not come from the Hugging Face transformers team. It happens because you are calling model = model.half() on CPU, and the half-precision matmul op is not implemented for CPU in PyTorch.
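One way to avoid this is to choose the precision from the device instead of calling .half() unconditionally. The snippet below is only a minimal sketch, not the code from this issue: pick_dtype is a hypothetical helper, and the small torch.nn.Linear layer is a stand-in for the real model.

```python
import torch


def pick_dtype(device: torch.device) -> torch.dtype:
    """Use float16 only on CUDA; fall back to float32 on CPU."""
    return torch.float16 if device.type == "cuda" else torch.float32


# Assumption: a tiny Linear layer stands in for the real language model.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dtype = pick_dtype(device)

model = torch.nn.Linear(4, 2).to(device=device, dtype=dtype)
x = torch.randn(3, 4, device=device, dtype=dtype)

with torch.no_grad():
    y = model(x)  # the matmul runs in the dtype selected for this device

print(device.type, y.dtype, tuple(y.shape))
```

On a machine without a GPU this keeps the weights in float32, so the forward pass does not depend on half-precision CPU kernels.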
Closing as @erfanzar's answer resolves this issue
System Info
transformers version: 4.38.2
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
from spacy_llm.util import assemble

nlp = assemble("spacy_few_shots.cfg")
text = """ example tex """
doc = nlp(text)
for e in doc.ents:
    print(e.label_, e.text, e.start, e.end)
spacy_few_shots.cfg

[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = label(example)

[components.llm.task.examples]
@misc = "spacy.FewShotReader.v1"
path = "spacy_few_shots.yml"

[components.llm.model]
@llm_models = "spacy.OpenLLaMA.v1"
name = "open_llama_3b"
Expected behavior
Model should generate output.