xorange closed this issue 1 month ago
Hi!
I think the reason is that Qwen(1.0) uses custom modeling code that is not fully compatible with transformers, including the pipeline.
Since Qwen(1.0) is no longer actively maintained, I would advise you to migrate to Qwen1.5, which should work out of the box with transformers.
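For example, loading a Qwen1.5 checkpoint with stock transformers looks like this (a minimal sketch; Qwen/Qwen1.5-1.8B is just one of the published sizes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Qwen1.5 is supported natively by transformers (>= 4.37),
# so no trust_remote_code / custom modeling code is needed.
model_name = "Qwen/Qwen1.5-1.8B"  # one of several published sizes
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```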
In addition, Qwen models are decoder-only, i.e. causal, language models. They are not trained specifically to extract features. If your task at hand is to obtain sentence, phrase, or word features, I would recommend using embedding models instead.
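A minimal sketch of the latter, assuming a generic embedding checkpoint such as BAAI/bge-small-en-v1.5 (an example choice, not an official recommendation; any encoder trained for retrieval or similarity would do):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Example embedding model; swap in any encoder trained
# to produce sentence embeddings.
model_name = "BAAI/bge-small-en-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

inputs = tokenizer(["an example sentence"], return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# BGE-style models use the [CLS] token's last hidden state as the
# sentence embedding, L2-normalized for cosine similarity.
embedding = torch.nn.functional.normalize(outputs.last_hidden_state[:, 0], dim=-1)
print(embedding.shape)  # (1, hidden_size)
```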
Thanks for the reply!
I'm mainly learning and poking around here, trying to analyze and optimize the 'transformer.wte' and 'lm_head' weights; that's why I'm playing around with embeddings.
Thanks again for your suggestions on Qwen1.5 and embedding models.
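For reference, here is roughly what I'm poking at; a minimal sketch, assuming the Qwen/Qwen-1_8B checkpoint (Qwen(1.0) needs trust_remote_code=True for its custom modeling code):

```python
from transformers import AutoModelForCausalLM

# Qwen(1.0) ships custom modeling code, hence trust_remote_code=True.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-1_8B", trust_remote_code=True
)

wte = model.transformer.wte.weight  # input embedding table
lm_head = model.lm_head.weight      # output projection onto the vocabulary

# Both are (vocab_size, hidden_size) matrices, i.e. (151936, hidden_size),
# mapping between the vocabulary and the hidden space.
print(wte.shape, lm_head.shape)
```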
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
Using this snippet:
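(The original snippet was not preserved here; based on the discussion above, it was presumably a feature-extraction pipeline call along the following lines, with Qwen/Qwen-1_8B as a stand-in for whichever Qwen(1.0) checkpoint was used:)

```python
from transformers import pipeline

# Reconstruction, not the original code: a feature-extraction pipeline
# over a Qwen(1.0) checkpoint, which needs trust_remote_code for its
# custom modeling code.
extractor = pipeline(
    "feature-extraction",
    model="Qwen/Qwen-1_8B",
    trust_remote_code=True,
)
features = extractor("some input text")
```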
Yields:
期望行为 | Expected Behavior
I expect the above result because the embedding of each token should have length hidden_size, while 151936 is the number of all possible tokens (the vocabulary size) for Qwen.
Am I wrong here? Please do correct me if so. Thanks!
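For reference, a sketch of where the two numbers come from, again assuming Qwen/Qwen-1_8B: per-token features of length hidden_size live in the hidden states, while the 151936-wide tensors are vocabulary logits.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-1_8B"  # stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

inputs = tokenizer("some input text", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Per-token features: one vector of length hidden_size per token.
print(out.hidden_states[-1].shape)  # (1, seq_len, hidden_size)
# Logits project those features onto the vocabulary.
print(out.logits.shape)             # (1, seq_len, 151936)
```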
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response