lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

How to use huggingface pipeline + vicuna-13b-delta-v0? #388

Closed · tonyaw closed this issue 1 year ago

tonyaw commented 1 year ago

I'm trying to run vicuna-13b-delta-v0 on a CPU server via the Hugging Face pipeline. Unfortunately, I get nonsense results. My code:

```python
import logging

import torch
from transformers import Conversation, pipeline

logger = logging.getLogger(__name__)

model_name = "./models/vicuna-13b-delta-v0"
converse = pipeline("conversational", model=model_name)

conversation_1 = Conversation("Going to the movies tonight - any suggestions?")
conversation_2 = Conversation("What's the last book you have read?")
result = converse([conversation_1, conversation_2])
logger.info("result=%s", result)
```

The nonsense output I got:

```
result=[Conversation id: 0fd4b846-ecb0-40df-804c-a036d5204303 user >> Going to the movies tonight - any suggestions? bot >> Auflage occasionally.¡ª [[ , Conversation id: 32f1e750-6021-4c7a-827f-3a8cf1bcff98 user >> What's the last book you have read? Raf > {\ gathered; ]
```

Could you please point out what I missed? Thanks very much in advance!

tonyaw commented 1 year ago

One possibility is that I'm using the wrong versions of the Python libraries. Could you please share the recommended versions? What I'm using:

- transformers 4.29.0.dev0
- torch 2.0.0 (Python 3.8)
- accelerate 0.18.0.dev0

sgsdxzy commented 1 year ago

The delta weights cannot be used directly; you need to merge them with the original LLaMA weights first: https://github.com/lm-sys/FastChat#vicuna-13b. Also, new v1.1 delta weights are out.
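For context, the linked README documents an `apply_delta` script (at the time, invoked roughly as `python3 -m fastchat.model.apply_delta --base-model-path ... --target-model-path ... --delta-path ...`; check the README for the exact flags in your version). Conceptually, the merge just adds each delta tensor onto the corresponding base tensor. A minimal sketch of that idea, assuming a local LLaMA-13B checkpoint with matching tensor shapes (all paths here are placeholders); prefer the official script, which also handles tokenizer differences:

```python
# Sketch of delta-weight merging: vicuna = llama + delta.
# Assumes base and delta checkpoints have identical architectures
# and tensor shapes; FastChat's apply_delta script is the real tool.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "/path/to/llama-13b", torch_dtype=torch.float16, low_cpu_mem_usage=True
)
delta = AutoModelForCausalLM.from_pretrained(
    "./models/vicuna-13b-delta-v0", torch_dtype=torch.float16, low_cpu_mem_usage=True
)

# Add each delta tensor onto the matching base tensor in place.
delta_state = delta.state_dict()
for name, param in base.state_dict().items():
    param.add_(delta_state[name])

# Save the merged model and a tokenizer alongside it.
base.save_pretrained("/path/to/vicuna-13b")
AutoTokenizer.from_pretrained("./models/vicuna-13b-delta-v0").save_pretrained(
    "/path/to/vicuna-13b"
)
```

After merging, point `model_name` in your pipeline snippet at the merged directory (e.g. `/path/to/vicuna-13b`) instead of the raw delta; loading the delta directly is why the output is gibberish.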