nlpxucan / WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
9.21k stars · 715 forks

Junk output #147

Open jaideep11061982 opened 1 year ago

jaideep11061982 commented 1 year ago

Hi, I get weird output when I invoke model.generate using the inference script, but the same prompt gives the expected output when the chat demo is used. It also takes too long to infer on a single GPU with the model loaded in 8-bit.

e);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r\u200e);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r\xa0\xa0)\r\xa0\xa0\xa0\xa0AGE donner);\r);\rAGE donner);\r)\r);\r);\r)\rAGE donner);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r cuenta Bild [_);\rAGE donner);\r);\r);\r);\r);\r.~);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r);\r cuenta Bild [_);\r);\r);\r)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('WizardLM/WizardLM-7B-V1.0')
model = AutoModelForCausalLM.from_pretrained(
    'WizardLM/WizardLM-7B-V1.0',
    load_in_8bit=True,
    torch_dtype=torch.float16,
    # device_map="auto",
)
_output = evaluate(data, tokenizer, model)
final_output = _output[0].split("### Response:")[1].strip()
final_output
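One frequent cause of output like this is a prompt that does not match the template the model was trained on. A minimal sketch of wrapping the raw instruction and parsing the result; only the "### Response:" marker is taken from the snippet above, the rest of the wrapping is an assumption:

```python
def build_prompt(instruction: str) -> str:
    # Wrap the raw instruction so it ends with the same marker the
    # response parser above splits on. (Exact wording is an assumption;
    # check src/inference_wizardlm.py for the template actually used.)
    return f"{instruction}\n\n### Response:"

def extract_response(generated: str) -> str:
    # model.generate echoes the prompt, so keep only the text
    # after the response marker.
    return generated.split("### Response:")[1].strip()
```

If the chat demo applies a template like this while the inference script sends the bare instruction, that alone can explain the difference in output quality.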

@nlpxucan @RobertMarton @ChiYeungLaw

zairm21 commented 1 year ago

Hi, I'm facing the same issue. I'm running the src/inference_wizardlm.py inference code with "WizardLM/WizardLM-7B-V1.0" as the base model and "data/WizardLM_testset.jsonl" as the input.

{"id": 1, "instruction": "If a car travels 120 miles in 2 hours, what is its average speed in miles per hour?", "wizardlm": "();anon =\"\u2191\u2190\u2190\u200e\u2190\u2190@@ javascriptjl bere################\u2190\u200e\u2190\u2190\u2190\u2190\u200e\u200e\ufeff\ufeff\u200e\u200e\u2190\u2190\u2190\u2190\u200e\u200e\u200e\u200e\u200e\u2190\u2190\u2190\u2190\u2190\u2190\u2190\u2190\u200e\u2190\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\u200e\ufffd\u200e\u2191\u2190\u2191\u2190\u2190\u2190\u2190\rRoteqref);\r);\r\u200e\u200e);\r\u200e\r\r\r\u200e\ufeff\r\r\r\r\r################\r################////////////////\r\r################////################\ufeff################################################\ufeff################\ufeff\ufeff\ufeff\ufeff\u2190\ufeff\ufeff////\u2190\u2190\ufffd\u2190\ufffd\ufffd\ufffd\u2190\ufffd\ufffd\ufffd\ufffd\u2190\ufeff\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd);\r);\r);\r);\r);\r\ufffd\ufffd\ufffd);\r\ufffd\ufffd\ufffd\ufffd);\r);\r);\r////\ufffd);\r\ufffd);\r\ufffd);\r////\u200e################////\u2190\u2191////////////////////);\r;\r////////////////////////////////////////////////////////////////////\u2190\u2190\u0000\u0000////////\u0001////////////////////////////////////////\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\u0001\ufffd\ufffd\u0001\u0001\u0001\u0001\u0001\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffdraph

@nlpxucan @RobertMarton @ChiYeungLaw

zairm21 commented 1 year ago

(quoting @jaideep11061982's comment above, with the same code and output as the original report)

Hi, I'm facing the same problem. Were you able to figure out the issue?

XpastaX commented 6 months ago

Same problem here.

I used this instruction (from the official dataset): What is the most comprehensive and efficient approach in Python to redact personal identifying information from a text string that contains a full name, street address, town, state, and zip code, specifically targeting the individual known as John Doe with a street address of 1234 Main Street in Anytown, XY 12222? Currently, I am using the following code:

import re
text = "The street address associated with the individual known as John Doe is 1234 Main Street, located in the town of Anytown, in the state of XY, with a zip code of 12222."
redacted_text = re.sub(r"John Doe|\d+ Main Street|Anytown, XY \d+", "[REDACTED]", text)

Is there a more complex Python code that can ensure complete confidentiality?
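For reference, the snippet embedded in that instruction runs as follows. Note that on this exact sentence the third alternative, "Anytown, XY \d+", never matches, because the town, state, and zip code are not contiguous in the text, so they are left unredacted:

```python
import re

text = ("The street address associated with the individual known as John Doe "
        "is 1234 Main Street, located in the town of Anytown, in the state of XY, "
        "with a zip code of 12222.")
# Redacts "John Doe" and "1234 Main Street"; "Anytown, XY \d+" finds no
# contiguous match in this sentence, so town, state, and zip remain.
redacted_text = re.sub(r"John Doe|\d+ Main Street|Anytown, XY \d+", "[REDACTED]", text)
print(redacted_text)
```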

got response: "simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5simp\u22c5FileName ();anon =\"FileName ();anon =\"simp\u22c5simp\u22c5FileName ();anon =\"simp\u22c5FileName ();anon =\"FileName ();anon =\"FileName ();anon =\"FileName ();anon =\"FileName ();anon =\"FileName ();anon =\"FileName ();anon =\"FileName ();anon =\"FileName ();anon =\");\r);\r);\rFileName ();anon =\"AGE donner);\r);\r);\r);\r);\rAGE donner);\rAGE donnerAGE donner);\r);\r);\r);\r);\rAGE donnerAGE donnerFileName ();anon =\"FileName ();anon =\"FileName ();anon =\"simp\u22c5AGE....(too long) "

This happens for most of my instructions. But after I finetuned it for 2 epochs on a subset of the official instruction set, it responds properly to most instructions. I used the official WizardLM 7B template for both inference and finetuning, and I did not use 8-bit for inference.

Maybe the current 7B model on Hugging Face is the wrong version?