prakharguptaz / Instructdial

Code for the paper "InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning"
Apache License 2.0

Newlines in prompts on HuggingFace inference #12

Closed trebedea closed 1 year ago

trebedea commented 1 year ago

Hi,

I am using the DIAL-BART0 model on HuggingFace inference API for intent detection.

I have tried the suggested prompt as follows, sent as a JSON payload: `{"inputs": "Instruction: What is the intent of the response\n\nInput: [CONTEXT] [RESPONSE] please move the car [ENDOFDIALOGUE] [OPTIONS] move car, change speed [QUESTION] The intent of the response is"}`

and the service returns: `[{"generated_text":"Please move car, change speed"}]`

When changing "\n" to "\r\n", everything is OK and I get the expected output. I have tried several test cases, and the model consistently performs better with "\r\n" than with "\n". Has the model been trained with Windows-style newlines?
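For reference, a minimal Python sketch of the kind of request I am making; the model id `prakharz/DIAL-BART0` and the placeholder API token are assumptions, so substitute your own values:

```python
import requests

# Hypothetical model id and token; replace with your own.
API_URL = "https://api-inference.huggingface.co/models/prakharz/DIAL-BART0"
HEADERS = {"Authorization": "Bearer hf_xxx"}

# Prompt with Windows-style line breaks ("\r\n"), which gave better results
# in my tests than plain "\n".
prompt = (
    "Instruction: What is the intent of the response\r\n\r\n"
    "Input: [CONTEXT] [RESPONSE] please move the car [ENDOFDIALOGUE] "
    "[OPTIONS] move car, change speed [QUESTION] The intent of the response is"
)

# Send the prompt to the HuggingFace Inference API and print the generation.
response = requests.post(API_URL, headers=HEADERS, json={"inputs": prompt})
print(response.json())  # e.g. [{"generated_text": "..."}]
```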

This behaviour is really strange; it took some time to figure out. The results are the same whether I call the HF endpoint from a Windows or a Linux machine.

Thanks, Traian

prakharguptaz commented 1 year ago

I'm sorry for not getting back to you sooner. Please note that for classification options, you need to use the `||||` token as the separator. In this example you would input `...[OPTIONS] move car||||change speed [QUESTION]...`. I tried this with both types of line endings and it works for both.
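A sketch of the corrected call, using the `||||` separator between options (as above, the model id and token are placeholders, not verified values):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/prakharz/DIAL-BART0"  # assumed model id
HEADERS = {"Authorization": "Bearer hf_xxx"}  # placeholder token

# Options joined with the "||||" separator token expected for classification tasks.
prompt = (
    "Instruction: What is the intent of the response\n\n"
    "Input: [CONTEXT] [RESPONSE] please move the car [ENDOFDIALOGUE] "
    "[OPTIONS] move car||||change speed [QUESTION] The intent of the response is"
)

response = requests.post(API_URL, headers=HEADERS, json={"inputs": prompt})
print(response.json())
```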

The line endings used after the "Instruction..." prefix are indeed Windows-style in most tasks. The model is sensitive to formatting (as the separator-token issue shows), and even stray spaces can lead to poor results.

In any case, thanks for the interesting observation and for bringing this to my notice! Please let me know if you have any questions.