How to use your trained model

fancy-chenyao commented 1 month ago

Hello, I am very interested in your work. Can you give me an example of using your trained model to perform hallucination labeling?

Liqu1d-G commented 3 days ago

Thank you for your interest, and apologies for the late reply.

We recommend using the more advanced annotator, ANAH-v2. We have provided a prompt for hallucination annotation, which you can find here. To perform hallucination annotation, you need to prepare three things: the question, the response to be annotated, and the corresponding reference document. Our annotation is done at sentence granularity, so you need to split the response into sentences and annotate them one by one.
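As a rough illustration of the preparation step described above, the sketch below splits a response into sentences and pairs each one with the question and reference document. The function names `split_sentences` and `build_items` and the dictionary keys are hypothetical, and the regex splitter is only a naive stand-in; the thread does not say which sentence splitter the authors use.

```python
import re


def split_sentences(response: str) -> list[str]:
    """Split after '.', '!' or '?' when followed by whitespace (naive heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [part for part in parts if part]


def build_items(question: str, response: str, document: str) -> list[dict]:
    """Pair every sentence of the response with the question and reference document."""
    # The key names are illustrative assumptions, not a format required by ANAH.
    return [
        {"question": question, "sentence": sentence, "document": document}
        for sentence in split_sentences(response)
    ]
```

Because the split only happens when punctuation is followed by whitespace, decimal numbers such as 2.374 stay inside their sentence. Abbreviations like "e.g." would still be split incorrectly, so a proper sentence tokenizer may be preferable for real data.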

Once you have this initial corpus ready, for each sentence, you need to construct a three-round dialogue (in most cases):

In the first round of dialogue, using the fact_check prompt, the model returns whether the sentence contains a judgeable fact. If it does not, the label is ‘no fact’; if it does, the process continues to the next round.
In the second round of dialogue, using the reference_check prompt, the model returns the key reference points.
In the third round of dialogue, using the reference_check prompt, the model returns the final hallucination label. Note that the references used here are the key reference points from the previous round.
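The three rounds above could be wired together roughly as follows. This is a minimal sketch under several assumptions: `chat` is a hypothetical callable that takes the message history and returns the model's reply, `prompts` is a hypothetical dictionary of format-string templates (with literal braces escaped) keyed by round, and `is_no_fact` is a caller-supplied check, since the thread does not specify the exact wording of the first-round reply.

```python
from typing import Callable

Message = dict[str, str]


def annotate_sentence(
    chat: Callable[[list[Message]], str],
    prompts: dict[str, str],
    is_no_fact: Callable[[str], bool],
    question: str,
    sentence: str,
    document: str,
) -> dict:
    """Run up to three dialogue rounds for one sentence and return the label."""
    fields = {"question": question, "sentence": sentence, "document": document}
    history: list[Message] = []

    def ask(prompt_key: str) -> str:
        # Append the user turn, query the model, then record its reply.
        history.append({"role": "user", "content": prompts[prompt_key].format(**fields)})
        reply = chat(history)
        history.append({"role": "assistant", "content": reply})
        return reply

    # Round 1: does the sentence contain a judgeable fact?
    if is_no_fact(ask("fact_check")):
        return {"label": "no fact", "history": history}

    # Round 2: the model returns the key reference points.
    fields["reference"] = ask("reference_check")

    # Round 3: final label, judged against the key reference points from round 2.
    label = ask("hallucination")
    return {"label": label, "history": history}
```

Keeping the history in one list means each round sees the earlier turns, which matches the description of a single multi-round dialogue; whether the released model expects exactly this history layout is an assumption.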

minstar commented 2 days ago

Thank you for sharing your wonderful work! I have also had a hard time understanding how to use it, so I prepared one of my own instances as an example to annotate.

question: Based on the context provided, what was the exchange rate of 1 Bahraini Dinar to Swiss Franc on June 10, 2021?
response: On June 10, 2021, the exchange rate of 1 Bahraini Dinar (BHD) to Swiss Franc (CHF) was approximately [2.374] CHF.
reference document: |Monday 14 June 2021||1 BHD = 2.3862 CHF|\\n|Sunday 13 June 2021||1 BHD = 2.3828 CHF|\\n|Saturday 12 June 2021||1 BHD = 2.3835 CHF|\\n|Friday 11 June 2021||1 BHD = 2.3815 CHF|\\n|Thursday 10 June 2021||1 BHD = 2.374 CHF|\\n|Wednesday 9 June 2021||1 BHD = 2.3759 CHF|\\n|Tuesday 8 June 2021||1 BHD = 2.3791 CHF|

How should this example be formatted with the suggested template protocol and the prompt below, and what would the inference code for it look like?

suggested template protocol
dict(role='user', begin='<|im_start|>user\n', end='<|im_end|>\n'),
dict(role='assistant', begin='<|im_start|>assistant\n', end='<|im_end|>\n'),
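
One way to apply the two template entries above is to wrap each message in its role's `begin` and `end` markers and concatenate the results. The sketch below assumes messages are dictionaries with `role` and `content` keys; the `render` helper and the optional trailing assistant `begin` marker (so the model continues with its own turn) are illustrative assumptions, not something confirmed in this thread.

```python
# Begin/end markers copied from the template protocol quoted above.
TEMPLATE = {
    "user": {"begin": "<|im_start|>user\n", "end": "<|im_end|>\n"},
    "assistant": {"begin": "<|im_start|>assistant\n", "end": "<|im_end|>\n"},
}


def render(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Concatenate messages into one prompt string using the role markers."""
    text = "".join(
        TEMPLATE[m["role"]]["begin"] + m["content"] + TEMPLATE[m["role"]]["end"]
        for m in messages
    )
    if add_generation_prompt:
        # Assumption: leave an open assistant turn for the model to complete.
        text += TEMPLATE["assistant"]["begin"]
    return text
```

For a single user message with content `hi`, this yields `<|im_start|>user\nhi<|im_end|>\n<|im_start|>assistant\n`.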

prompt
hall_type_prompt = f"You will act as a ’Hallucination’ annotator. I will provide you with a question, a partial answer to that question, and related reference points. You need to determine whether the provided answer contains any hallucinatory content and annotate the type of hallucination. \
’Hallucination’ refers to content that contradicts the reference points or is unsupported by them. \
## Judgment Criteria: \
1. No Hallucination: If the answer is completely consistent with the reference points and does not introduce any contradictory information, output: <No Hallucination>. \
2. Contradiction: If the answer clearly contradicts the reference points, output: <Contradictory>. \
3. Unverifiable: If the answer contains information not mentioned in the reference points and cannot be supported or verified by them, output: <Unverifiable>. \
## Task Process: \
1. Carefully read the question, which is as follows: {question} \
2. Carefully read the partial answer, which is as follows: {answer} \
3. Carefully read the reference points, which are as follows: {reference} \
4. Conduct the analysis: Based on the above judgment criteria, determine if the answer contains hallucinations and output the type of hallucination."
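
As a hedged sketch of how the example instance might be plugged into this prompt, the code below fills the three input slots and extracts the label tag from a model reply. Only the three "Task Process" input lines are reproduced in `TASK_INPUTS`; in practice the full prompt text would be used. The `reference` value is a hand-picked row from the reference document standing in for the key reference point that the previous round would return, and `parse_label` is a hypothetical helper.

```python
import re

# Shortened stand-in for the full prompt: only the three input lines are kept.
TASK_INPUTS = (
    "1. Carefully read the question, which is as follows: {question} "
    "2. Carefully read the partial answer, which is as follows: {answer} "
    "3. Carefully read the reference points, which are as follows: {reference}"
)

# The three output tags named in the Judgment Criteria of the prompt.
LABEL_PATTERN = re.compile(r"<(No Hallucination|Contradictory|Unverifiable)>")


def parse_label(model_output: str):
    """Return the first label tag found in the model output, or None."""
    match = LABEL_PATTERN.search(model_output)
    return match.group(1) if match else None


question = (
    "Based on the context provided, what was the exchange rate of "
    "1 Bahraini Dinar to Swiss Franc on June 10, 2021?"
)
answer = (
    "On June 10, 2021, the exchange rate of 1 Bahraini Dinar (BHD) to "
    "Swiss Franc (CHF) was approximately [2.374] CHF."
)
# Assumption: stand-in for the key reference point from the previous round.
reference = "|Thursday 10 June 2021||1 BHD = 2.374 CHF|"

filled_prompt = TASK_INPUTS.format(question=question, answer=answer, reference=reference)
```

The filled prompt would then be sent as the user turn of the final round, and `parse_label` applied to the reply. A `None` result signals that the reply contained none of the three tags and may need manual inspection.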

open-compass / ANAH
