potsawee / selfcheckgpt

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
MIT License

Open Question: Fact Checking LLM #14

Closed GvdDool closed 1 year ago

GvdDool commented 1 year ago

Hello! There is no Discussions tab on your repo, so I am opening an issue as a question; I hope that is fine.

I found this package while looking for a method to check whether the results I am getting from my LLM (Google PaLM) are reliable. The questions I am asking are very specific, and I suspect many answers will be laced with hallucinations, which is understandable considering the training set and the narrow knowledge domain my questions touch on.

The questions I am asking are about educational institutes teaching a specific topic, by country. In pseudo-code (sketched below):

1. Loop over all countries
2. For each country, list the educational institutes teaching the specific topic
3. For each educational institute, give the (degree) program in which the specific topic is taught
4. For each program, give the modules that are part of this program
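Roughly, as a hypothetical Python sketch (`query_llm` is a stand-in for the actual PaLM call, not part of any package):

```python
# Hypothetical sketch of the query plan above; query_llm() stands in
# for a real LLM API call that parses the answer into a list.
topic = "GIS"                    # example topic
countries = ["Kenya", "Uganda"]  # example country list

def query_llm(prompt: str) -> list[str]:
    # Replace with a real LLM call; returns an empty list as a placeholder.
    return []

for country in countries:
    institutes = query_llm(f"List educational institutes in {country} teaching {topic}.")
    for institute in institutes:
        programs = query_llm(f"Which degree programs at {institute} cover {topic}?")
        for program in programs:
            modules = query_llm(f"List the modules of the {program} at {institute}.")
```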

As you can see, this is a very specific line of questioning, and because we don't know whether the model has documents specific to these parameters, I am looking for a method to verify that the answer in the third step is correct: when it is false, the list of modules is also false. Do you think your package can help me with this, or do you know of another method that is more appropriate for checking the results?

Thanks in advance for looking into this!

potsawee commented 1 year ago

Hi @GvdDool

The selfcheck approach estimates the uncertainty in generating outputs about a specific concept. In our experiments on the individuals in the WikiBio dataset, the family of methods (e.g. selfcheck-qa, selfcheck-prompt, etc.) correlates well with fact-checking, i.e., when there is high self-consistency, the output is more likely to be factual. Having said this, I'd expect it to also work in your specific scenario. I would suggest using selfcheck with LLM-prompting, since you can control the prompt (in evaluation) to be specific to what you're looking for.
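For reference, a minimal usage sketch of one packaged variant (SelfCheckNLI), along the lines of the repo's README; please check class and argument names against the installed version:

```python
# Minimal sketch of a packaged selfcheck variant (SelfCheckNLI).
import torch
from selfcheckgpt.modeling_selfcheck import SelfCheckNLI

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
selfcheck_nli = SelfCheckNLI(device=device)

# `sentences`: sentences of the main LLM answer to be assessed.
# `sampled_passages`: extra stochastic samples of the same answer.
sentences = ["The University of Nairobi offers a Master's degree in GIS."]
samples = ["sampled answer 1 ...", "sampled answer 2 ...", "sampled answer 3 ..."]

sent_scores = selfcheck_nli.predict(sentences=sentences, sampled_passages=samples)
# Higher score => less supported by the samples => more likely hallucinated.
print(sent_scores)
```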

Alternatively, if you have a database of information about the educational institutes, you could use a retrieval-based approach, for example by comparing what the LLM generates against the reference retrieved from the database.
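As a rough illustration only (everything here, e.g. `lookup_reference`, is a placeholder and not part of this package):

```python
# Hypothetical retrieval-based check: compare an LLM claim against a
# reference record from your own database. Nothing here is selfcheckgpt API.
from difflib import SequenceMatcher

def lookup_reference(institute: str) -> list[str]:
    """Placeholder: return the degree programs stored for this institute."""
    db = {"University of Nairobi": ["MSc in GIS", "MA in Geography"]}
    return db.get(institute, [])

def supported(claimed_program: str, institute: str, threshold: float = 0.8) -> bool:
    """Treat the claim as supported if it fuzzily matches a stored program."""
    refs = lookup_reference(institute)
    return any(
        SequenceMatcher(None, claimed_program.lower(), r.lower()).ratio() >= threshold
        for r in refs
    )

print(supported("MSc in GIS", "University of Nairobi"))  # True
```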

Potsawee

GvdDool commented 1 year ago

Thanks @potsawee, I will have a closer look at your code, but any leads on how you would set this up would be useful. Currently, I am thinking of implementing something like this:

The core question yields these results:

Country: Kenya
Educational institute: University of Nairobi
Degree program: Master's degree in {topic}

The self-check prompt would be: Could you please verify that the University of Nairobi in Kenya is offering a Master's degree in {topic}?

potsawee commented 1 year ago

Hi @GvdDool

  1. The other variants of selfcheck (QA, BERTScore, n-gram, NLI) are implemented in the package in this repository.

  2. Selfcheck with prompt, however, has to be coded manually, as it depends on which LLM you use for prompting. This should be a simple process though, and I provide my code for doing selfcheck-prompt (using OpenAI's GPT) in this folder: https://github.com/potsawee/selfcheckgpt/tree/main/demo/experiments/selfcheck_prompt. The code writes textual outputs to files; you then need to process them into scores, e.g., Step 1: map Yes -> 0.0 and No -> 1.0 per sample, and Step 2: average the scores across all samples (see the sketch after this list).

  3. Regarding the actual prompt, I've experimented with a simple template: `Context: {sample}\n\nSentence: {sentence_to_be_assessed}\n\nIs the sentence supported by the context above? Answer Yes or No:`. I think what you provide above is missing {sample}. The idea of selfcheck is to generate (output) samples multiple times and use each sample as evidence to self-verify facts.
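Putting points 2 and 3 together, a minimal sketch of the scoring step (`ask_llm` is a placeholder for whichever LLM you prompt; it is not code from this repo):

```python
# Sketch of selfcheck-with-prompt scoring: ask an LLM whether the sentence
# is supported by each sample, then map Yes -> 0.0, No -> 1.0 and average.
PROMPT = (
    "Context: {sample}\n\nSentence: {sentence}\n\n"
    "Is the sentence supported by the context above? Answer Yes or No:"
)

def ask_llm(prompt: str) -> str:
    # Placeholder: replace with a real API call (e.g., OpenAI, PaLM).
    return "Yes"

def selfcheck_prompt_score(sentence: str, samples: list[str]) -> float:
    scores = []
    for sample in samples:
        answer = ask_llm(PROMPT.format(sample=sample, sentence=sentence))
        scores.append(0.0 if answer.strip().lower().startswith("yes") else 1.0)
    # Higher average => less consistent across samples => more suspect.
    return sum(scores) / len(scores)

score = selfcheck_prompt_score(
    "The University of Nairobi offers a Master's degree in GIS.",
    ["sampled passage 1 ...", "sampled passage 2 ...", "sampled passage 3 ..."],
)
print(score)
```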

Hope this helps!

GvdDool commented 1 year ago

Thanks @potsawee - I am still checking different models, and self-check is high on my list of things I would like to include in this process.