potsawee / selfcheckgpt

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
MIT License

Feedback: Adding Notebook for R and S generation by LLM #25

Closed: Kirushikesh closed this issue 5 months ago

Kirushikesh commented 9 months ago

Really great work! One suggestion: the current demo notebook assesses hallucination in a response given a set of samples supplied by the user. However, if I understand correctly, the actual setup of the paper is to assess whether the response from an LLM is hallucinated or not. I think it would be helpful to have a notebook/module that, given an LLM and a query, applies the method presented in the paper: first generate the response $R$ with temperature=0, then generate $N$ samples from the same LLM with temperature=1 and a sampling technique, and finally assess the factuality of $R$ with respect to $\{S_1, S_2, ..., S_N\}$.

My idea is to let the user provide a query and a HuggingFace LLM as input; the system then does all the work behind the scenes and returns the response $R$ together with a hallucination score for the chosen method, as in the sketch below.
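A minimal sketch of what such a notebook could do, assuming a HuggingFace causal LM and the repository's SelfCheckBERTScore scorer. The model name, max_new_tokens, number of samples, and naive sentence splitting are placeholder choices, and greedy decoding stands in for temperature=0:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from selfcheckgpt.modeling_selfcheck import SelfCheckBERTScore

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model; any HuggingFace causal LM could be used here.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

def generate(query: str, do_sample: bool) -> str:
    # Greedy decoding approximates temperature=0; sampling uses temperature=1.
    inputs = tokenizer(query, return_tensors="pt").to(device)
    gen_kwargs = dict(max_new_tokens=128, do_sample=do_sample)
    if do_sample:
        gen_kwargs["temperature"] = 1.0
    output_ids = model.generate(**inputs, **gen_kwargs)
    # Strip the prompt tokens and decode only the generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

query = "Tell me about the Eiffel Tower."
response = generate(query, do_sample=False)                    # R (deterministic)
samples = [generate(query, do_sample=True) for _ in range(5)]  # S_1..S_5

# Naive sentence splitting for illustration; a proper sentence tokenizer
# (e.g. spaCy) would be preferable in practice.
sentences = [s.strip() + "." for s in response.split(".") if s.strip()]

# Score each sentence of R against the sampled passages; higher scores
# indicate sentences less supported by the samples (more likely hallucinated).
selfcheck_bertscore = SelfCheckBERTScore(rescale_with_baseline=True)
scores = selfcheck_bertscore.predict(
    sentences=sentences,
    sampled_passages=samples,
)
print(scores)
```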

potsawee commented 6 months ago

Hi @Kirushikesh

Thank you for the suggestion, and I'm sorry for my late response.

Yes, the experiments reported in the paper show the results of assessing responses from an LLM. However, this repository is implemented for general fact-checking usage (i.e., checking a response $R$ given some evidence $S_1, S_2, ..., S_N$), so I didn't include a notebook specifically for this setup.
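For reference, that general usage looks roughly like the sketch below; the sentences and sampled passages are made-up placeholders, and SelfCheckBERTScore is just one of the available scorers:

```python
# A minimal sketch of the general fact-checking usage: score a response R
# against evidence passages S_1..S_N supplied by the user.
from selfcheckgpt.modeling_selfcheck import SelfCheckBERTScore

selfcheck_bertscore = SelfCheckBERTScore(rescale_with_baseline=True)

# Sentences of the response R to be checked (placeholder text).
sentences = [
    "The Eiffel Tower is located in Paris.",
    "It was completed in 1905.",
]

# Evidence samples S_1..S_3 (placeholder text), e.g. stochastic re-generations.
sampled_passages = [
    "The Eiffel Tower stands in Paris and opened in 1889.",
    "Paris is home to the Eiffel Tower, which was finished in 1889.",
    "The Eiffel Tower, located in Paris, was completed in 1889.",
]

# One score per sentence; higher means less supported by the evidence.
scores = selfcheck_bertscore.predict(
    sentences=sentences,
    sampled_passages=sampled_passages,
)
print(scores)  # the second sentence (wrong date) should score higher
```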

Your suggestion sounds good to me, and if you are interested in contributing to this repository, a PR is very welcome!

Best wishes,
Potsawee

Kirushikesh commented 5 months ago

@potsawee I have raised the PR as per your suggestion. Please review it.

potsawee commented 5 months ago

@Kirushikesh Thank you for contributing the notebook. The PR has been merged. Thanks!