h2oai / h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/
http://h2o.ai
Apache License 2.0

Add source infos: text chunk and its page number, possibility to hide document link #598

Open PsychicBirdy opened 11 months ago

PsychicBirdy commented 11 months ago

Hi, congrats on such a well-conceived app. I have a feature request. Currently, when using sources, h2oGPT adds to its reply a score and a link for each document used. What I would like to see added here:

pseudotensor commented 11 months ago

Sounds like a great idea. Right now I only provide a link to the source document, but I agree that a link to the chunks and metadata would be useful.

PsychicBirdy commented 11 months ago

Thanks a lot for looking at it! Being able to access those snippets would also make replies much more transparent and comprehensible, the one thing basic chatbot replies are currently missing.

pseudotensor commented 10 months ago

It's maximally hidden in an accordion, and has the actual snippets now. I'll add other metadata soon.

image

image

image
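For readers who cannot see the screenshots, here is a minimal sketch of the accordion idea: the snippets sit in a block that stays collapsed until clicked. It is not h2oGPT's actual code. The function name `sources_accordion`, the tuple layout, and the use of an HTML details element are all assumptions made for illustration.

```python
from html import escape


def sources_accordion(snippets, title="Sources"):
    """Render (score, source, text) tuples as a collapsed HTML <details> block.

    Hypothetical helper: the name and the input layout are illustrative only.
    """
    items = []
    for score, source, text in snippets:
        # Escape user-controlled strings so snippet text cannot inject markup.
        items.append(
            f"<li>{score:.2f} | {escape(source)}<br><pre>{escape(text)}</pre></li>"
        )
    # <details> is collapsed by default, so the reply stays uncluttered.
    return (
        f"<details><summary>{escape(title)}</summary>"
        f"<ul>{''.join(items)}</ul></details>"
    )


html_block = sources_accordion([(0.83, "a.pdf", "x < y")])
```

The resulting string could be appended to a chat reply in any UI that renders HTML.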

PsychicBirdy commented 10 months ago

I tested it and it looks good. Thanks a lot for implementing this! The accordion is a good idea so it doesn't get too cluttered. Now I'm only missing the page number and a switch to deactivate the linking to the document.

pseudotensor commented 10 months ago

FYI @PsychicBirdy you can see the full metadata from the database here, although it is not in the references yet:

image
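As a rough sketch of how the requested page number could be surfaced once the metadata reaches the references: assuming each chunk carries a metadata dict with a `source` key and an optional `page` key (key names and 0-based page numbering are assumptions here, and document loaders differ), a reference label might be built like this. `describe_chunk` is a hypothetical name, not something from h2oGPT.

```python
def describe_chunk(metadata, page_is_zero_based=True):
    """Build a short reference label such as 'report.pdf, page 3'.

    Assumes hypothetical 'source' and 'page' metadata keys.
    """
    source = metadata.get("source", "unknown source")
    page = metadata.get("page")
    if page is None:
        # Not every document type has pages (e.g. plain text files).
        return source
    # Show 1-based page numbers to the reader.
    shown = page + 1 if page_is_zero_based else page
    return f"{source}, page {shown}"


label = describe_chunk({"source": "a.pdf", "page": 0})
```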

pseudotensor commented 10 months ago

@PsychicBirdy

I added a switch to show the URL or not. Before:

image

With --show_link_in_sources=False:

image

PsychicBirdy commented 10 months ago

Looks good. However, I had a different behavior in mind for the switch: when set to False, I would keep the path and filename (without them the other source information is kind of vague) and only remove the link to the files. For example, "user_path_many/HAI_AI-Index_Report2023(2).pdf" would still be shown, but you could no longer click on it, which removes the possibility of downloading files.
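To make the request concrete, here is a minimal sketch of the behavior described above: the path is always shown, and the flag only decides whether it is wrapped in a hyperlink. This is not h2oGPT's implementation. `format_source` and the `file/` URL prefix are hypothetical, and `show_link` merely stands in for the --show_link_in_sources option.

```python
from html import escape
from urllib.parse import quote


def format_source(path, show_link=True):
    """Return the source path as a clickable link or as plain text.

    Hypothetical helper; the 'file/' URL prefix is an assumption.
    """
    label = escape(path)
    if not show_link:
        # Keep the path visible, but give the reader nothing to click.
        return label
    # Percent-encode the path so characters like '(' survive in the URL.
    return f'<a href="file/{quote(path)}" target="_blank">{label}</a>'


path = "user_path_many/HAI_AI-Index_Report2023(2).pdf"
plain = format_source(path, show_link=False)
linked = format_source(path, show_link=True)
```

With `show_link=False` the reader still sees which file a snippet came from, but the reply no longer offers a download link.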