Research on local documents doesn't work

assafelovic / gpt-researcher

LLM based autonomous agent that does online comprehensive research on any given topic

https://gptr.dev

Apache License 2.0

Research on local documents doesn't work #524

Closed yuxmi closed 2 months ago

yuxmi commented 4 months ago

I copied the code from the "Research on Local Documents" section, and all my local files in the DOC_PATH directory are plain-text files. I keep getting the same error:

    Traceback (most recent call last):
      File "/content/script.py", line 15, in <module>
        report = asyncio.run(get_report(query=query, report_type=report_type, report_source=report_source))
      File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
        return future.result()
      File "/content/script.py", line 5, in get_report
        researcher = GPTResearcher(query=query, report_type=report_type, report_source=report_source)
    TypeError: GPTResearcher.__init__() got an unexpected keyword argument 'report_source'

assafelovic commented 4 months ago

Hey, did you update to the latest version? pip install -U gpt-researcher
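One way to tell whether the installed package predates the report_source parameter is to inspect the constructor's signature before calling it. The sketch below is only an illustration: accepts_keyword, installed_version, and LegacyResearcher are hypothetical names, and the stand-in class merely mimics a constructor that lacks report_source.

```python
import inspect
from importlib import metadata


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


class LegacyResearcher:
    """Hypothetical stand-in for an older constructor without report_source."""

    def __init__(self, query, report_type="research_report"):
        self.query = query
        self.report_type = report_type


if __name__ == "__main__":
    print("gpt-researcher version:", installed_version("gpt-researcher"))
    print("stand-in accepts report_source:",
          accepts_keyword(LegacyResearcher, "report_source"))
    try:
        from gpt_researcher import GPTResearcher
    except ImportError:
        print("gpt_researcher is not installed in this environment")
    else:
        print("installed class accepts report_source:",
              accepts_keyword(GPTResearcher, "report_source"))
```

If the last line reports False, the TypeError above is expected with that install, and upgrading is the first thing to try.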

arsaboo commented 4 months ago

It is not working for me either. I am using docker and have added the full path to the .env.

@ElishaKay Do we need to mount the path in the container?

ElishaKay commented 4 months ago

@yuxmi, you might have an easier path via Docker Desktop:

Step 1: Install Docker Desktop

Step 2: Walk through this Tutorial

@arsaboo another requirement I should have mentioned: the path to your files should be within the code of this project. That will enable you to skip the complexity of mounting stuff into Docker

Are you working off the latest version of master? Please share the error you're getting

arsaboo commented 4 months ago

@ElishaKay I am not seeing any error, but it is not reading any local files. Here are the contents of my gpt-researcher folder (which has the my-docs folder). Here's how I have added it to my .env:

DOC_PATH=./my-docs
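A relative value such as ./my-docs is resolved against the working directory of the running process (inside the container, when running under Docker), so it can help to confirm what the path actually points to. The sketch below is a hypothetical helper, describe_doc_path, not part of gpt-researcher. It reads DOC_PATH from the environment rather than parsing .env, and it assumes the library resolves the path the same way.

```python
import os
from pathlib import Path


def describe_doc_path(doc_path: str) -> dict:
    """Resolve doc_path against the current working directory and list its files."""
    resolved = Path(doc_path).expanduser().resolve()
    files = []
    if resolved.is_dir():
        # Walk the directory recursively and report paths relative to it.
        files = sorted(
            p.relative_to(resolved).as_posix()
            for p in resolved.rglob("*")
            if p.is_file()
        )
    return {"resolved": str(resolved), "exists": resolved.is_dir(), "files": files}


if __name__ == "__main__":
    info = describe_doc_path(os.environ.get("DOC_PATH", "./my-docs"))
    print("working directory:", os.getcwd())
    print("DOC_PATH resolves to:", info["resolved"])
    print("directory exists:", info["exists"])
    for name in info["files"]:
        print("  ", name)
```

Running this from the same directory (or the same container) as the researcher shows whether the folder is visible there at all.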

ElishaKay commented 4 months ago

@arsaboo what type of files are you trying to read? It looks like there was a bug with reading CSVs (it required the pandas dependency).

I opened the PR for that here

arsaboo commented 4 months ago

I don't think I saw an error. It's just that the local files were not used. I used a pdf file

ElishaKay commented 4 months ago

@arsaboo ask chat.openai to help you debug the PDF Loader in document.py

Write in the prompt that you want GPT to generate a Python file to debug the PDF Loader library in the above document.py file

Then place the gpt-generated debugger python file within the gpt-researcher repo & run it from the command line. That should hopefully help you get closer to the root cause
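Before debugging the loader itself, a stdlib-only check can rule out files that are missing, empty, or not really PDFs. This is a sketch of a first diagnostic step, not the PDF loader in document.py. check_pdf is a hypothetical helper that only looks for the %PDF- signature near the start of the file.

```python
import os
from pathlib import Path


def check_pdf(path) -> str:
    """Classify a file as 'missing', 'empty', 'not a pdf', or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    with p.open("rb") as fh:
        head = fh.read(1024)
    # PDF files carry a "%PDF-" signature at (or very near) the start.
    return "ok" if b"%PDF-" in head else "not a pdf"


if __name__ == "__main__":
    doc_dir = Path(os.environ.get("DOC_PATH", "./my-docs"))
    for pdf in sorted(doc_dir.rglob("*.pdf")):
        print(f"{pdf}: {check_pdf(pdf)}")
```

A file that reports ok can still yield no text, for example a scanned PDF without a text layer, so this only narrows the search.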

amscosta commented 3 months ago

Hi, it is working for me using my local DOC_PATH="/home/c0274/docs-to-gpt-researcher" and the Quick Start instructions (steps 1, 2, 3).

However, it is not working with the following suggested snippet:

    from gpt_researcher import GPTResearcher
    import asyncio

    async def get_report(query: str, report_type: str, report_source: str) -> str:
        researcher = GPTResearcher(query, report_type, report_source)
        research_result = await researcher.conduct_research()
        report = await researcher.write_report()
        return report

    if __name__ == "__main__":
        query = "what is the effect of the speed of lvad support?"
        report_type = "research_report"
        report_source = "local"
        report = asyncio.run(get_report(query, report_type, report_source))
        print(report)

It keeps searching the web only. I tried:

    report_source = "local"
    report_source = "documents"
    report_source = "path/to/my/pdf"
    report_source = "path/to/my/pdf/my.pdf"

No success.
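One thing that may be worth ruling out: the snippet above passes all three values positionally, whereas the traceback earlier in the thread shows them passed as keywords. If the third positional parameter of the installed constructor is not report_source, the value "local" would be bound to some other parameter without raising an error. The sketch below shows how to inspect that binding. ExampleResearcher and its parameter order (including other_option) are hypothetical and are not the library's actual signature.

```python
import inspect


class ExampleResearcher:
    """Hypothetical constructor; the parameter order is made up for illustration."""

    def __init__(self, query, report_type="research_report",
                 other_option=None, report_source="web"):
        self.query = query
        self.report_type = report_type
        self.other_option = other_option
        self.report_source = report_source


def show_binding(func, *args, **kwargs) -> dict:
    """Return which parameter each argument would be bound to, defaults included."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


if __name__ == "__main__":
    # Positional call: the third value lands on whatever parameter comes third.
    print(show_binding(ExampleResearcher, "q", "research_report", "local"))
    # Keyword call: the value reaches report_source regardless of order.
    print(show_binding(ExampleResearcher, query="q",
                       report_type="research_report", report_source="local"))
```

Calling show_binding on the real GPTResearcher class with the same arguments would show where "local" ends up in the installed version.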

assafelovic commented 2 months ago

@amscosta did you try upgrading gpt-researcher to latest? pip install gpt-researcher -U

I'm closing this for now since it is stale, but please feel free to reopen if the issue persists. Thanks!