BBC-Esq / VectorDB-Plugin-for-LM-Studio

Plugin that lets you ask questions about your documents including audio and video files.
https://www.youtube.com/@AI_For_Lawyers

After ingest : TypeError: Docx2txtLoader.__init__() got an unexpected keyword argument 'mode' #89

Closed Epixtome closed 10 months ago

Epixtome commented 10 months ago

I was using ChromaDB and LMStudio successfully three weeks ago. LM updated and Chroma broke. Reinstalled Chroma and it runs but I can no longer ingest. Here is the log:

create_database.py: Loading documents.
document_processor.py: Number of workers assigned: 3
QPainter::end: Painter ended with 4 saved states
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\Chris\miniconda3\Lib\concurrent\futures\process.py", line 256, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\OneDrive\Bureau\ChromaDB-Plugin-for-LM-Studio-main (1)\ChromaDB-Plugin-for-LM-Studio-main\src\document_processor.py", line 100, in load_document_batch
    data_list = [future.result() for future in futures]
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\OneDrive\Bureau\ChromaDB-Plugin-for-LM-Studio-main (1)\ChromaDB-Plugin-for-LM-Studio-main\src\document_processor.py", line 100, in <listcomp>
    data_list = [future.result() for future in futures]
                 ^^^^^^^^^^^^^^^
  File "C:\Users\Chris\miniconda3\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\miniconda3\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\Users\Chris\miniconda3\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\OneDrive\Bureau\ChromaDB-Plugin-for-LM-Studio-main (1)\ChromaDB-Plugin-for-LM-Studio-main\src\document_processor.py", line 70, in load_single_document
    loader = Docx2txtLoader(str(file_path), mode="single", strategy="fast")
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Docx2txtLoader.__init__() got an unexpected keyword argument 'mode'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Chris\OneDrive\Bureau\ChromaDB-Plugin-for-LM-Studio-main (1)\ChromaDB-Plugin-for-LM-Studio-main\src\gui_threads.py", line 7, in run
    create_database.main()
  File "C:\Users\Chris\OneDrive\Bureau\ChromaDB-Plugin-for-LM-Studio-main (1)\ChromaDB-Plugin-for-LM-Studio-main\src\create_database.py", line 46, in main
    documents = load_documents(SOURCE_DIRECTORY) # invoke document_processor.py; returns a list of document objects
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\OneDrive\Bureau\ChromaDB-Plugin-for-LM-Studio-main (1)\ChromaDB-Plugin-for-LM-Studio-main\src\document_processor.py", line 120, in load_documents
    contents, _ = future.result()
                  ^^^^^^^^^^^^^^^
  File "C:\Users\Chris\miniconda3\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Chris\miniconda3\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
TypeError: Docx2txtLoader.__init__() got an unexpected keyword argument 'mode'

Epixtome commented 10 months ago

I opened document_processor.py in Notepad, went to line 70, and removed both the mode and strategy arguments. It ingested the document and is providing answers as expected. I'm not sure what other issues this could cause, but I've "fixed" it for the moment.

BBC-Esq commented 10 months ago

I forgot to change the loader to "UnstructuredWordDocumentLoader". The change needs to be made in constants.py and document_processor.py. I think that's all, but there might be one other place.
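
For anyone hitting the same error: the traceback comes down to a constructor being called with keyword arguments it does not declare. Below is a minimal sketch of that mismatch and of one defensive way to avoid it, assuming nothing about the real LangChain classes. The two `...Stub` classes and the helpers `accepted_kwargs` and `build_loader` are hypothetical stand-ins invented for illustration, not code from this repository or from LangChain; the real signatures should be checked against the installed version.

```python
import inspect


class Docx2txtLoaderStub:
    """Hypothetical stand-in: takes only a file path, like the loader in the traceback."""

    def __init__(self, file_path):
        self.file_path = file_path


class UnstructuredWordDocumentLoaderStub:
    """Hypothetical stand-in: takes a path, a mode, and arbitrary extra keyword options."""

    def __init__(self, file_path, mode="single", **unstructured_kwargs):
        self.file_path = file_path
        self.mode = mode
        self.unstructured_kwargs = unstructured_kwargs


def accepted_kwargs(loader_cls, kwargs):
    """Return only the keyword arguments that loader_cls.__init__ can accept."""
    params = inspect.signature(loader_cls.__init__).parameters
    # A **kwargs parameter means the constructor accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


def build_loader(loader_cls, file_path, **kwargs):
    """Instantiate loader_cls, silently dropping keywords it does not declare."""
    return loader_cls(file_path, **accepted_kwargs(loader_cls, kwargs))


if __name__ == "__main__":
    # Passing the options directly reproduces the kind of failure in the log.
    try:
        Docx2txtLoaderStub("example.docx", mode="single", strategy="fast")
    except TypeError as exc:
        print(f"TypeError: {exc}")

    # Filtering first lets the same call site work with either stand-in.
    plain = build_loader(Docx2txtLoaderStub, "example.docx", mode="single", strategy="fast")
    rich = build_loader(UnstructuredWordDocumentLoaderStub, "example.docx", mode="single", strategy="fast")
    print(type(plain).__name__, type(rich).__name__)
```

Dropping unsupported keywords silently is a trade-off: it keeps ingestion running, as the manual edit to line 70 did, but it also hides the fact that options such as mode and strategy are being ignored. Switching to a loader that actually accepts them is the cleaner fix.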