Azure-Samples / azure-search-openai-demo

A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
https://azure.microsoft.com/products/search
MIT License

Prepdocs.py crashing with module error, not able to create/load the search index #1078

Open gw37 opened 6 months ago

gw37 commented 6 months ago

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting

Minimal steps to reproduce

Run prepdocs.ps1 (which calls prepdocs.py) from a copy of the repo freshly cloned today.

Any log messages given by the failure

Running "prepdocs.py"
Traceback (most recent call last):
  File "C:\Files\repo1\scripts\prepdocs.py", line 10, in <module>
    from prepdocslib.blobmanager import BlobManager
  File "C:\Files\repo1\scripts\prepdocslib\blobmanager.py", line 7, in <module>
    import fitz  # type: ignore
    ^^^^^^^^^^^
  File "C:\Files\repo1\scripts\.venv\Lib\site-packages\fitz\__init__.py", line 22, in <module>
    from fitz.fitz import *
  File "C:\Files\repo1\scripts\.venv\Lib\site-packages\fitz\fitz.py", line 14, in <module>
    from . import _fitz
ImportError: DLL load failed while importing _fitz: The specified module could not be found.

Expected/desired behavior

Prepdocs would work

OS and Version?

Windows 11 22H2; also reproduces on Windows 10 22H2

azd version?

azd version 1.5.0 (commit 012ae734904e0c376ce5074605a6d0d3f05789ee)

pamelafox commented 6 months ago

What version of Python is this in?

gw37 commented 6 months ago

We see this in Python 3.11.6 and 3.11.7.

tonybaloney commented 6 months ago

Introduced in https://github.com/Azure-Samples/azure-search-openai-demo/pull/1056

fitz is missing from the prepdocs dependency input; it might be a transitive dependency that's unique to certain platforms.

Thanks for the report. I'll test this out on a Windows machine

tonybaloney commented 6 months ago

Update after trying to reproduce this: fitz is part of PyMuPDF, so it seems that package didn't install correctly.

This is a known issue with PyMuPDF on certain versions of Windows. They have a solution here: https://github.com/pymupdf/PyMuPDF/blob/main/docs/installation.rst#problems-after-installation
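
To tell a missing package apart from one that is installed but fails to load (which is what the `DLL load failed while importing _fitz` traceback suggests), a small check along these lines may help. It is a sketch under assumptions: `diagnose` is a hypothetical helper, not something in the repo or in PyMuPDF, and it only uses the standard library.

```python
import importlib
from importlib.metadata import PackageNotFoundError, version


def diagnose(module_name: str, dist_name: str) -> str:
    """Classify a package as 'ok', 'not-installed', or 'broken-install'."""
    try:
        # Raises PackageNotFoundError if pip never installed the distribution.
        version(dist_name)
    except PackageNotFoundError:
        return "not-installed"
    try:
        # A failed native-library load surfaces as ImportError here.
        importlib.import_module(module_name)
    except ImportError:
        return "broken-install"
    return "ok"


if __name__ == "__main__":
    print(diagnose("fitz", "PyMuPDF"))
```

A result of `broken-install` would match the situation in this issue: the package metadata is present, but the compiled `_fitz` extension cannot be loaded, so the installation notes linked above are the place to look next.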

gw37 commented 6 months ago

Thanks, I'll test this out and update here if there are any further issues.