Open shivam10u opened 7 months ago
@pamelafox, please help!
Have you looked through the suggestions in the related threads? https://github.com/Azure-Samples/azure-search-openai-demo/issues/343 It's generally a Python environment issue, and there are some links/ideas there. If all else fails, you could use GitHub Codespaces for a clean Python environment.
Hi @pamelafox, I tried #343 and the fix in that thread works fine for me!
But I am facing issues while manually adding additional documents. Is there any way to split and index documents from the Azure portal itself? Earlier I was using Python 3.12 and downgraded to 3.10 as per your suggestion, but I still get the same error when running ./scripts/prepdocs.ps1. I even tried the troubleshooting steps here, with no luck: https://stackoverflow.com/questions/34370962/no-module-named-cffi-backend
You could try out the new integrated vectorization feature! That's all cloud-based indexing, using indexers and skills for chunking and vectorizing. Here's a PR: https://github.com/Azure-Samples/azure-search-openai-demo/pull/1159
You could check out that branch, follow the README steps in it, and re-run "azd up".
Hello @pamelafox,
I appreciate your prompt responses. I am currently working on implementing dynamic data ingestion within Azure. Specifically, I am planning to create an Azure Timer Trigger function that uploads PDF files to Blob Storage. Subsequently, a Blob Trigger Azure Function will be employed to execute the required scripts, such as prepdocs.py or prepdocs.ps1. These scripts will handle tasks such as splitting and indexing, eliminating the need for local file uploads and manual execution of commands.
The goal is to streamline the process, ensuring that every time a PDF is uploaded to the Blob, the associated script is automatically executed, providing the latest data responses in the Blob storage.
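As a rough illustration of the blob-trigger idea described above (a sketch only: the function layout, the trigger path, and the way prepdocs.py accepts a single file are my assumptions, not something this repo documents):

```python
import sys


def build_prepdocs_command(blob_name: str, scripts_dir: str = "scripts") -> list:
    """Build the command line that would re-run prepdocs.py for one uploaded PDF.

    Passing a single data file as a positional argument is an assumption here;
    check prepdocs.py's actual argument parser before relying on it.
    """
    return [sys.executable, f"{scripts_dir}/prepdocs.py", f"data/{blob_name}"]


# Inside an Azure Functions app (requires the azure-functions package, not
# imported here), the blob-triggered handler would look roughly like:
#
# import subprocess
# import azure.functions as func
#
# app = func.FunctionApp()
#
# @app.blob_trigger(arg_name="blob", path="content/{name}",
#                   connection="AzureWebJobsStorage")
# def reindex(blob: func.InputStream):
#     cmd = build_prepdocs_command(blob.name.split("/")[-1])
#     subprocess.run(cmd, check=True)
```

The helper is factored out of the trigger so the command construction can be tested without an Azure environment.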
I have attempted to explore the provided repository, but unfortunately, the README.md file lacks detailed steps. It primarily contains information about changes made without explicit instructions on how to set up the environment.
Moreover, when attempting to check out the code, I encountered the following error:
```
Microsoft Windows [Version 10.0.22000.2713]
(c) Microsoft Corporation. All rights reserved.

C:\Test Dynamic>git clone https://github.com/Azure-Samples/azure-search-openai-demo.git
Cloning into 'azure-search-openai-demo'...
remote: Enumerating objects: 4234, done.
remote: Counting objects: 100% (498/498), done.
remote: Compressing objects: 100% (284/284), done.
remote: Total 4234 (delta 288), reused 350 (delta 193), pack-reused 3736
Receiving objects: 100% (4234/4234), 5.51 MiB | 8.17 MiB/s, done.
Resolving deltas: 100% (2302/2302), done.
Updating files: 100% (284/284), done.

C:\Test Dynamic>cd Azure-Samples/azure-search-openai-demo
The system cannot find the path specified.

C:\Test Dynamic>cd azure-search-openai-demo

C:\Test Dynamic\azure-search-openai-demo>git checkout gh pr checkout 1159
error: pathspec 'gh' did not match any file(s) known to git
error: pathspec 'pr' did not match any file(s) known to git
error: pathspec 'checkout' did not match any file(s) known to git
error: pathspec '1159' did not match any file(s) known to git

C:\Test Dynamic\azure-search-openai-demo>gh pr checkout 1159
'gh' is not recognized as an internal or external command,
operable program or batch file.

C:\Test Dynamic\azure-search-openai-demo>git checkout 1159
error: pathspec '1159' did not match any file(s) known to git

C:\Test Dynamic\azure-search-openai-demo>git checkout srbalakr/int-vectorizer
Switched to a new branch 'srbalakr/int-vectorizer'
branch 'srbalakr/int-vectorizer' set up to track 'origin/srbalakr/int-vectorizer'.

C:\Test Dynamic\azure-search-openai-demo>npm install
npm ERR! code ENOENT
npm ERR! syscall open
npm ERR! path C:\Test Dynamic\azure-search-openai-demo/package.json
npm ERR! errno -4058
npm ERR! enoent ENOENT: no such file or directory, open 'C:\Test Dynamic\azure-search-openai-demo\package.json'
npm ERR! enoent This is related to npm not being able to find a file.
npm ERR! enoent
```
Could you kindly guide me on how to obtain the code locally, and if there are specific steps I should follow? Additionally, is there an alternative platform or method where we could connect for quicker responses or assistance?
Your guidance in achieving this dynamic data ingestion workflow would be highly valuable.
Thank you for your assistance.
It looks like you did manage to get the branch checked out, but then you ran npm install from the root folder. There's no package.json there, so it failed. The only folder where you'd ever run that would be app/frontend.
Once you have it checked out, you'd need to then run:

```
azd auth login
azd env set USE_FEATURE_INT_VECTORIZATION true
azd up
```
Hi @pamelafox, I tried checking out the branch in #1159 as you directed and deployed it with azd up, but ended up with the error below.
```
C:\Test Dynamic\azure-search-openai-demo\scripts\prepdocs.py:402: DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()
Processing files...
Using local files in C:\Test Dynamic\azure-search-openai-demo/data/
Ensuring search index gptkbindex exists
Creating gptkbindex search index
Traceback (most recent call last):
  File "C:\Test Dynamic\azure-search-openai-demo\scripts\prepdocs.py", line 408, in <module>

Deploying services (azd deploy)

  (✓) Done: Deploying service backend
```
Alternatively, I tried the "Import and vectorize data" wizard on the portal itself, but no luck: I get search results in the portal, but the web app behaves as if no documents are indexed.
Please help me fix this end to end. My objective is simply to upload the docs to Azure Blob Storage and immediately start getting responses.
@shivam10u I am going to test out the branch myself soon, so I'll see if I have the same error. cc @srbalakr
I suspect this error is due to an already existing indexer with the same name under two search services in the portal. Thank you @pamelafox, please keep me posted; I'm looking forward to good news. @srbalakr, do you have any suggestions or a fix?
@shivam10u, I got the same error message; the last few lines are:

```
azure.core.exceptions.HttpResponseError: (InvalidRequestParameter) The request is invalid. Details: definition : Unknown vectorizer name 'myOpenAI' in Vector Search Profile 'embedding_config'.
Code: InvalidRequestParameter
Message: The request is invalid. Details: definition : Unknown vectorizer name 'myOpenAI' in Vector Search Profile 'embedding_config'.
Exception Details: (UnknownVectorSearchVectorizerConfiguration) Unknown vectorizer name 'myOpenAI' in Vector Search Profile 'embedding_config'. Parameters: definition
Code: UnknownVectorSearchVectorizerConfiguration
Message: Unknown vectorizer name 'myOpenAI' in Vector Search Profile 'embedding_config'. Parameters: definition
ERROR: failed running post hooks: 'postprovision' hook failed with exit code: '1', Path: '/tmp/azd-postprovision-1998154611.sh'. : exit code: 1
```
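For context, this error means the index definition declares a vector search profile that references a vectorizer name that was never defined. In the REST/JSON shape (a sketch from memory of the preview API schema; field names should be checked against the API version in use, and the endpoint and deployment are placeholders), the name in `profiles[].vectorizer` must match an entry in `vectorizers[]`:

```json
{
  "vectorSearch": {
    "profiles": [
      {
        "name": "embedding_config",
        "algorithm": "hnsw_config",
        "vectorizer": "myOpenAI"
      }
    ],
    "vectorizers": [
      {
        "name": "myOpenAI",
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "https://<your-openai-resource>.openai.azure.com",
          "deploymentId": "<embedding-deployment>"
        }
      }
    ]
  }
}
```

If the `vectorizers` array is missing or its entry is named differently, the service rejects the index with exactly the `UnknownVectorSearchVectorizerConfiguration` error shown above.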
Hi @pamelafox , Did you get a chance to look into it?
Hi @pamelafox, did you get a chance to try this? Also, when can we expect this PR to be merged? It's another important feature, since it would let us avoid splitting, chunking, and indexing locally.
```
[notice] To update, run: C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\scripts\python.exe -m pip install --upgrade pip
Running "prepdocs.py"
Traceback (most recent call last):
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\prepdocs.py", line 7, in <module>
    from azure.identity.aio import AzureDeveloperCliCredential
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\__init__.py", line 10, in <module>
    from ._credentials import (
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\_credentials\__init__.py", line 5, in <module>
    from .authorization_code import AuthorizationCodeCredential
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\_credentials\authorization_code.py", line 9, in <module>
    from .._internal.aad_client import AadClient
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\_internal\__init__.py", line 5, in <module>
    from .aad_client import AadClient
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\_internal\aad_client.py", line 11, in <module>
    from .aad_client_base import AadClientBase
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\_internal\aad_client_base.py", line 20, in <module>
    from .aadclient_certificate import AadClientCertificate
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\azure\identity\_internal\aadclient_certificate.py", line 7, in <module>
    from cryptography import x509
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\cryptography\x509\__init__.py", line 7, in <module>
    from cryptography.x509 import certificate_transparency
  File "C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo\scripts\.venv\lib\site-packages\cryptography\x509\certificate_transparency.py", line 11, in <module>
    from cryptography.hazmat.bindings._rust import x509 as rust_x509
ModuleNotFoundError: No module named '_cffi_backend'
thread '<unnamed>' panicked at C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\pyo3-0.18.3\src\err\mod.rs:790:5:
Python API call failed
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
pyo3_runtime.PanicException: Python API call failed
PS C:\sampleproject Version 2.0\Latest sampleproject 2.0 on project1098\azure-search-openai-demo>
```

Please note: cffi and everything in requirements.txt is installed. It works fine with azd up but not when run manually.
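One quick way to check whether the interpreter you are running actually sees the compiled cffi backend (a diagnostic sketch, not part of the repo; run it with the same python.exe from the .venv, since the "works with azd up but not manually" symptom often means two different interpreters are involved):

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each module name to True if it is importable
    from the current interpreter's environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # The three modules involved in the traceback above.
    for name, ok in check_modules(["cffi", "_cffi_backend", "cryptography"]).items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If `_cffi_backend` shows MISSING here but pip says cffi is installed, the script is almost certainly being run with a different Python than the one the packages were installed into.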