Azure-Samples / graphrag-accelerator

One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
https://github.com/microsoft/graphrag
MIT License
1.65k stars, 250 forks

[BUG] Python version/environment for notebooks #158

Open mkoivi opened 3 weeks ago

mkoivi commented 3 weeks ago

Describe the bug

Please define the exact Python requirements for running the notebooks.

Both notebooks fail on Azure ML compute instances.

For the 'Python 3.10 - SDK v2' environment I get:


ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 9
      6 from pathlib import Path
      7 from zipfile import ZipFile
----> 9 import magic
     10 import pandas as pd
     11 import requests

ModuleNotFoundError: No module named 'magic'

For the 'Python 3.8 - AzureML' environment I get this error:

TypeError                                 Traceback (most recent call last)
Cell In[8], line 7
      1 def upload_files(
      2     file_directory: str,
      3     storage_name: str,
      4     batch_size: int = 100,
      5     overwrite: bool = True,
      6     max_retries: int = 5,
----> 7 ) -> requests.Response | list[Path]:
      8     """
      9     Upload files to a blob storage container.
     10
   (...)
     19     (i.e. a few seconds before. The solution "in practice" is to sleep a few seconds and try again.
     20     """
     21     url = endpoint + "/data"

TypeError: unsupported operand type(s) for |: 'type' and 'types.GenericAlias'
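
For context on the second error: writing a union of types as `X | Y` in an annotation is only supported from Python 3.10, and the built-in generic `list[Path]` only from Python 3.9, so the signature above cannot be evaluated on older interpreters. The following is a minimal sketch of a backward-compatible signature using `typing.Union` and `typing.List`. The `Response` class is a hypothetical stand-in for `requests.Response` so the snippet runs on its own, and the function body is a placeholder, not the notebook's upload logic.

```python
from pathlib import Path
from typing import List, Union


class Response:
    """Hypothetical stand-in for requests.Response (assumption for this sketch)."""


def upload_files(
    file_directory: str,
    storage_name: str,
    batch_size: int = 100,
    overwrite: bool = True,
    max_retries: int = 5,
) -> Union[Response, List[Path]]:
    """Placeholder body: list the files that would be uploaded.

    Only the return annotation matters here; it is written with
    typing.Union and typing.List instead of `|` and `list[...]`.
    """
    return sorted(p for p in Path(file_directory).iterdir() if p.is_file())
```

Another option, under the same assumptions, is to keep the original annotation and add `from __future__ import annotations` at the top of the cell, which defers evaluation of annotations so the `|` expression is never executed at definition time.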

jgbradley1 commented 3 weeks ago

Thanks @mkoivi. For the Python 3.10 environment in AML, did you receive the ModuleNotFoundError after running the pip install cell? magic is an interesting package because on some operating systems it requires a system package to be installed (e.g. via apt-get install) before the Python package will work. We use the package primarily as an automated safeguard to identify file encoding issues before users attempt to upload data to the accelerator.

In my limited research, magic seemed like the best package to classify file encoding. If you're aware of a better approach, I'd be happy to swap out magic for something else.
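
One possible alternative, sketched here only as an illustration and not as what the accelerator does, is a standard-library check that a file decodes cleanly as UTF-8. This swaps encoding detection for a plain validity test: it assumes UTF-8 is the encoding the upload expects, and it cannot tell you which other encoding a failing file uses. The function name `is_utf8` and the `chunk_size` parameter are made up for this example.

```python
import codecs
from pathlib import Path


def is_utf8(path: Path, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` decodes cleanly as UTF-8.

    The file is read in chunks through an incremental decoder, so a
    multi-byte character split across two chunks is handled correctly.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                decoder.decode(chunk)
        # Flush the decoder: raises if the file ends mid-character.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True
```

Because this needs no system package, it would avoid the libmagic dependency, at the cost of not identifying the actual encoding of files that fail the check.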

mkoivi commented 3 weeks ago

The ModuleNotFoundError occurred when running the second cell, which contains the import statements. Unfortunately I'm not very familiar with Python libraries, so I'd like the deployment to be as straightforward as possible.
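
A pre-flight cell along the following lines could make this kind of failure easier to diagnose. It is a sketch under assumptions: the minimum version (3.10) is inferred from the `|` annotation in the traceback above, the module list is taken from the imports shown there, and `check_environment` is a name made up for this example.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 10), modules=("magic", "pandas", "requests")):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_version:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_version)
        problems.append(f"Python {wanted}+ is required, found {found}")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module '{name}' is not installed")
    return problems


for problem in check_environment():
    print("Environment problem:", problem)
```

Note that finding the `magic` module does not guarantee that the underlying libmagic system library is present; the import itself can still fail if that library is missing.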