Azure / MachineLearningNotebooks

Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft
https://docs.microsoft.com/azure/machine-learning/service/
MIT License

Creating a file dataset from a single directory in datastore requires azureml-dataset-runtime? #1912

Open Chiranmb opened 1 year ago

Chiranmb commented 1 year ago

I am trying to create a file dataset from a single directory in a datastore. I'm following the code block from

https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.data.dataset_factory.filedatasetfactory?view=azure-ml-py#azureml-data-dataset-factory-filedatasetfactory-from-files

Specifically,

from azureml.core import Dataset, Datastore

# get the datastore (workspace is an existing azureml.core.Workspace object)
datastore = Datastore.get(workspace, 'workspaceblobstore')

# create file dataset from a single directory in datastore
file_dataset_2 = Dataset.File.from_files(path=(datastore, 'image/'))

However, when I try to replicate these steps for my own Datastore, I encounter an ImportError:

ImportError: Missing required package "azureml-dataset-runtime", which can be installed by running: "c:\Users\<user>\.conda\envs\<my-conda-env-name>\python.exe" -m pip install azureml-dataset-runtime --upgrade

I am on Python 3.11.3. I tried installing azureml-dataset-runtime, but I ran into a dependency conflict that would require me to downgrade to Python 3.8.
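For anyone trying to reproduce this, here is a minimal sketch for reporting the interpreter version and whether the package is installed in the active environment. It uses only the standard library. The helper names installed_version and describe_environment are hypothetical, made up for illustration, and are not part of the Azure ML SDK.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(package_name="azureml-dataset-runtime"):
    """Collect the interpreter version and the package's install state."""
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "package": package_name,
        "installed_version": installed_version(package_name),
    }


if __name__ == "__main__":
    # Paste this output into the issue so others can compare environments.
    print(describe_environment())
```

An installed_version of None means pip has not installed the package into the environment that is running the script, which matches the ImportError above.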

Furthermore, from the PyPI page

https://pypi.org/project/azureml-dataset-runtime/

The page states that azureml-dataset-runtime "is internal, and is not intended to be used directly."

Is this intended? I am trying to mount my data for a custom ML training job using the as_mount method of the FileDataset class. Please let me know whether there is a better alternative for mounting data, or whether I am forced to use Python 3.8.

