Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License

MLTable Register Data Asset from Compute Instance Azure Machine Learning #33418

Open ilhamfadhil14 opened 9 months ago

ilhamfadhil14 commented 9 months ago

Describe the bug

I attempted to create an MLTable from local files stored on an Azure ML compute instance and to register it as a Data Asset. However, after registration, it could not be loaded.

To Reproduce

import mltable

FILE_PATH = [{"pattern": "./data/parquet-file.parquet"}]  # change this to the name of the file you want to use

tbl = mltable.from_parquet_files(FILE_PATH)

from azure.ai.ml import MLClient
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

# set the asset version and the workspace details (replace the placeholders)
VERSION = "1"
SUBSCRIPTION_ID = "<subs-id>"
RESOURCE_GROUP = "<rg-name>"
WORKSPACE_NAME = "<ws-name>"

CREDENTIAL = DefaultAzureCredential()

# connect to the AzureML workspace
# NOTE: the subscription ID, resource group, and workspace name are set above.
ml_client = MLClient(
    credential=CREDENTIAL, 
    subscription_id=SUBSCRIPTION_ID, 
    resource_group_name=RESOURCE_GROUP, 
    workspace_name=WORKSPACE_NAME
)

my_data = Data(
    path=MLTABLE_PATH,
    type=AssetTypes.MLTABLE,
    description="data asset from parquet",
    name="data_asset_name",
    version=VERSION,
)

ml_client.data.create_or_update(my_data)

data_asset = ml_client.data.get("data_asset_name", version="1")

tbl = mltable.load(data_asset.path)

Screenshot

[image not available]

Additional context

I tried changing the path name used when creating and loading the MLTable, but this did not change the final outcome.

swathipil commented 9 months ago

Hi @ilhamfadhil14 - Thanks for the detailed report. We'll take a look asap!

@azureml-github

github-actions[bot] commented 9 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.

tawan0109 commented 7 months ago

@ilhamfadhil14 Thanks for reporting this issue. Could you please try the latest version of mltable (1.6.0) and see whether the issue still persists?
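To see which mltable version the current kernel has installed, something like the following sketch might help. It uses only the standard library; the helper names (`installed_version`, `version_tuple`, `is_at_least`) are hypothetical and are not part of mltable or the Azure SDK, and the version parsing is deliberately simple (it ignores pre-release suffixes).

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '1.6.0' into (1, 6, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(package, minimum):
    """Return True if `package` is installed at version `minimum` or newer."""
    current = installed_version(package)
    if current is None:
        return False
    have, want = version_tuple(current), version_tuple(minimum)
    # Pad with zeros so that '1.6' and '1.6.0' compare as equal.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have >= want


if __name__ == "__main__":
    print("mltable installed:", installed_version("mltable"))
    print("at least 1.6.0:", is_at_least("mltable", "1.6.0"))
```

If the check reports an older version, running `pip install --upgrade mltable` in the same environment as the kernel (for example `%pip install --upgrade mltable` in a notebook cell, followed by a kernel restart) is the usual way to update it.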

sutharzan-ch commented 6 months ago

I am also getting the same issue when trying to register an MLTable as a data asset from the Azure ML Studio web interface. I created the MLTable in my workspace using a notebook, and when I try to load the data asset, it's still pointing to the MLTable in my local workspace rather than to the blob I uploaded it to. I am using the Python 3.10 - SDK v2 kernel in the ML Studio web interface. I am not sure how to get the latest version of mltable. Since I am not managing the compute, shouldn't it already be the latest version?
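One way to check where a registered asset actually points is to inspect the scheme of its path. The sketch below is only a diagnostic aid: `describe_asset_path` is a hypothetical helper, and the set of schemes treated as "cloud" is an assumption for illustration, not an official list.

```python
from urllib.parse import urlparse

# Assumption: these schemes are treated as remote storage for this check.
CLOUD_SCHEMES = {"azureml", "https", "http", "wasbs", "abfss"}


def describe_asset_path(path):
    """Classify a data asset path as 'cloud' or 'local' based on its URI scheme."""
    scheme = urlparse(path).scheme.lower()
    if scheme in CLOUD_SCHEMES:
        return "cloud"
    # Relative paths, absolute file paths, and file:// URIs all fall through to here.
    return "local"


if __name__ == "__main__":
    for candidate in ("./data/parquet-file.parquet",
                      "azureml://datastores/workspaceblobstore/paths/data/"):
        print(candidate, "->", describe_asset_path(candidate))
```

In the reproduction from the original report, calling `describe_asset_path(data_asset.path)` after `ml_client.data.get(...)` would show whether the registered asset resolves to a datastore URI or still refers to a path on the compute instance.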