Closed lucaseckes closed 1 year ago
Thank you for your feedback. This has been routed to the support team for assistance.
@vipinnair22, can you investigate?
@ChunyuMSFT, @shuyums2, @QianqianNie could you please help with this?
It seems the difference is in how you set up the authentication. @lucaseckes Can you share more details about how the identity is set up on your compute instance vs. the pipeline?
Thanks.
The pipeline agent is a self-hosted agent. It seems from the documentation that the agent uses a Personal Access Token (PAT) to authenticate. For my compute instance, I use managed identities with Azure AD to download data from Azure storage.
Does that answer your question, @FeiDeng? This is all I know about the authentication. If you need more information, let me know how to find it.
Thanks.
@lucaseckes, thank you for the update. We want to make sure the pipeline agent's identity can access the Azure storage first. In theory, if it can access the Azure storage, the URI should also work.
Before, with the Azure ML SDK v1, I was using the method `get_by_name` of the `Dataset` class: https://learn.microsoft.com/en-us/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#azureml-core-dataset-dataset-get-by-name
It was working perfectly fine inside the azure pipelines, so I guess the pipeline agent has access to Azure storage.
In case it doesn't have access to Azure storage, can you please tell me how to give it access? Thanks
Hold on need to check with Pipeline team.
@lucaseckes by the way, do you have a run Id for the failed runs? We want to check some logs.
Unfortunately, I can't give you the run id for privacy reasons. However, you should be able to reproduce the issue with the example I gave.
I think we mask privacy-related data in logs, and the run Id is an auto-generated GUID, so it should be ok. The reason I ask for your run id is that this is mostly related to how the credential is set up; we can't reproduce that part.
Hi @FeiDeng, these are the logs of the failed pipeline with the minimal example. I hope this will help you resolve the issue logs_5370.zip
Checked. I think we may need more logs to investigate this further.
Hi @FeiDeng , what kind of logs do you want?
I can rerun the pipeline and give you the logs but I guess it will be the same.
Do you have any updates on this issue?
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github, @Azure/azure-ml-sdk.
Author: | lucaseckes |
---|---|
Assignees: | luigiw |
Labels: | `question`, `Machine Learning`, `Service Attention`, `customer-reported`, `needs-team-attention`, `CXP Attention` |
Milestone: | - |
We need the run_id or session Id. That is the only way we can check our backend logs.
Picking this up together with Lucas.
Are there any details on how fsspec handles auth? When I run on an AML compute instance and use fsspec I get the following:
```
>>> fs.ls()
Warning: Falling back to use azure cli login credentials.
If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.
```
My understanding is that `MsiAuthentication` is v1; should it not be `DefaultAzureCredential` or `ManagedIdentityCredential` for v2? It seems like the job in our case runs forever because it falls back to the interactive login option. However, on the same worker pool the v1 auth with the managed identity works flawlessly.
In either case, I am not sure how I can "force" ManagedIdentity on a non-interactive job. Is there any documentation on how `azureml-fsspec` handles the auth flow in the background? It does not seem to use https://github.com/fsspec/adlfs but rather the data-prep package? Some details here would be helpful.
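For orientation, azure-identity's `DefaultAzureCredential` documents a fallback chain roughly of the form: environment (service principal) variables, then managed identity, then CLI login, then interactive login as a last resort. The sketch below is a plain-Python illustration of that ordering only — it is not azureml-fsspec's actual implementation (which credential azureml-fsspec really tries is exactly the open question here), and the `AZURE_CLI_LOGGED_IN` flag is a made-up stand-in for the CLI check:

```python
def pick_credential(env: dict) -> str:
    """Illustrative only: mimic DefaultAzureCredential's documented
    fallback order, using environment variables as hints.

    - EnvironmentCredential: service-principal variables are set
    - ManagedIdentityCredential: an IMDS/identity endpoint is reachable
    - AzureCliCredential: an az CLI login exists (hypothetical flag here)
    - interactive (device-code) login as the last resort
    """
    if {"AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_CLIENT_SECRET"} <= env.keys():
        return "EnvironmentCredential"
    if "IDENTITY_ENDPOINT" in env or "MSI_ENDPOINT" in env:
        return "ManagedIdentityCredential"
    if env.get("AZURE_CLI_LOGGED_IN") == "1":  # hypothetical flag for this sketch
        return "AzureCliCredential"
    return "InteractiveBrowserCredential"  # blocks forever on a headless agent


# A DevOps agent with a managed identity exposes an identity endpoint:
print(pick_credential({"IDENTITY_ENDPOINT": "http://localhost:42356"}))
# → ManagedIdentityCredential
# A bare agent with no identity hints falls through to interactive login:
print(pick_credential({}))
# → InteractiveBrowserCredential
```

If the library followed such a chain, a hang on a headless agent would mean none of the non-interactive options matched; the symptom described above is consistent with falling straight through to the interactive branch.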
Hi @lucaseckes, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!
@fdroessler Fsspec will check the datastore credential first. It won't use the interactive login if credentials are provided in the datastore setup. If it is a credential-less datastore, interactive login is forced.
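As a plain-Python sketch of the rule just stated (this is not the actual azureml-fsspec source, only an illustration of the described behavior, with my own function and field names):

```python
def needs_interactive_login(datastore: dict) -> bool:
    """Illustrative: a datastore registered with an account key or a SAS
    token is 'credentialed' and can be used unattended; a credential-less
    datastore forces the interactive (device-code) login."""
    credential_type = datastore.get("credential_type")  # e.g. "account_key", "sas", None
    return credential_type not in ("account_key", "sas")


print(needs_interactive_login({"credential_type": "account_key"}))  # → False
print(needs_interactive_login({"credential_type": None}))           # → True
```

Note that the case reported below does not match this rule: the datastore there is registered with an account key, yet the interactive prompt still appears — which is what ends up being diagnosed as missing support.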
@FeiDeng could you elaborate on this? The managed identity on the Azure DevOps agent has access to the datastore/dataset. This works without any problems with the v1 SDK. "Just" changing the SDK to v2 and using `fsspec` results in the interactive login. How would the environment on an Azure DevOps worker need to be set up so that a managed identity is picked up? I can't find any details on the auth flow or other details on this page (https://learn.microsoft.com/en-us/azure/machine-learning/how-to-access-data-interactive?tabs=adls&view=azureml-api-2#access-data-from-a-datastore-uri-like-a-filesystem-preview)
@fdroessler, just to confirm, your identity has access to the workspace, right? And how is the datastore connection set up in the workspace? Which credentials is it using?
@FeiDeng yep, that is how it worked with the v1 SDK, so it must have access.
Allowed workspace managed identity access: | Yes |
Authentication type: | Account key |
That is very interesting. I'm trying to get a repro for this case. Which type of datastore is used here? Also, do you mind sharing the runId or session Id? And which versions of fsspec and azureml-core are installed?
Ok, so I have the following setup that might help you reproduce. In the pipeline below, stage `Testv1` runs through without any issues, while `Testv2` ends up with the interactive login prompt:

```
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code XXXXXXX to authenticate.
```

Datastore type: Azure Blob Storage
Datastore type: Azure Blob Storage
Please share an e-mail address to which I can share the run and session ids.
content of `test_v1.py`:

```python
from azureml.core import Dataset, Workspace
from azureml.core.authentication import MsiAuthentication
import pandas as pd

from config import settings

msi_auth = MsiAuthentication()
workspace = Workspace(
    subscription_id=settings.subscription_id,
    resource_group=settings.resource_group,
    workspace_name=settings.workspace,
    auth=msi_auth,
)
dataset = Dataset.get_by_name(workspace, name="test_v1", version="1")
dataset.download(target_path="./test", overwrite=True)

data = pd.read_csv("./test/test.csv")
assert "test" in data.columns
```
content of `requirements_v1.txt`:

```
azureml-core==1.48.0
azureml-pipeline==1.48.0
pandas==1.3.5
pydantic==1.10.5
```
content of `test_v2.py`:

```python
import pandas as pd

from config import DATA_PATH

# DATA_PATH points to the v2 azureml:// uri of the same file as above
# DATA_PATH = (
#     f"azureml://subscriptions/{settings.subscription_id}/resourcegroups/"
#     f"{settings.resource_group}/workspaces/{settings.workspace}/datastores/"
#     f"{settings.datastore_name}/paths/flavor-optimisation/ingredients-data/"
#     f"{settings.dataset_modified_date}/test.pkl.gz"
# )

data_asset = pd.read_pickle(DATA_PATH)
assert "test" in data_asset.columns
```
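For reference, the commented-out `DATA_PATH` above follows the long-form datastore URI format from the AzureML docs. A small helper (the function name and arguments are my own, purely illustrative) makes the shape explicit:

```python
def datastore_uri(subscription_id: str, resource_group: str,
                  workspace: str, datastore: str, path: str) -> str:
    """Build an AzureML v2 long-form datastore URI:
    azureml://subscriptions/<sub>/resourcegroups/<rg>/workspaces/<ws>/
    datastores/<name>/paths/<relative/path>
    """
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/datastores/{datastore}"
        f"/paths/{path}"
    )


# Hypothetical values, just to show the resulting shape:
uri = datastore_uri("00000000-sub", "my-rg", "my-ws",
                    "workspaceblobstore", "some/dir/test.pkl.gz")
print(uri)
```

Anything readable through `fsspec` (and therefore through pandas' `storage_options`-aware readers such as `read_pickle`) can take such a URI; the open question in this thread is only how the credential behind it is resolved.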
content of `requirements_v2.txt`:

```
pandas==1.3.5
pydantic==1.10.5
azureml-fsspec==0.1.0b3
pytest>=7.1.3
```
content of `azure-pipelines.yml`:

```yaml
trigger:
  - main

stages:
  - stage: testv1
    displayName: Testv1
    pool: 'Build Agents'
    jobs:
      - job: testv1
        displayName: TestV1
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '3.9'
          - script: |
              pip install --upgrade pip
            displayName: 'Update pip'
          - script: |
              pip install -r requirements_v1.txt
            displayName: 'Install development dependencies'
          - script: |
              python test_v1.py
            displayName: 'Run the tests'
  - stage: testv2
    displayName: Testv2
    pool: 'Build Agents'
    jobs:
      - job: testv2
        displayName: Testv2
        steps:
          - task: UsePythonVersion@0
            inputs:
              versionSpec: '3.9'
          - script: |
              pip install --upgrade pip
            displayName: 'Update pip'
          - script: |
              pip install -r requirements_v2.txt
            displayName: 'Install development dependencies'
          - script: |
              python test_v2.py
            displayName: 'Run the tests'
```
@FeiDeng any luck reproducing the above case on your end?
Thanks for the details. It looks like this part is missing for fsspec. We will add this support soon. Thank you.
@FeiDeng any news on this? Can we keep this open?
Still actively working on this change. Will let you know when it's released.
Describe the bug
I need to access data from azure storage during inference time (i.e. download Azure ML v2 data assets). I am following the steps from this documentation: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-access-data-interactive?tabs=adls#access-data-from-a-datastore-uri-like-a-filesystem-preview
I am using the method `read_pickle` of pandas to access the data via its URI, because the data is stored as a pickle file. The code runs perfectly fine in a matter of seconds on my compute instance. However, when used inside azure-pipelines, the pipeline runs until timeout (set to 1h).
When I look at the logs of the azure pipelines I get this message:
To Reproduce
I created a minimal example to reproduce the issue:

- `requirements.txt`
- `config.py`
- `azure-pipelines.yml`
- `test.py`
- `run_tests.sh`
Additional context
This related issue is about the best practice for downloading data assets locally: https://github.com/Azure/azure-sdk-for-python/issues/26213