Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.

Unable to load large delta table in azure ml studio #30750

Open mustafaali96 opened 1 year ago

mustafaali96 commented 1 year ago

I am trying to read a Delta table from Azure ML. I have already created a data asset to register the Delta table, which is stored in an ADLS Gen2 location. However, when I attempt to load the data, large tables take an exceedingly long time and never finish loading; the cell keeps running for hours.

I have confirmed that small tables are returned within a few seconds, which leads me to believe the data-loading process does not scale well.

mccoyp commented 1 year ago

Hi @mustafaali96, thank you for opening an issue! I'll tag some folks who should be able to help and we'll get back to you as soon as possible. cc @azureml-github

github-actions[bot] commented 1 year ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @azureml-github @Azure/azure-ml-sdk @klaaslanghout.

humak commented 1 month ago

Are there any updates on this issue? I am facing the same problem here.

mustafaali96 commented 1 month ago

> Are there any updates on this issue? I am facing the same problem here.

No, @humak. As a workaround, you can install the deltalake package (pip install deltalake) and read the table with it directly, which is much faster for large datasets.
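
For anyone landing here, a minimal sketch of that workaround, assuming the table lives in ADLS Gen2 and you authenticate with a storage account key. The helper names (`adls_delta_uri`, `load_delta_table`) and the account, container, and path values are hypothetical and not taken from this issue. The `account_key` storage option is one of several credential options deltalake accepts for Azure; check the deltalake documentation for the one that matches your setup.

```python
from typing import Dict, Optional, Sequence


def adls_delta_uri(account: str, container: str, path: str) -> str:
    """Build an abfss:// URI for a Delta table stored in ADLS Gen2.

    Hypothetical helper: the arguments are placeholders for your own
    storage account, container (file system), and table folder.
    """
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.strip('/')}"


def load_delta_table(
    uri: str,
    storage_options: Dict[str, str],
    columns: Optional[Sequence[str]] = None,
):
    """Read a Delta table into a pandas DataFrame using deltalake.

    Passing `columns` limits the read to the listed columns, which can
    reduce the amount of data transferred for wide tables.
    """
    # Imported lazily so the URI helper works without deltalake installed.
    from deltalake import DeltaTable

    table = DeltaTable(uri, storage_options=storage_options)
    return table.to_pandas(columns=list(columns) if columns else None)


# Example usage (placeholder values, replace with your own):
#
# uri = adls_delta_uri("mystorageaccount", "mycontainer", "path/to/table")
# df = load_delta_table(
#     uri,
#     storage_options={"account_key": "<your-account-key>"},  # assumed auth method
#     columns=["col_a", "col_b"],
# )
# print(df.shape)
```

This reads the table directly from storage rather than through the Azure ML data asset, so the registered asset is only needed to look up the table location.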