MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Modin not compatible with AML? #91850

Closed lon-tierney closed 1 year ago

lon-tierney commented 2 years ago

Upon attempting to change the pandas import on a current AML compute node, the import is not found. Following the Modin site, running pip install modin produced several errors, which leads me to believe it is incompatible with current AML compute node implementations:

ERROR: pyldavis 3.3.1 requires sklearn, which is not installed.
ERROR: pandas-ml 0.6.1 requires enum34, which is not installed.
ERROR: fbprophet 0.7.1 requires cmdstanpy==0.9.5, which is not installed.
ERROR: responsibleai 0.17.0 has requirement ipykernel<6.0, but you'll have ipykernel 6.6.0 which is incompatible.
ERROR: raiwidgets 0.17.0 has requirement ipykernel<6.0, but you'll have ipykernel 6.6.0 which is incompatible.
ERROR: raiwidgets 0.17.0 has requirement jinja2==2.11.3, but you'll have jinja2 2.11.2 which is incompatible.
ERROR: pyldavis 3.3.1 has requirement numpy>=1.20.0, but you'll have numpy 1.19.0 which is incompatible.
ERROR: pycaret 2.3.9 has requirement pyyaml<6.0.0, but you'll have pyyaml 6.0 which is incompatible.
ERROR: pycaret 2.3.9 has requirement scikit-learn==0.23.2, but you'll have scikit-learn 0.22.1 which is incompatible.
ERROR: pandas-profiling 3.1.0 has requirement joblib~=1.0.1, but you'll have joblib 0.14.1 which is incompatible.
ERROR: datasets 1.8.0 has requirement tqdm<4.50.0,>=4.27, but you'll have tqdm 4.63.1 which is incompatible.
ERROR: dask-sql 2022.4.0 has requirement dask[dataframe,distributed]<2022.4.1,>=2022.3.0, but you'll have dask 2.30.0 which is incompatible.
ERROR: azureml-training-tabular 1.40.0 has requirement pandas==1.1.5, but you'll have pandas 1.4.1 which is incompatible.
ERROR: azureml-train-automl-runtime 1.40.0.post1 has requirement pandas==1.1.5, but you'll have pandas 1.4.1 which is incompatible.
ERROR: azureml-contrib-notebook 1.40.0 has requirement nbconvert<6, but you'll have nbconvert 6.4.5 which is incompatible.
ERROR: azureml-automl-runtime 1.40.0 has requirement pandas==1.1.5, but you'll have pandas 1.4.1 which is incompatible.
ERROR: autokeras 1.0.16 has requirement tensorflow<=2.5.0,>=2.3.0, but you'll have tensorflow 2.2.0 which is incompatible.
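
A log like the one above is easier to read when the conflicts are grouped by the dependency that ends up installed. The following is only a minimal sketch, assuming pip prints conflicts in the "has requirement ..., but you'll have ..." form shown above; the function name `group_conflicts` and the shortened `sample` string are illustrative and not part of pip or Azure ML.

```python
import re
from collections import defaultdict

# Assumed message shape (taken from the log above):
# "ERROR: <pkg> <ver> has requirement <spec>, but you'll have <dep> <ver> which is incompatible."
CONFLICT = re.compile(
    r"ERROR: (?P<pkg>\S+) (?P<pkg_ver>\S+) has requirement (?P<spec>.+?), "
    r"but you'll have (?P<dep>\S+) (?P<dep_ver>\S+) which is incompatible\."
)


def group_conflicts(pip_output):
    """Map each (dependency, installed version) to the packages whose pins it violates."""
    grouped = defaultdict(list)
    for match in CONFLICT.finditer(pip_output):
        key = (match["dep"], match["dep_ver"])
        grouped[key].append((match["pkg"], match["spec"]))
    return dict(grouped)


# Hypothetical shortened sample built from three lines of the log above.
sample = (
    "ERROR: azureml-training-tabular 1.40.0 has requirement pandas==1.1.5, "
    "but you'll have pandas 1.4.1 which is incompatible. "
    "ERROR: azureml-automl-runtime 1.40.0 has requirement pandas==1.1.5, "
    "but you'll have pandas 1.4.1 which is incompatible. "
    "ERROR: pycaret 2.3.9 has requirement pyyaml<6.0.0, "
    "but you'll have pyyaml 6.0 which is incompatible."
)

for (dep, version), pinned_by in group_conflicts(sample).items():
    print(f"{dep} {version} conflicts with:")
    for pkg, spec in pinned_by:
        print(f"  {pkg} requires {spec}")
```

Run against the full log, this kind of grouping would show that three azureml packages pin pandas==1.1.5 while the environment ends up with pandas 1.4.1. The "requires ..., which is not installed" lines use a different message shape and are deliberately skipped by this sketch.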

Are there updated instructions for using Modin, or other suggestions for dealing with large files (8+ GB) for data exploration?


YutongTie-MSFT commented 2 years ago

Thanks for the feedback! We are currently investigating and will update you shortly.

YutongTie-MSFT commented 2 years ago

@sdgilley Hello Sheri, could you please check whether there is any update on support for Modin? The document was last updated on 11/04/2021. Could you check whether a recent change caused the conflict, or whether supportability has changed? Thanks a lot.

sdgilley commented 2 years ago

assign: @samuel100

bjorhn commented 2 years ago

Is there an update regarding this issue? It seems impossible to get Modin (with Ray) working properly using either pip or conda. I somehow managed to get it installed in a compute instance last week, which I am unable to reproduce, but even that instance produces an error on certain dataframe operations, and I can find no references to that error anywhere. It seems to work okay with Dask.

sdgilley commented 2 years ago

assign: @ssalgadodev

ssalgadodev commented 1 year ago

@lon-tierney Modin can be installed on a compute instance; however, Azure ML does not offer guidance or support for resolving the dependencies needed to get Modin running. We are not able to assist any further, but you are free to install Modin on the compute instance.

ssalgadodev commented 1 year ago

please-close