Closed lon-tierney closed 1 year ago
Thanks for the feedback! We are currently investigating and will update you shortly.
@sdgilley Hello Sheri, could you please check whether there is any update on support for Modin? The document was last updated on 11/04/2021. Could you also check whether a recent update caused the conflict, or whether anything has changed regarding supportability? Thanks a lot.
assign: @samuel100
Is there an update regarding this issue? It seems impossible to get Modin (with Ray) working properly using either pip or conda. I somehow managed to get it installed on a compute instance last week, though I cannot reproduce that install, and even that instance produces an error on certain dataframe operations that I can find no references to anywhere. It seems to work okay with Dask.
assign: @ssalgadodev
@lon-tierney Modin can be installed on a compute instance; however, Azure ML does not offer guidance or support for the dependencies needed to get Modin running. We are not able to assist any further, but you are free to install Modin on the compute instance.
After attempting to change the pandas import on a current AML compute node, the import is not found. Following the Modin site, pip install modin produced several errors, which leads me to believe it is incompatible with current AML compute node implementations:
ERROR: pyldavis 3.3.1 requires sklearn, which is not installed.
ERROR: pandas-ml 0.6.1 requires enum34, which is not installed.
ERROR: fbprophet 0.7.1 requires cmdstanpy==0.9.5, which is not installed.
ERROR: responsibleai 0.17.0 has requirement ipykernel<6.0, but you'll have ipykernel 6.6.0 which is incompatible.
ERROR: raiwidgets 0.17.0 has requirement ipykernel<6.0, but you'll have ipykernel 6.6.0 which is incompatible.
ERROR: raiwidgets 0.17.0 has requirement jinja2==2.11.3, but you'll have jinja2 2.11.2 which is incompatible.
ERROR: pyldavis 3.3.1 has requirement numpy>=1.20.0, but you'll have numpy 1.19.0 which is incompatible.
ERROR: pycaret 2.3.9 has requirement pyyaml<6.0.0, but you'll have pyyaml 6.0 which is incompatible.
ERROR: pycaret 2.3.9 has requirement scikit-learn==0.23.2, but you'll have scikit-learn 0.22.1 which is incompatible.
ERROR: pandas-profiling 3.1.0 has requirement joblib~=1.0.1, but you'll have joblib 0.14.1 which is incompatible.
ERROR: datasets 1.8.0 has requirement tqdm<4.50.0,>=4.27, but you'll have tqdm 4.63.1 which is incompatible.
ERROR: dask-sql 2022.4.0 has requirement dask[dataframe,distributed]<2022.4.1,>=2022.3.0, but you'll have dask 2.30.0 which is incompatible.
ERROR: azureml-training-tabular 1.40.0 has requirement pandas==1.1.5, but you'll have pandas 1.4.1 which is incompatible.
ERROR: azureml-train-automl-runtime 1.40.0.post1 has requirement pandas==1.1.5, but you'll have pandas 1.4.1 which is incompatible.
ERROR: azureml-contrib-notebook 1.40.0 has requirement nbconvert<6, but you'll have nbconvert 6.4.5 which is incompatible.
ERROR: azureml-automl-runtime 1.40.0 has requirement pandas==1.1.5, but you'll have pandas 1.4.1 which is incompatible.
ERROR: autokeras 1.0.16 has requirement tensorflow<=2.5.0,>=2.3.0, but you'll have tensorflow 2.2.0 which is incompatible.
Are there updated instructions for using Modin, or other suggestions for dealing with large files (8+ GB) for data exploration?
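For reference, one way to explore a file of this size without installing anything beyond the pandas already on the node is to stream it in chunks. The following is only a minimal sketch, not Azure ML or Modin guidance: the function name `summarize_column`, the file path, and the column name are hypothetical, and it assumes the data is a CSV with a numeric column.

```python
import pandas as pd


def summarize_column(csv_path, column, chunksize=100_000):
    """Stream a CSV in chunks and return (row_count, total, mean) for one numeric column.

    Hypothetical helper for illustration; not part of pandas, Modin, or Azure ML.
    """
    row_count = 0
    total = 0.0
    # usecols restricts parsing to the single column we need, and chunksize
    # makes read_csv yield DataFrames of at most `chunksize` rows each,
    # so the whole file is never held in memory at once.
    for chunk in pd.read_csv(csv_path, usecols=[column], chunksize=chunksize):
        values = chunk[column].dropna()  # ignore missing values
        row_count += len(values)
        total += float(values.sum())
    mean = total / row_count if row_count else float("nan")
    return row_count, total, mean
```

Calling `summarize_column("data.csv", "amount")` (both names are placeholders) keeps memory use bounded by the chunk size rather than the file size. The trade-off is that only aggregations that can be accumulated chunk by chunk fit this pattern; operations that need the whole frame at once, such as a global sort, do not.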