I have a LightGBM fit job that runs in a for-loop for rolling back validation.
The job fails on a multi-node cluster with the log error `Connection Refused`. After checking the failed tasks, I found that the executor failed with the detailed error message `java.lang.ArrayIndexOutOfBoundsException`, which in turn caused the `Connection Refused` error.
Meanwhile, the job runs on a single-node cluster without any issue.
The DataFrame sent to the model has around 48,000 records, partitioned as follows:
Partition 0 has 19000 records
Partition 1 has 18000 records
Partition 2 has 7000 records
Partition 3 has 4000 records
The issue is not fixed by calling `df.repartition(5)`.
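For anyone trying to reproduce the partition layout, here is a minimal sketch of how the per-partition record counts could be collected and summarized. It assumes the job is written in PySpark (the issue does not say so explicitly). The helper names `partition_summary` and `rows_per_partition` are hypothetical and not part of SynapseML. The sketch only measures how uneven the partitions are; it does not claim that the imbalance causes the exception.

```python
from typing import Dict, List


def partition_summary(counts: List[int]) -> Dict[str, float]:
    """Summarize how unevenly rows are spread across partitions."""
    total = sum(counts)
    mean = total / len(counts)
    smallest = min(counts)
    return {
        "total": total,
        "mean": mean,
        # Guard against a zero count to avoid a division error.
        "largest_over_smallest": (
            max(counts) / smallest if smallest else float("inf")
        ),
        "largest_over_mean": max(counts) / mean,
    }


def rows_per_partition(df) -> List[int]:
    """Count rows in each non-empty partition of a Spark DataFrame."""
    # Imported lazily so partition_summary works without Spark installed.
    from pyspark.sql.functions import spark_partition_id

    rows = (
        df.groupBy(spark_partition_id().alias("pid"))
        .count()
        .orderBy("pid")
        .collect()
    )
    # Note: partitions with zero rows produce no group and are not listed.
    return [row["count"] for row in rows]


# Record counts as reported in this issue (partitions 0 to 3).
reported = [19000, 18000, 7000, 4000]
summary = partition_summary(reported)
print(summary)
```

With a live DataFrame, `partition_summary(rows_per_partition(df))` would give the same summary before and after `df.repartition(5)`, which makes it easy to confirm whether the repartition actually changed the layout that reaches the fit call.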
Hi @dciborow, I can see that a fix PR has been created. Will the fix be available for com.microsoft.azure:synapseml_2.12:0.11.4-spark3.3? Thanks in advance.
SynapseML version
com.microsoft.azure:synapseml_2.12:0.11.4-spark3.3
System information
Describe the problem
I have a LightGBM fit job that runs in a for-loop for rolling back validation. The job fails on a multi-node cluster with the log error `Connection Refused`. After checking the failed tasks, I found that the executor failed with the detailed error message `java.lang.ArrayIndexOutOfBoundsException`, which in turn caused the `Connection Refused` error. Meanwhile, the job runs on a single-node cluster without any issue.
The DataFrame sent to the model has around 48,000 records, partitioned as follows:
Partition 0 has 19000 records
Partition 1 has 18000 records
Partition 2 has 7000 records
Partition 3 has 4000 records
The issue is not fixed by calling `df.repartition(5)`.
Code to reproduce issue
Other info / logs
No response
What component(s) does this bug affect?
area/cognitive: Cognitive project
area/core: Core project
area/deep-learning: DeepLearning project
area/lightgbm: Lightgbm project
area/opencv: Opencv project
area/vw: VW project
area/website: Website
area/build: Project build system
area/notebooks: Samples under notebooks folder
area/docker: Docker usage
area/models: models related issue
What language(s) does this bug affect?
language/scala: Scala source code
language/python: Pyspark APIs
language/r: R APIs
language/csharp: .NET APIs
language/new: Proposals for new client languages
What integration(s) does this bug affect?
integrations/synapse: Azure Synapse integrations
integrations/azureml: Azure ML integrations
integrations/databricks: Databricks integrations