Open gitgud5000 opened 2 weeks ago
Hi @gitgud5000, thanks for opening this issue and sorry you had a bumpy experience. We will look into this shortly.
I'm having a similar problem: the issue appears when running on a cluster, but not when running locally or on a compute instance.
Hi @ArmandoRl1, could you give more details on your setup? @gitgud5000 already gave a good writeup but the more information we have about this the better.
Description
I have an issue running Kedro with `ThreadRunner` to execute the following pipeline:

![Pasted image 20240613004553](https://github.com/kedro-org/kedro/assets/17186026/cc17e410-c421-4a82-b0f7-e94e55db755c)

The primary layer shown in the Kedro Viz above is a series of 21 `SQLScriptDataset` objects (a `pandas.sql_dataset.SQLQueryDataset` subclass which formats input queries in a special way using parameters in the catalog and then calls `super().__init__`).

This Kedro pipeline is triggered as part of a `CommandJob` in Azure Machine Learning (AML), using a `command_job.py` which runs a Kedro session with something like this:

Problem/Error
After most or all of the datasets in the primary layer are loaded, SQLAlchemy produces the following error:
Context
In AML, these jobs can be run on two types of compute: a Compute Instance, which is an Ubuntu VM used for development, and Clusters, which are managed infrastructures that allow for the creation of single/multi-node computes for deployment.
When executing the `CommandJob`, essentially running `kedro run` with `ThreadRunner` on a Cluster, the job fails. However, this issue does not occur when running the same job on a Compute Instance, or when run locally from source using `kedro run`.

These command jobs run with the same environment image in both cases.
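For readers less familiar with thread-based runners, here is a minimal standard-library sketch of the concurrency pattern involved: many dataset loads fanned out over a thread pool, with `max_workers` capping how many run at once. This is not Kedro's implementation and touches no database. `load_dataset`, `ConcurrencyProbe`, `run_loads` and the `primary_table_*` names are hypothetical stand-ins introduced only for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyProbe:
    """Tracks how many loads are in flight at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._active -= 1
        return False


def load_dataset(name, probe):
    """Hypothetical stand-in for one SQL dataset load (no real query is run)."""
    with probe:
        time.sleep(0.01)  # pretend to wait on the database
        return f"{name}: loaded"


def run_loads(names, max_workers):
    """Load every dataset on a thread pool; return (results, peak concurrency)."""
    probe = ConcurrencyProbe()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda n: load_dataset(n, probe), names))
    return results, probe.peak


if __name__ == "__main__":
    # 21 names mirror the 21 datasets in the primary layer (names are made up).
    names = [f"primary_table_{i}" for i in range(21)]
    results, peak = run_loads(names, max_workers=4)
    print(len(results), "datasets loaded; peak concurrent loads:", peak)
```

In a real run each concurrent load would hold its own database connection, so the peak value gives a rough idea of how many connections are requested from the engine at the same moment.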
Steps to Reproduce
`CommandJob` to run the Kedro pipeline with `ThreadRunner`.

Expected Result
I would expect the job to run successfully on the cluster, as it does on other compute instances with the same configuration.
Actual Result
The job fails with the following error:
Attempts to Resolve
- Using a small `max_workers` in the runner configuration.
- Used connection parameters for the engine
- Tried different engine parameters, including:
- Different Oracle (ugh, I know) drivers
- Different versions of `oracledb` and `cx-Oracle`

no luck.

Logs
Here is a log file of a run with `'echo_pool': 'debug'` and a similar setup, with 5 `SQLScriptDataset` as input.

Running in AzureML.log

![Pasted image 20240613014731](https://github.com/kedro-org/kedro/assets/17186026/0cc90714-5f1a-403e-9bc4-d0d8f099316d)

Your Environment
- `kedro` version: 0.19.6
- `kedro-datasets` version: 3.0.0
- `cx-Oracle` version: 8.3.0
- Standard_D16_v3