UPDATE: I am able to get past this issue if I disable apply_policy_default_values, so I believe the policy defaults are what inject node_type_id and driver_node_type_id. But I can't create a policy without those fields, so I believe this is still an issue.
Description
The issue arises when using the workspace_client.jobs.create() function to create a job with specific cluster settings. The API does not accept both node_type_id and instance_pool_id at the same time; only one of them may be set. However, the JobCluster class, which is used to define the cluster settings, declares both node_type_id and instance_pool_id (defaulting to None), and I can't remove them.
Reproduction
Code Context: The job_settings dictionary, which is passed to workspace_client.jobs.create(), includes a job_clusters key. This key uses the JobCluster class to define the cluster specification. Here is a simplified version of the relevant code:
'''
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import AutoScale, DataSecurityMode
from databricks.sdk.service.jobs import JobCluster

workspace_client = WorkspaceClient()

# Cluster definition: only instance pool ids are set; node_type_id is never passed.
job_clusters = [
    JobCluster(
        apply_policy_default_values=True,
        autoscale=AutoScale(max_workers=None, min_workers=1),
        custom_tags={'application-id': '0001818'},
        data_security_mode=DataSecurityMode.SINGLE_USER,
        driver_instance_pool_id='1220-224524-shore2-pool-0dlxvf9c',
        instance_pool_id='1220-224524-shore2-pool-0dlxvf9c',
        spark_version='15.4.x-scala2.12',
        spark_conf={'spark.databricks.delta.preview.enabled': True},
    )
]

# job_clusters must be defined before it is referenced here.
job_settings = {
    "name": name,
    "tasks": tasks,
    "job_clusters": job_clusters,
    "timeout_seconds": timeout_seconds,
}

workspace_client.jobs.create(**job_settings)
'''
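For reference, a minimal sketch of the workaround described in the update above: turning off apply_policy_default_values so the cluster policy cannot inject node_type_id server-side. It mirrors the simplified repro and assumes the rest of the cluster definition is unchanged.

'''
# Workaround sketch (per the update above): with policy defaults disabled,
# the policy can no longer add node_type_id alongside the instance pool.
job_clusters = [
    JobCluster(
        apply_policy_default_values=False,  # the only change from the repro
        autoscale=AutoScale(max_workers=None, min_workers=1),
        data_security_mode=DataSecurityMode.SINGLE_USER,
        driver_instance_pool_id='1220-224524-shore2-pool-0dlxvf9c',
        instance_pool_id='1220-224524-shore2-pool-0dlxvf9c',
        spark_version='15.4.x-scala2.12',
    )
]
'''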
Expected behavior
The job is created with the instance pool id provided.
Is it a regression?
No; I tested with 0.17.0 as well.
Debug Logs
The job fails with the error:
databricks.sdk.errors.platform.InvalidParameterValue: "The field node_type_id cannot be supplied when an instance pool ID is provided."
As you can see, I am not passing node_type_id; the JobCluster class defaults it to None.
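One way to confirm what is actually sent over the wire (a sketch; it assumes the SDK's documented use of Python's standard logging module, which emits request/response details at DEBUG level):

'''
import logging

from databricks.sdk import WorkspaceClient

# Surface the SDK's DEBUG-level logging so the outgoing jobs/create
# payload can be inspected for an unexpected node_type_id.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger('databricks.sdk').setLevel(logging.DEBUG)

workspace_client = WorkspaceClient()
workspace_client.jobs.create(**job_settings)  # job_settings from the repro above
'''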
Additional context
When I ran dir() on the JobCluster instance, I could see node_type_id and driver_node_type_id both set to None.
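A quick follow-up check (a sketch; as_dict() is the serialization helper on the SDK's generated dataclasses, and fields left as None are normally omitted from the dict it produces):

'''
# Inspect what the SDK would serialize for this cluster definition.
# If node_type_id does not appear here, the value named in the error is
# presumably being injected server-side by the policy defaults.
jc = job_clusters[0]
print(jc.as_dict())
'''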