Azure / azure-kusto-spark

Apache Spark Connector for Azure Kusto
Apache License 2.0

ExtendedKustoClient: Some extents were not processed and we got an empty move result'1' Please open issue if you see this trace. At: https://github.com/Azure/azure-kusto-spark/issues #375

Open liangchenmicrosoft opened 4 months ago

liangchenmicrosoft commented 4 months ago

Describe the bug

We are using Synapse Spark to write data into a Kusto table from Python. When we enable 'drop-tag' with SparkIngestionProperties, we see the error message below in the Synapse Spark logs.

2024-05-01 22:16:56,492 INFO TokenLibrary [Timer-14474]: Obtained Access token from cache
2024-05-01 22:16:56,532 INFO Utilities$ [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Trying to determine if cluster type
2024-05-01 22:16:56,532 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Attempting to get params from node config
2024-05-01 22:16:56,533 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Call to get Access token
2024-05-01 22:16:56,533 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Number of callers waiting for lock_token to access token service= 0
2024-05-01 22:16:56,533 INFO InMemoryCacheClient [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Token successfully fetched from in-memory cache
2024-05-01 22:16:56,533 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Obtained Access token from cache
2024-05-01 22:16:56,580 FATAL KustoConnector [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: ExtendedKustoClient: Some extents were not processed and we got an empty move result'1' Please open issue if you see this trace. At: https://github.com/Azure/azure-kusto-spark/issues
2024-05-01 22:16:56,580 INFO Utilities$ [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Trying to determine if cluster type
2024-05-01 22:16:56,580 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Attempting to get params from node config
2024-05-01 22:16:56,580 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Call to get Access token
2024-05-01 22:16:56,580 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Number of callers waiting for lock_token to access token service= 0
2024-05-01 22:16:56,581 INFO InMemoryCacheClient [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Token successfully fetched from in-memory cache
2024-05-01 22:16:56,581 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Obtained Access token from cache
2024-05-01 22:16:56,626 INFO Utilities$ [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Trying to determine if cluster type
2024-05-01 22:16:56,627 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Attempting to get params from node config
2024-05-01 22:16:56,627 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Call to get Access token
2024-05-01 22:16:56,627 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Number of callers waiting for lock_token to access token service= 0
2024-05-01 22:16:56,627 INFO InMemoryCacheClient [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Token successfully fetched from in-memory cache
2024-05-01 22:16:56,627 INFO TokenLibrary [Executor task launch worker for task 0.0 in stage 3.0 (TID 6)]: Obtained Access token from cache

Expected behavior

This error shouldn't occur when the drop-tag property is enabled in SparkIngestionProperties.

ag-ramachandran commented 4 months ago

Hello @liangchenmicrosoft, I will have a look at this. To troubleshoot the issue, I need the cluster URL, the Spark runtime version, and the options you are using. You can message those to my IM handle, ramacg at ms.

In the meantime, please try setting

.option("writeMode","Queued")

and test with that as well. This write mode is a relatively new change that was added to the connector to overcome some limitations. Please try it and let us know if it works.
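
For anyone trying this from PySpark, here is a minimal sketch of how the suggested option might be combined with drop-by tags. It is not taken from the reporter's job. The helper name `build_write_options` and the linked service, database, table, and tag values are hypothetical. The data source name, the `spark.synapse.linkedService` and `sparkIngestionPropertiesJson` options, and the `dropByTags` JSON key are assumptions based on the connector's documented options, so please check them against the connector version you run.

```python
import json

# Assumption: data source name used by the connector inside Synapse.
KUSTO_FORMAT = "com.microsoft.kusto.spark.synapse.datasource"


def build_write_options(linked_service, database, table,
                        drop_by_tags=None, write_mode="Queued"):
    """Build the option map for a Kusto write (hypothetical helper)."""
    opts = {
        "spark.synapse.linkedService": linked_service,  # assumed option name
        "kustoDatabase": database,
        "kustoTable": table,
        "writeMode": write_mode,  # the setting suggested in this thread
    }
    if drop_by_tags:
        # Assumption: SparkIngestionProperties is passed as JSON and the
        # drop-by tags are carried under the "dropByTags" key.
        opts["sparkIngestionPropertiesJson"] = json.dumps(
            {"dropByTags": list(drop_by_tags)}
        )
    return opts


# Placeholder values; replace with your own.
opts = build_write_options("MyKustoLinkedService", "MyDatabase", "MyTable",
                           drop_by_tags=["2024-05-01"])

# In a Synapse notebook, with an existing DataFrame `df`:
# df.write.format(KUSTO_FORMAT).options(**opts).mode("Append").save()
```

Keeping the options in a plain dict makes it easy to switch `writeMode` back to compare behaviour with and without the queued path.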