Azure / Kusto-Lightingest

Kusto Lightingest tool

Retry on failure #11

Closed randomaccess3 closed 3 days ago

randomaccess3 commented 2 months ago

I'm new to LightIngest, so apologies if this is covered elsewhere, but I couldn't see it in the docs.

When trying to ingest files from my local machine I occasionally get errors like "[InnerException: KustoClientTemporaryStorageRetrievalException]: Failed to retrieve temporary storage". When I come back later and manually ingest the files they work fine, and based on the error it's something to do with Azure. I haven't tried to move the files to an Azure blob and do the ingest that way.

It would be great if there was a command-line argument for retries, which I think might deal with the case where an exception is thrown because the temporary storage isn't available.

Thanks

ohadbitt commented 2 months ago

Hi. This might be an issue with the direct ingest approach (ingesting directly to the query service). Do you have more details from your exception? Could you confirm which type of service you provide in the command-line arguments? (The query service is the main URL; the ingest service has "ingest-" in the URL.)
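For anyone wanting to check this quickly, the distinction above can be tested on the connection string itself. This is a minimal sketch, assuming the only difference is the "ingest-" prefix on the host name as described in this comment; `is_ingest_endpoint` is a hypothetical helper and is not part of LightIngest.

```python
from urllib.parse import urlparse


def is_ingest_endpoint(connection_string: str) -> bool:
    """Return True if the host in the connection string starts with "ingest-".

    Assumes the URL is the first ';'-separated part of the connection string,
    as in the "<url>;Fed=true" form used in this thread.
    """
    url = connection_string.split(";", 1)[0].strip()
    host = urlparse(url).hostname or ""
    return host.lower().startswith("ingest-")
```

A result of `False` would suggest the command is pointed at the query service rather than the ingest service.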

randomaccess3 commented 2 months ago

I don't have the exact command right now, but it was basically:

lightingest to database X, and look for Y.csv files in the source path Z

It works perfectly most of the time; it's just that on some of the larger CSVs it throws an error saying the temporary Azure storage blob wasn't ready. It continues the ingest, and then I save out the errored lines and manually run the command again, changing the source path to the folder with the CSVs that failed to upload, and it works just fine. So my thinking is that if there was a way to track the files that failed to upload due to a "KustoClientTemporaryStorageRetrievalException", then a short break followed by a retried upload would work.

ohadbitt commented 2 months ago

Well, we will need to check which flow invokes this. Can you please get the command you used, specifically the cluster URL?

randomaccess3 commented 2 months ago

.\LightIngest.exe "https://-adx.australiaeast.kusto.windows.net;Fed=true" -database:databasename -table:tablename -sourcePath:"\filepath\for_ingest" -format:csv -pattern:filename.csv -dontWait:true -i:false -ignoreFirst

This command works fine for the most part; it's just that on the larger CSVs a few will return the error saying the temporary storage blob created on Azure wasn't available. If I then take the same command and run it again (but adjust the input path specifically for the ones that failed), it's fine.

AsafMah commented 1 month ago

Does the cluster URL start with "ingest-" or not?

randomaccess3 commented 1 month ago

Yes, the ingest URL is correct.

The command that I was using works perfectly, except that it sometimes throws errors on some files because the temporary Azure blob it wants to use isn't available. So then I have to rerun it, and it works. I just end up having to move the CSVs that failed to upload the first time elsewhere, and then the upload of that subset of CSVs works perfectly.

Mostly what I'm after is a way to keep track of files that errored out due to that exception (KustoClientTemporaryStorageRetrievalException) and then rerun x number of times based on an argument.
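Until something like this exists in the tool, the behaviour could be approximated with a small wrapper script. This is a minimal sketch, assuming a failed file can be recognised by the exception name appearing in LightIngest's console output (not verified here). `run_lightingest`, `ingest_with_retries` and `make_args` are hypothetical names, not part of LightIngest.

```python
import subprocess
import time
from typing import Callable, Iterable, List

# Assumption: this text appears in the output when the transient error occurs.
TRANSIENT_MARKER = "KustoClientTemporaryStorageRetrievalException"


def run_lightingest(args: List[str]) -> str:
    """Run one LightIngest invocation and return stdout and stderr combined."""
    done = subprocess.run(args, capture_output=True, text=True)
    return done.stdout + done.stderr


def ingest_with_retries(
    files: Iterable[str],
    make_args: Callable[[str], List[str]],
    run: Callable[[List[str]], str] = run_lightingest,
    max_retries: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Ingest each file separately, retrying only the ones that hit the marker.

    Returns the files that still fail after max_retries extra attempts.
    """
    pending = list(files)
    for attempt in range(max_retries + 1):
        # Keep only the files whose output contains the transient error.
        failed = [f for f in pending if TRANSIENT_MARKER in run(make_args(f))]
        if not failed:
            return []
        pending = failed
        if attempt < max_retries:
            sleep(delay_seconds)  # short break before the next attempt
    return pending
```

Here `make_args` would build the same command line I posted earlier, with `-sourcePath` and `-pattern` narrowed to a single file, which is what I currently do by hand.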

elirandav commented 1 month ago

Can you please post here the full stack trace? @randomaccess3

randomaccess3 commented 1 month ago

Sure, I don't know how to generate these though. Can you please help me generate it?

elirandav commented 1 month ago

> Sure, I don't know how to generate these though. Can you please help me generate it?

There is a newer version with more detailed logs in this flow. It hasn’t been officially released yet, so I’ll try to reproduce it on my machine using my Kusto cluster.

> This command works fine for the most part, just that on the larger CSVs a few will return the error

How large is the CSV file?

Also, could you please add a new connection in Kusto Explorer or Kusto Web using the connection URL https://ingest-<your cluster URL>? After that, run the following command and tell us the number of lines where ResourceTypeName is equal to TempStorage.

randomaccess3 commented 1 month ago

Will have to try later in the week.

I had, say, 10-15 CSVs of around 300 MB each. It would occasionally fail on a few.

So I'd pull the ones it failed on and run the same command, and it would succeed. Nothing changed except that they were uploaded individually, which lines up with the error message saying the failure was because Azure's temporary storage wasn't ready.

ohadbitt commented 1 month ago

Fixed in version 12.2.6 https://github.com/Azure/Kusto-Lightingest/releases/tag/12.2.6

elirandav commented 1 month ago

Could you please try using the new version 12.2.6 and let us know if you still encounter this uninformative error? @randomaccess3

AsafMah commented 3 days ago

The version with the fix was released a month ago - we are closing the issue.

Feel free to re-open if anything arises.