Azure-Samples / azure-batch-samples

Azure Batch and HPC Code Samples

C# Basic Tutorial #239

Closed wbirsak closed 5 years ago

wbirsak commented 6 years ago

Hi All,

I am using the dotnet example (modified with a GUI) to create a batch with one job and multiple tasks. The task only moves the file to the node and back to storage.

When I try to create a pool, the application seems to stall. When I look in the portal, I see a batch being created, then started, and then it turns idle. The program does not continue, it does create the job and tasks...

When I press pause, I can't see where it is in the call stack.

Any help is welcome.

Wesley

darylmsft commented 6 years ago

Does the UI app look like it's deadlocked? Look for a .Wait() call... classic "deadlock on captured context".

You will need to find the exact line that causes the program to not continue before we can help much.

wbirsak commented 6 years ago

Hi Daryl,

Yes, it seems frozen.

This part is being executed:

            Owner.Pool = batchClient.PoolOperations.CreatePool(
                poolId: poolId,
                targetDedicatedComputeNodes: 1,
                virtualMachineSize: "small",            
                cloudServiceConfiguration: new CloudServiceConfiguration(osFamily: "4"));   

            Owner.Pool.StartTask = new StartTask
            {
                CommandLine = "cmd /c (robocopy %AZ_BATCH_TASK_WORKING_DIR% %AZ_BATCH_NODE_SHARED_DIR%) ^& IF %ERRORLEVEL% LEQ 1 exit 0",
                ResourceFiles = resourceFiles,
                WaitForSuccess = true
            };

            await Owner.Pool.CommitAsync();

The pool is created, the node is idle.

Thanks in advance for your help.

wbirsak commented 6 years ago

Hi, I removed all the async calls, and it now continues. I have now run into an exception while creating the task.

batchClient.JobOperations.AddTask(jobId, tasks);

results in: Addition of a task failed with unexpected status code. Details: TaskId=MaskTask_0, Status=ClientError, Error.Code=InvalidPropertyValue, Error.Message=The value provided for one of the properties in the request body is invalid. RequestId:bb86fea1-5a3d-4a0f-a58f-4fe703590ad9 Time:2018-03-12T20:42:17.6440817Z, Error.Values=[PropertyName=blobSource, PropertyValue=https://testing.blob.core.windows.net/upload/1 - Copy (10).zip?sv=2017-07-29&sr=b&sig=W5AmGC12KC2AV5W0MyThJ98MCZRX%2FV2LmJJhnWw8xlo%3D&se=2018-03-12T22%3A42%3A16Z&sp=r, Reason=The specified BlobSource is not a valid URI

I am now looking into this error. When I copy and paste the URL (the one above is altered) into a browser, the file can be downloaded.

darylmsft commented 6 years ago

I did not mean to suggest that removing async calls was the appropriate answer. Generally we encourage async patterns. However, that technology comes with extra diligence in message pumping contexts like UI/aspx to avoid deadlock on the captured context. An all-synchronous solution is "fine" but less modern and performant in some circumstances.

Browsers are very "helpful" with the strings put into the address bars. Try loading the string into the constructor of a URI...
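To illustrate the suggestion above, here is a minimal sketch that checks a string with .NET's own URI rules rather than a browser's. The class name, method name, and example URLs are hypothetical, and the Batch service's validation is not necessarily identical to .NET's, so treat a passing check as a hint rather than a guarantee.

```csharp
using System;

public static class BlobSourceCheck
{
    // Hypothetical helper: returns true only when the string is already a
    // well-formed absolute URI, i.e. it contains no unescaped characters
    // such as a literal space.
    public static bool IsWellFormed(string url)
    {
        return Uri.IsWellFormedUriString(url, UriKind.Absolute);
    }
}
```

Under these assumptions, `BlobSourceCheck.IsWellFormed("https://example.invalid/upload/Test 1.zip")` should be false, while the same URL with the space written as `%20` should be true. A browser silently escapes the space, which is why pasting the string into the address bar hides the problem.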

wbirsak commented 6 years ago

OK. I know, but I was trying to make it work. I will test it. What are the main benefits of using async calls? (I am quite new to this.)

Thanks for your help.

wbirsak commented 6 years ago

Hi, I copied and pasted it into a Uri and it seems to be correct. I saw a difference at the end of the URIs: the ones that seem to work end with &sp=w, and the one that fails ends with &sp=r.

darylmsft commented 6 years ago

Well, the primary example of a win for the async pattern... is UI :) I would suggest looking up the examples from the C# blogs. The downside is the deadlock when inartfully transitioning to a synchronous pattern (.Wait(), .Result, etc.). In the more general case, the async pattern allows IO to be offloaded: think "I/O completion port" in Windows... and away from threaded IO.

Not seeing your code, I would guess the deadlock occurred in the "previous stack frame"... where a .Wait() is/was probably being called. But it is hard to tell w/o seeing the full sources. The deadlock is not a trivial topic, and managing the captured context is usually trivial... unless the code is running in UI/aspx... like here.
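The pattern described above can be sketched as follows. This is not the poster's code: `PoolSetupSketch`, `CreatePoolAsync`, and the commented-out click handler are hypothetical stand-ins, and `Task.Delay` takes the place of the real `CommitAsync` call. The idea is that a UI thread blocked in `.Wait()` or `.Result` cannot run the continuation that an inner `await` posts back to it, so the usual remedies are to `await` all the way up to the event handler and, in non-UI helper code, to opt out of capturing the context.

```csharp
using System;
using System.Threading.Tasks;

public static class PoolSetupSketch
{
    // Hypothetical stand-in for the pool creation plus CommitAsync.
    // ConfigureAwait(false) tells the await not to resume on the
    // captured (UI) context, so this helper never needs the UI thread.
    public static async Task<string> CreatePoolAsync(string poolId)
    {
        await Task.Delay(50).ConfigureAwait(false); // simulated service call
        return poolId;
    }
}

// In a UI project the caller would look roughly like this (hypothetical):
//
// private async void CreateButton_Click(object sender, EventArgs e)
// {
//     // Awaiting keeps the UI thread free while the pool is created.
//     string id = await PoolSetupSketch.CreatePoolAsync("pool1");
//
//     // Blocking instead, e.g. CreatePoolAsync("pool1").Result, can deadlock
//     // whenever any await further down captures the UI context.
// }
```

Keeping the handler `async` and awaiting the call is generally the simpler of the two remedies, since it does not depend on every awaited method underneath using `ConfigureAwait(false)`.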

It is also worth noting that exception handling with await becomes substantially more complicated. Again, the nuances of the AggregateException are a bit out of scope for us here. Be aware that try/catch with AE might look good but can be hiding/ignoring exceptions in certain circumstances.
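One of those nuances can be shown with a small sketch: the same failure surfaces as a different exception type depending on whether the task is blocked on or awaited. All names here are hypothetical, and the failing method only simulates an error; it does not call the Batch client.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionShapes
{
    // Hypothetical async operation that always fails.
    private static async Task FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("simulated failure");
    }

    // Blocking with Wait() wraps the failure in an AggregateException.
    public static string CaughtViaWait()
    {
        try { FailAsync().Wait(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Awaiting rethrows the original exception instead.
    public static async Task<string> CaughtViaAwaitAsync()
    {
        try { await FailAsync(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }
}
```

A `catch (InvalidOperationException)` block therefore works around an `await` but is skipped around a `.Wait()`, where the original exception sits in `AggregateException.InnerExceptions`.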

xingwu1 commented 6 years ago

The URI includes a space (' ') character, so you should escape the URL before you pass it. sp is the SAS key permission. Since you are using the blob as a resource file, you should provide the read permission, which is sp=r.

xingwu1 commented 6 years ago

About the UI freezing after calling an async/await function: please refer to this blog post to fix it: https://blogs.msdn.microsoft.com/pfxteam/2011/01/13/await-and-ui-and-deadlocks-oh-my/.

wbirsak commented 6 years ago

Thanks, all, for the tips. I will look into this. First I want to make it work, then make it work with async calls.

wbirsak commented 6 years ago

Hi, everything seems to be working properly (synchronously), but I now have the same issue with spaces in the URI when I try to upload a file in the task part of the example. A file name like "Test1.zip" works, but a file name like "Test 1.zip" will cause the task (upload) to fail.

What is the best approach to fix this?

xingwu1 commented 6 years ago

When you generate the SAS URL for the file, the URL should look like https://..../Test%201.zip?... You won't have a problem using this kind of URL as a resource file. The ' ' should be escaped as %20 in the URL.
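As a rough sketch of that escaping, assuming the URL is assembled by hand from a container URL, a blob name, and a SAS query string: the helper name, parameter names, and example values below are hypothetical, and only the blob name is escaped, because the SAS query string is assumed to be escaped already.

```csharp
using System;

public static class SasUrlSketch
{
    // Hypothetical helper: percent-encodes only the blob name, so a
    // space becomes %20, and leaves the container URL and the
    // already-escaped SAS query string untouched.
    public static string BuildBlobUrl(string containerUrl, string blobName, string sasQuery)
    {
        return containerUrl.TrimEnd('/') + "/" + Uri.EscapeDataString(blobName) + sasQuery;
    }
}
```

For example, `SasUrlSketch.BuildBlobUrl("https://example.invalid/upload", "Test 1.zip", "?sp=r")` yields `https://example.invalid/upload/Test%201.zip?sp=r`. Note that `Uri.EscapeDataString` also encodes `/`, so this sketch only suits blob names without virtual directories.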

wbirsak commented 6 years ago

Hi,

That part is OK, I have no issues getting the file to the compute node. I want to upload the file back to the output container by calling this method in TaskApplication: private static void UploadFileToContainer(string filePath, string containerSas)

filePath is local on the node, so it can have a ' ' in the file name... but when I try to upload it, it raises an exception (invalid URI...) and the task fails. When a file name has no spaces, it works fine.

wbirsak commented 6 years ago

Hi, I found the problem. When a task is created, the following command line is built: string taskCommandLine = String.Format("cmd /c %AZ_BATCH_NODE_SHARED_DIR%\\TaskApplication.exe {0} 3 \"{1}\"", inputFile.FilePath, outputContainerSasUrl);

{0} = inputFile.FilePath. When, for example, "test.zip" is passed, the task runs with Args[0] = "test.zip". But when "test 1.zip" is passed, the task runs with Args[0] = "test", Args[1] = "1.zip", and so on. Fix: replace {0} with \"{0}\".
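The fix can be sketched as a small helper. The class and method names are hypothetical; the command line itself follows the one quoted in this thread, with the first placeholder wrapped in escaped double quotes so that a path containing spaces reaches TaskApplication.exe as a single argument.

```csharp
using System;

public static class TaskCommandLineSketch
{
    // Hypothetical helper: quotes both the input file path and the
    // container SAS URL. It does not handle values that themselves
    // contain double quotes.
    public static string Build(string inputFilePath, string outputContainerSasUrl)
    {
        return String.Format(
            "cmd /c %AZ_BATCH_NODE_SHARED_DIR%\\TaskApplication.exe \"{0}\" 3 \"{1}\"",
            inputFilePath,
            outputContainerSasUrl);
    }
}
```

With this version, `Build("test 1.zip", sasUrl)` produces `... TaskApplication.exe "test 1.zip" 3 "<sasUrl>"`, so Args[0] is expected to arrive as `test 1.zip` rather than being split at the space.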

alfpark commented 5 years ago

Closing as it looks like the issue has been resolved.