Open mcfadden8 opened 3 months ago
It hangs even when the NnfDataMovement resource in kubernetes shows that it's finished? Can you check that once you make it hang?
This part of the API has always bothered me because I think a good API should always respond as quickly as possible to the client to minimize wait time and also confirm that nothing is wrong. It's like asking someone a question and they never respond.
Is this something that you use a lot?
How do I check that? Do you happen to have a test for this? Under what circumstances does it work?
I was only attempting to use it because the documentation said that I could. I reverted back to polling with a one-second timer. But we have use cases where users just want to wait until the copy is done before proceeding.
How do I check that? Do you happen to have a test for this? Under what circumstances does it work?
As it's running (and presumably hanging), you can query the NnfDataMovement resource in k8s. You won't be able to do this in your application unless the compute nodes have k8s access, but you could do it from somewhere that does. This is basically what the DataMovementStatusRequest is doing for you:
kubectl get -n <rabbit-hostname> nnfdatamovements <request UID>
So if compute-node-1 was attached to rabbit-node-1 and the DataMovementCreateRequest returned a UID of nnf-dm-node-5vghx, you can do this to query it:
$ kubectl get nnfdatamovement -n rabbit-node-1 nnf-dm-node-5vghx
NAME                STATE      STATUS    ERROR   AGE
nnf-dm-node-5vghx   Finished   Success           4m54s
A MaxWaitTime of -1 is not going to respond until that nnfdatamovement is done. So if it's a large request, it's going to appear to hang since the response won't come until it's finished. I'm hoping that's what's happening here. If the nnfdatamovement resource is showing Finished and it's not responding, then we have an issue.
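If you want to script that check from a machine that has k8s access, here is a rough Python sketch. It only builds the same kubectl command shown above and reads STATE/STATUS out of the default table output by column position, so an empty ERROR column doesn't shift the fields. The helper names (kubectl_get_args, parse_kubectl_table) are made up for illustration and are not part of any NNF tooling.

```python
import re


def kubectl_get_args(rabbit_hostname: str, request_uid: str) -> list[str]:
    """Build the argv for the kubectl query shown in this thread."""
    return ["kubectl", "get", "nnfdatamovement", "-n", rabbit_hostname, request_uid]


def parse_kubectl_table(text: str) -> list[dict[str, str]]:
    """Parse column-aligned `kubectl get` output into one dict per row.

    Columns are located by where each header starts, so a blank cell
    (for example an empty ERROR column) comes back as an empty string.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0]
    # (column name, start offset) for every header word
    columns = [(m.group(), m.start()) for m in re.finditer(r"\S+", header)]
    rows = []
    for line in lines[1:]:
        row = {}
        for i, (name, start) in enumerate(columns):
            # A column runs until the next header starts; the last one runs to end of line
            end = columns[i + 1][1] if i + 1 < len(columns) else None
            row[name] = line[start:end].strip()
        rows.append(row)
    return rows
```

You could feed the args to subprocess.run(..., capture_output=True, text=True) and pass its stdout to parse_kubectl_table, then check whether row["STATE"] is "Finished" while the client call is still blocked.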
I reverted back to polling with a one-second timer. But we have use cases where users just want to wait until the copy is done before proceeding.
I think this is the best way to do this. It ensures that the server is responding and isn't hung.
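For reference, a minimal sketch of that polling pattern in Python. Here get_state is a hypothetical callable standing in for whatever issues the DataMovementStatusRequest with a short MaxWaitTime and returns the state string. The "Finished" value comes from the kubectl output earlier in this thread, and the overall timeout is my own addition so the caller can't wait forever.

```python
import time
from typing import Callable


def wait_for_completion(
    get_state: Callable[[], str],
    poll_interval: float = 1.0,
    overall_timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it reports "Finished" or the timeout expires.

    get_state is a placeholder for a short-wait status request; each call
    returning at all confirms the server is still responding.
    """
    deadline = clock() + overall_timeout
    while True:
        state = get_state()
        if state == "Finished":
            return state
        if clock() >= deadline:
            raise TimeoutError(f"data movement still in state {state!r} after {overall_timeout}s")
        sleep(poll_interval)
```

Users who just want to block until the copy is done can call this once. The sleep and clock parameters are only there so the loop can be tested without real delays.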
The documentation says: "", but the data movement status request call never returns.
The same call will work if I pass 1 second and continue to poll for between 5 and 10 seconds.