Open jhellan opened 10 years ago
Hmm, that's a weird one. My only guess at this point would be that the response from the server is mangled somehow. I'd love to see the response, but 3 times out of 1100 it seems unlikely. To make it easier to capture, edit the save_changes
method to verify the result is an array, if not, dump the result to a text file or something. Is that something you could try? I can add a check in there and do something similar. E.g. throw a custom exception, and/or log the output, but I'd like to be able to reproduce the "actual conditions" to test the problem/solution. I have no real idea at this point what would cause that to happen.
Thanks. I've now instrumented our code. We'll see what that turns up.
We noticed that for batch saves and operations besides 'Add', save_changes returns a boolean. We also see that we get this error a fraction of a second after an earlier failure from AddToVirtualMachines. Is it possible that a failed save may leave behind a save operation in the queue, so that the next save gets turned into a batch save?
We've looked for races in our own code, but the block { AddToVirtualMachines; save_changes } is protected by a mutex. I don't see cases where we call Add but don't call save_changes, either.
We managed to get a log which made sense. We checked for non-array in single_save, and for batch saves. Our application doesn't do batch saves intentionally.
We got one save_changes call with op AddToVirtualMachines which resulted in "Server returned error but no message. (OData::ServiceError)". There was no associated http_code.
Turned out that the next save_changes call was indeed performed as a batch_save, which returned true. The @save_operations.length was 2.
Is the save_operations queue supposed to be cleared after the first exceptions? If not, how could we clean it up from the application? Would we have to instantiate a new service object?
It doesn't clear the queue after an exception, as you found out. The reasoning here was that if you had 5 changes and one bombed out, you could potentially remove that call. At this point though, there isn't a way to modify the @save_operations
array. This should be changed. Perhaps simply exposing that instance variable would suffice, and/or a method to clear out the array.
To reset things, at this point, I believe you'd need to instantiate a new service object like you said.
What would you recommend for a fix? Just expose the array for now?
Hmm. I think it would be cleaner to make single save and batch save two different operations. If a single save fails, it should be cleared, and it will be up to the user to resubmit.
In a batch situation, can the user tell which ops succeeded and which failed? If he can't, there's obviously no point in keeping the queue around. If he can, I guess it would be best to expose the clear queue operation.
Hi
Not strictly an issue, but I did not find an answer in the docs. What are the possible return values from 'save_changes'? I was assuming an array, possibly empty, or an exception. But we have the following lines in our code, where spf_vmm_service is an OData::Service instance:
but we occasionally see the error
from the second line. It looks like save_changes has returned 'true'. Is that possible, and do you have any idea why?
This is isn't easy for us to reproduce right now - it happened just three times during a 60 hr test run which hit this code abt 1100 times.
Do you have any ideas about this?
Jon Kåre Hellan, UNINETT AS, Trondheim, Norway