Open myonlylonely opened 4 days ago
I guess there is a WaitLock() within THttpAsyncServerConnection.AsyncResponse which would ensure that the main DoRequest thread is finished.
But perhaps the flags may indeed not be properly set. Try with my next commit.
Thanks for quick reply. My question is how to write proper callback function? For example:
procedure TRawAsyncServer.OnAsyncDb(Statement: TSqlDBPostgresAsyncStatement;
Context: TObject);
var
ctxt: THttpServerRequest absolute Context;
begin
if (Statement = nil) or
not Statement.Step then
ctxt.ErrorMessage := 'asyncdb failed'
else
ctxt.SetOutJson('{"id":%,"randomNumber":%}',
[Statement.ColumnInt(0), Statement.ColumnInt(1)]);
ctxt.OnAsyncResponse(ctxt);
end;
OnAsyncResponse
do have a lock to ensure it execute after DoRequest
, but before I execute OnAsyncResponse
, I would like to change the output of the request, like the example above. Is the ErrorMessage
, SetOutJson
could also cause race conditions?DoRequest
lock release too late?Concurrent access to the THttpServerRequest could indeed be a problem. But in practice, result fields (ErrorMessage, or SetOutJson) are perfectly fine. Only changing the flags or the status code may be an issue. This is the reason why there is a RespStatus
optional parameter to the OnAsyncResponse
event handler. And I suspect we don't need to change the flags.
The lock has a timeout only waiting for the current main connection thread to set its state. So the delay should be almost minimal, and certainly always < 1ms. So this part should be safe.
Anyway, the race condition is very unlikely to happen, because the main connection thread is just executing a few lines of non-blocking code with no syscall, whereas the async call is much complex, involving the network layer and a lot of syscalls. It may happen only on a really overloaded system, with numerous threads and all cores at 100%. In which case, the problems will appear elsewhere than in our web server part. Unless you did actually see such a race condition appear on production or during testing?
I'm afraid it did happen during testing. These threads are executed on different cores, and it is not the only program running on that core. So it did happen, but like you said, very low chance. Is that possible we could add another callback that executed after calling OnAsyncResponse, make sure the user defined callback function it self is also protected by the locks?
Please try with my latest commits:
If the task finshed too early, it could still cause problem. The client never get the response.
function TRawAsyncServer.async(ctxt: THttpServerRequest): cardinal;
begin
ExecuteAsync(ctxt, OnAsync);
result := ctxt.SetAsyncResponse;
//If the Task finished here, after SetAsyncResponse but before the ResponseFlag is set, the response will never sent to the client, and the connection is kept waiting until client timeout because of the flag check.
end;
I don't think so. The TryLock should wait for ResponseFlag to be set.
Oh, you are right. I'm looking at the old version, which is LTS 2.3. The flag check is moved to the lock protection in the mater branch.
I have another question. If client has a timeout setting, which disconnect before the data is sent, the connection object would be freed. The callback function then called with the ctxt pointer which owns some garbage. How should I write safe code in something like this?
This should be handled by
if IsDangling or
(Sender <> fRequest) or
(fClosed in fFlags) then
exit;
Perhaps https://github.com/synopse/mORMot2/commit/1a03df57 would help a little more, if the request pointer is reused between connections. At least ConnectionID should be genuine per connection. Edit: no, this won't harden anything because both pointers are equal.
How could IsDangling
function really check for dangling pointers?
It is a raw pointer just like in C++, the memory could be resued for other objects, even read only access TPollAsyncConnection.fHandle
variable itself could possibly cause an access violation in that case.
Or even worse, the memory was not reused yet and kept that value, the object is freed, and still could access because the connection id is the same.
The connection ID won't be the same with a new connection. There is a GC of closed connections, so the pointers are not freed instantly, but reused for another connection. In which case, I have added a fAsyncConnectionID32 field to ensure this is the actual request, not a recycled connection. And even if it has been released and reused for anything else after 2 seconds of GC, the recycled pointer will also be detected.
If the async task process takes 10 seconds(just for example), and the request object is freed. The pointer to the request object which I saved earlier could point to a memory that could already been allocated for other code, not even by mORMot. In this case, the code access could cause access violation, could pass the check or not, anything could happen. Actually, there is no way to actually detect a dangling pointer with raw pointer. And even there is a way to detect danging pointer, I have to write logic(such as setOutJson) before OnAsyncResponse, that skipped your check(which is wrong any way). Actually, if you use any 3rd memory manager under debug mode (for example, FastMM under FullDebugMode), it will throw access violation for sure.
The following example, Sleep simulate a task that took a long time. and before the task is finshed, the client could disconnect the connection, causing connection recycled, after 2 seconds, the object is freed. Then the callback has a dangling pointer ctxt. If you use FastMM5 and enable FullDebugMode, OnAsyncResponse will throw an access violation for sure. Because FastMM force the recycled memory to be written some garbage flag, cauing the acess violation to be triggered. In real cases, these memory could be allocated by other code or not immediately reused, so the crash is more likely to be appeared random.
function TServer.asyncdb(ctxt: THttpServerRequest): cardinal;
var
aTask: ITask;
begin
aTask := TTask.Create(
procedure
begin
Sleep(10000);
G_Server.OnAsyncDb(ctxt);
end);
aTask.Start;
Result := ctxt.SetAsyncResponse;
end;
procedure TServer.OnAsyncDb(Context: TObject);
var
ctxt: THttpServerRequest absolute Context;
begin
ctxt.OnAsyncResponse(ctxt);
end;
The new code would detect such issue, and avoid any GPF.
But you are right: the main point is that you could access ctxt
out of scope for sure in end-user code, before mORMot code itself.
So the GPF would occur before.
Actually,if you enabled FastMM5 with FullDebugMode,you will see the access violation also happened inside AsyncResponse with no user code before that. Again, there is no way to detect dangling pointer with raw pointer. And please do some research about dangling pointers, there is no way to do proper detect that using raw pointers since the memory released could be reused for literally anything.
Perhaps a real demo would help.
I have created a THttpAsyncServer in the demo, listen to 8080 port. Click the "Test Req" button on the form will make a request to the server, and has a timeout 1 second. The request will Sleep 8 seconds, then call the OnAsyncResponse
function. It has no user code in the callback, just calling OnAsyncResponse
itself.
And I enabled FastMM5 under FullDebugMode. There will be a access violation inside OnAsyncResponse
every time.
I have uploaded the full test project, including all it's dependencies. Just Open the project in Delphi, and compile it. mORMot2 and FastMM5 are pulled from the latest commit in master branch.
HTTPServerTest.zip
I have some problems with the techempower-bench example. I think there is a race condition between the DoRequest thread and the thread which execute the callback function. Please check if I am right or wrong.