Closed t2y closed 1 year ago
I can't follow your thoughts. Removing the mutex check from GetLastError
is not ideal in my opinion. Can you explain/do you know what caused the messageMutex to hang? Because a slow reader and therefore a blocked buffered channel is something we cannot account for.
@cpuschma I changed GetLastError() method to lock for searchResponse.Next()
in #440. See below.
So I just revert my previous changes since searchResponse.Next()
doesn't handle it.
Another problem. This failure is rarely a race condition, even if the messageMutex locks. I guess it doesn't reproduce after running again.
https://github.com/go-ldap/ldap/actions/runs/5599882841/jobs/10241454178?pr=447
We cannot guarantee that the the Connection.err
is guarded atomically without a mutex. Again, that's not an option.
I still don't see what the exact problem is, maybe because of the language barrier. Lets break things down:
What am I missing here? :sweat:
Connection.err is guarded atomically without a mutex. Again, that's not an option.
I see. I agree with using mutex to get Connection.err
. But the previous code didn't have mutex before I worked #440. Do you think it still needs mutex?
After #440, two goroutine access to Connection.err
, and rarely it occurs a data race condition (see above CI failure), and sometimes it occurs a deadlock issue. Because searchResponse.Next() access to Connection.err
. As the same as you, I thought no problem when I was working on #440, but it's difficult to imagine since it's a timing issue.
You can probably reproduce easily. This is how to reproduce in my environment.
SearchAsync()
to get about a thousand entriesAdditionally, searchResponse.Next() doesn't need to access Connection.err
. At that time, I just thought it was a shortcut to get an error. If this code does not exist, the error occurs in the search process next time.
Agreed @t2y :+1:
Admittedly, the missing mutex for GetLastError was an oversight of me. At the time it wasn't designed for parallelism like SearchAsync
in mind. Can you adjust your branch to keep the mutex but remove the GetLastError
calls from the Next
method?
@cpuschma Thanks for your review. I follow you.
The previous change is merged from #440.
I'm implementing #422 using the search asynchronous feature. I found a deadlock issue with the low-sized buffered channel in my local environment. When the buffered channel is full, a goroutine waits to send to the channel. But sometimes, searchResponse.Next() occurs deadlock since the messageMutex cannot get a lock in Conn.GetLastError().
While working at #440, I found Conn.err is a race condition. So I added messageMutex.Lock() code, which causes a deadlock issue with certain conditions (e.g., a low-sized buffered channel). That makes a more complicated problem.
~This code is young, and we can revert quickly.~