go-ldap / ldap

Basic LDAP v3 functionality for the GO programming language.
Other
2.19k stars 352 forks source link

searchAsync checks cancelled context only after receiving an ldap response #495

Closed SaschaRoland closed 3 months ago

SaschaRoland commented 3 months ago

I have a use case with long lasting ldap requests to retrieve many single entries from a ldap server. To stay responsive for application shutdown, ... during such a query, searchAsync is used with a Context, that will be cancelled once the query has to be terminated early.

This works in principle, but the ldap server also shows the behaviour that it can take a long period of time (minutes) between the returned sinlge search results. Unfortunately, cancelling the async search is also blocked during such periods, resp. the cancelled context will not be recognized until some event from the server side occurs.

Regarding the implementation in response.go, this is probably caused by waiting for new responses via the default case in the according select statement:

func (r *searchResponse) start(ctx context.Context, searchRequest *SearchRequest)
    go func() {
        ....
        foundSearchSingleResultDone := false
        for !foundSearchSingleResultDone {
            select {
            case <-ctx.Done():
                r.conn.Debug.Printf("%d: %s", msgCtx.id, ctx.Err().Error())
                return
            default:
                r.conn.Debug.Printf("%d: waiting for response", msgCtx.id)
                packetResponse, ok := <-msgCtx.responses
        ....    

So, the check for ctx.Done() only happens between blocking reads on msgCtx.responses.

I guess, handling new repsonses in a distinct case would solve this problem:

func (r *searchResponse) start(ctx context.Context, searchRequest *SearchRequest)
    go func() {
        ....
        foundSearchSingleResultDone := false
        for !foundSearchSingleResultDone {
                r.conn.Debug.Printf("%d: waiting for response", msgCtx.id)
            select {
            case <-ctx.Done():
                r.conn.Debug.Printf("%d: %s", msgCtx.id, ctx.Err().Error())
                return
            case packetResponse, ok := <-msgCtx.responses:
        ....    

As far as I can see, there is no reason against using a specific case, but I'm far from having sufficient overview of the code. Could someone explain, whether there is a need for the default case? Many thanks in advance.

t2y commented 3 months ago

Until last week, I used the searchAsync() feature asynchronously by goroutine in my environment (So I didn't notice the blocking). I changed the process synchronously. Then, as it happens, I found that the Close() process was not completed during syncrepl communication.

err := c.conn.Close()

This issue might be related to what I encountered. I'm going to investigate today and report them after that.

t2y commented 3 months ago

I added the select statement as below PR.

https://github.com/go-ldap/ldap/pull/440/files#diff-70edbe70dd2a92eac7b18d6af5a3cc0539c167987a1d5c0d40c9451cd0f41513R105

I indicate this as just non-blocking select, so I also think the suggested statement is better.

case packetResponse, ok := <-msgCtx.responses:
t2y commented 3 months ago

About my comment https://github.com/go-ldap/ldap/issues/495#issuecomment-2028948164 , I found another issue between an application that uses this library and this library. Because sendResponse() doesn't expect persistent search like syncrepl, messageMutex locks until getting the response. I think that it will be fixed by passing context from an application. I will try to fix #326.

cpuschma commented 3 months ago

Fixed by PR #496

SaschaRoland commented 3 months ago

That's been quick, thanks a lot :+1: