yorkie-team / yorkie

Yorkie is a document store for collaborative applications.
https://yorkie.dev
Apache License 2.0
783 stars 145 forks source link

Improve the client to be safe in multiple Goroutines #190

Open hackerwins opened 3 years ago

hackerwins commented 3 years ago

What would you like to be added:

Improve the client to be safe in multiple Goroutines.

Why is this needed:

Go Client is likely to be used on the server, so it is better to be safe in multiple routines, such as other database drivers.

shmruin commented 3 years ago

Could you tell me more specific on this topic? I checked clients has an Activate method, calling go DB's ActivateClient interface. client is actiavated in yorkie_server's ActivateClient. Do you mean that one client should support multiple database drivers at the same time? (supported by goroutines?)

hackerwins commented 3 years ago

@shmruin Thanks for your interest in this issue.

Users will manage the application's documents using Yorkie Go Client on the server just like a normal DB driver. DB drivers used by the web-based server that handles user requests should be multi-routine safe. If it's not safe in multiple routines, users need to create a new Client for every request.

When I implemented my application using Yorkie, I had to create a client for every request: https://github.com/metis-labs/metis-server/blob/280633d9541f98c776df5b1508ede1d25d8c5762/server/projects/projects.go#L44-L78

We can reduce the connection overhead if we reuse the client in all requests after activating the client.

shmruin commented 3 years ago

@hackerwins Thanks for your explanation. I got what this issue means. But still I need to make some points clear.

In Yorkie's structure, A client has a document. So, when it comes to the meaning of multi-routine safe, does it mean multiple modification should be safe in that client's document? and it guarantees you can use one client for every request?

(I think maybe this is due to my short understanding of metis-server's createProject meaning)

Do I miss something or get totally wrong way?

hackerwins commented 3 years ago

@shmruin The final goal of this issue is to make the Client and Document work safely in multi-routines.

I think we can do this issue step by step. How about starting this issue by protecting client's document-related states(client.attachments) when Attach, Detach, Sync, Watch, UpdateMetadata and PeersMapByDoc are executed in multi-routines.

shmruin commented 3 years ago

@hackerwins Hi, after the previous meet up, I check the points of Client that have any effect of write/read as I attach file here. The point is, that sync.mutex only in Client.attachment may not work. The problem is all fields in client is related, especially for the Client.client and Client.attachment. For example, In Attach method, c.client.AttachDocument sent and after all changePackage works, and finally local c.attachments changes. I'm anxious this would break the identicalness among the clients. So, How about using RWMutex in Client struct? It would be like below.

type Client struct {
    conn        *grpc.ClientConn
    client      api.YorkieClient
    dialOptions []grpc.DialOption

    id           *time.ActorID
    key          string
    metadataInfo types.MetadataInfo
    status       status
    attachments  map[string]*Attachment
    rwMutex      sync.RWMutex   // Read Write Mutex for Client
}
// Attach attaches the given document to this client. It tells the agent that
// this client will synchronize the given document.
func (c *Client) Attach(ctx context.Context, doc *document.Document) error {
    if c.status != activated {
        return ErrClientNotActivated
    }

    defer c.rwMutex.Unlock()    // Unlock defer

    c.rwMutex.Lock() // Write lock for this client

    doc.SetActor(c.id)

    pbChangePack, err := converter.ToChangePack(doc.CreateChangePack())
    if err != nil {
        return err
    }

    res, err := c.client.AttachDocument(ctx, &api.AttachDocumentRequest{
        ClientId:   c.id.Bytes(),
        ChangePack: pbChangePack,
    })
    if err != nil {
        log.Logger.Error(err)
        return err
    }

    pack, err := converter.FromChangePack(res.ChangePack)
    if err != nil {
        return err
    }

    if err := doc.ApplyChangePack(pack); err != nil {
        log.Logger.Error(err)
        return err
    }

    doc.SetStatus(document.Attached)
    c.attachments[doc.Key().BSONKey()] = &Attachment{
        doc:   doc,
        peers: make(map[string]types.MetadataInfo),
    }

    return nil
}
// Metadata returns the metadata of this client.
func (c *Client) Metadata() types.Metadata {

    defer c.rwMutex.RUnlock()

    c.rwMutex.RLock()   // Read lock case

    metadata := make(types.Metadata)
    for k, v := range c.metadataInfo.Data {
        metadata[k] = v
    }

    return metadata
}

Does this way too naive or make the whole process too slow? And still, the Watch and Sync is quite a big question... Hope some opinions here ;)

hackerwins commented 3 years ago

@shmruin Thanks for the detailed check.

I have inserted the attached excel table.

  Call Another Mehtods *grpc.ClientConn api.YorkieClient []grpc.DialOption *time.ActorID string types.MetadataInfo status map[string]*Attachment Multiple routines problems
conn client dialOptions id key metadataInfo status attachments
Client.Dial NewClient Create Client, wrapper function of NewClient Make a new client is thread-safe
Client.Close Deactivateconn.Close
(mutex inside gRpc conn)
Write     Read *     Read / Write *   Close while other routines Read/Write?=> Context closing required?Active <-> Deactive while other routines use?=> A/A, D/D already returns nil=> D/A, A/D OK?
Client.Activate client.ActivateClient   Write   Write     Write  
Client.Deactivate client.DeactivateClient   Write   Read     Read / Write  
Client.Attach client.AttachDocument   Write   Read     Read Write Attachment should be in write mutexClient Write always follow by Attachment Write=> Client.client Cannot be seperated from Attachment=> e.g.) Attachment is locked but Client wirte alread flys to other
Client.Detach client.DetachDocument   Write   Read     Read Write
Client.Sync sync   Write *           Read / Write *
Client.Watch client.WatchDocumentPeerMapByDoc   Write   Read   Read   Read / Write
Client.UpdateMetadata         Read   Read / Write Read Read Write mutex to metadatainfo
Client.ID         Read         Read Mutex (Free among reads, Lock if write)
Client.Key           Read      
Client.Metadata             Read    
Client.PeersMapByDoc                 Read
Client.IsActive               Read  
Client.sync client.Push
Pullattachment.doc.CreateChangePack
attatchment.doc.ApplyChangePack
  Write   Read     Read Read / Write Refer to client.Sync

Here are some facts and thoughts.

  1. The gRPC client, Client.client is safely be accessed concurrently: link. Therefore, we can trust the client in concurrent situations.

  2. The client has two types of methods. The first method types are used to change the client's state and the second method types are used to change the state of the document the client has. We need to increase the concurrency of the second method types.

The suggestion would be to use one RWLock and be simple, but with low concurrency in the second method types. How about introducing document lock and client lock separately?

shmruin commented 3 years ago

@hackerwins Understood, if gRPC is already concurrent, that would be way easier. I'll take your suggestion and gonna test with two locks.