yorkie-team / yorkie

Yorkie is a document store for collaborative applications.
https://yorkie.dev
Apache License 2.0
771 stars 143 forks source link

Server panics due to convert unhashable struct to connnect.Error #944

Closed blurfx closed 1 month ago

blurfx commented 1 month ago

What happened: During the process of handling errors by converting them to connect.Error, a panic occurs when a non-hashable error, such as mongo.CommandError, is passed.

The issue arises at the following line:

https://github.com/yorkie-team/yorkie/blob/b1f9e0053ab2af8e4f0f9b8b699f80cc1e821ff0/server/rpc/connecthelper/status.go#L154

What you expected to happen: The server should not panic and handle non-hashable errors gracefully.

How to reproduce it:

This issue can be reproduced in the same way as yorkie-team/yorkie#943

  1. Pass a non-hashable error, such as mongo.CommandError, in the error handling process.
  2. Observe the panic occurring when trying to handle the non-hashable error.

Anything else we need to know:

For example, mongo.CommandError struct is defined like below, and type bson.Raw(alias of []byte) and []string is unhashable. So mongo.CommandError is unhashable struct.

type CommandError struct {
    Code    int32
    Message string
    Labels  []string // Categories to which the error belongs
    Name    string   // A human-readable name corresponding to the error code
    Wrapped error    // The underlying error, if one exists.
    Raw     bson.Raw // The original server response containing the error.
}

Environment:

blurfx commented 1 month ago

Unhashable errors are not going to have a common interface, so it's probably better to handle panic in that context with recover.

Considering just mongo for example, mongo.CommandError and mongo.WriteError, mongo.WriteException, and mongo.BulkWriteException do not have a common field, so interface checking is not possible.

krapie commented 1 month ago

We have experienced this issue while we are doing Weekly Sync. This killed the server, so we couldn't access to the document for a while (about few seconds).

image

I think we need to prioritize this issue and fix it ASAP.