ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0
1.19k stars 527 forks source link

rbd: use internal as default error code in getGRPCError() #4671

Closed Rakshith-R closed 2 weeks ago

Rakshith-R commented 2 weeks ago

Describe what this PR does

This commit replaces codes.Unknown with codes.Internal as the default error code in getGRPCError() since it much better suited.

Checklist:


Show available bot commands These commands are normally not required, but in case of issues, leave any of the following bot commands in an otherwise empty comment in this PR: * `/retest ci/centos/`: retest the `` after unrelated failure (please report the failure too!)
nixpanic commented 2 weeks ago

Hi @Rakshith-R,

Can you explain why Internal is much better suited than Unknown for an unrecognized error? Does this influence the behavior of CSI callers on these errors?

Rakshith-R commented 2 weeks ago

Hi @Rakshith-R,

Can you explain why Internal is much better suited than Unknown for an unrecognized error? Does this influence the behavior of CSI callers on these errors?

Unknown and Internal both will get retries from the callers.

We don't use Unknown anywhere else. Most command execution failures are treated as internal and not unknown elsewhere in cephcsi. We should default to that instead of Unknown.

Rakshith-R commented 2 weeks ago

Hi @Rakshith-R, Can you explain why Internal is much better suited than Unknown for an unrecognized error? Does this influence the behavior of CSI callers on these errors?

Unknown and Internal both will get retries from the callers.

We don't use Unknown anywhere else. Most command execution failures are treated as internal and not unknown elsewhere in cephcsi. We should default to that instead of Unknown.

https://github.com/ceph/ceph-csi/commit/cdaa9264eb4ee1698efa99ab48224049e7f7c64f

This commit introduced unknown grpc error code. Without using internal as default, we would need to export each error message and add it to the list every time a new code block it added.

nixpanic commented 2 weeks ago

Hi @Rakshith-R, Can you explain why Internal is much better suited than Unknown for an unrecognized error? Does this influence the behavior of CSI callers on these errors?

Unknown and Internal both will get retries from the callers. We don't use Unknown anywhere else. Most command execution failures are treated as internal and not unknown elsewhere in cephcsi. We should default to that instead of Unknown.

cdaa926

This commit introduced unknown grpc error code. Without using internal as default, we would need to export each error message and add it to the list every time a new code block it added.

Except that it is easy to forget to add a new error code (we should have unit tests for that), considering a matching gRPC error code is probably better than always returning Internal?

This change is not bad, but I am not sure how it improves things. In order to prevent the Unknown error from being returned, maybe log a warning marked as bug with call-stack so that it needs to be addressed?

Rakshith-R commented 2 weeks ago

Hi @Rakshith-R, Can you explain why Internal is much better suited than Unknown for an unrecognized error? Does this influence the behavior of CSI callers on these errors?

Unknown and Internal both will get retries from the callers. We don't use Unknown anywhere else. Most command execution failures are treated as internal and not unknown elsewhere in cephcsi. We should default to that instead of Unknown.

cdaa926 This commit introduced unknown grpc error code. Without using internal as default, we would need to export each error message and add it to the list every time a new code block it added.

Except that it is easy to forget to add a new error code (we should have unit tests for that), considering a matching gRPC error code is probably better than always returning Internal?

This change is not bad, but I am not sure how it improves things. In order to prevent the Unknown error from being returned, maybe log a warning marked as bug with call-stack so that it needs to be addressed?

I think the whole way of creating constants of errors in rbd package, exporting them and using them in csiaddons package should be used only when you need a grpc error code that is not internal.

Adopting the export and use method for internal code too would be way too tedious according to me.

Most of the places we error out with internal error code. We don't use unknown code at all in other places. Please take a look, I've replaced all the error messages which used internal error code.

Rakshith-R commented 2 weeks ago

@mergifyio queue

mergify[bot] commented 2 weeks ago

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at *ec8017512f46b4f4c6093628ea9d437495130a84*
ceph-csi-bot commented 2 weeks ago

/test ci/centos/k8s-e2e-external-storage/1.27

ceph-csi-bot commented 2 weeks ago

/test ci/centos/k8s-e2e-external-storage/1.29

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e-helm/k8s-1.27

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e-helm/k8s-1.29

ceph-csi-bot commented 2 weeks ago

/test ci/centos/k8s-e2e-external-storage/1.30

ceph-csi-bot commented 2 weeks ago

/test ci/centos/upgrade-tests-cephfs

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e/k8s-1.29

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e/k8s-1.27

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e-helm/k8s-1.30

ceph-csi-bot commented 2 weeks ago

/test ci/centos/upgrade-tests-rbd

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e/k8s-1.30

ceph-csi-bot commented 2 weeks ago

/test ci/centos/k8s-e2e-external-storage/1.28

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e-helm/k8s-1.28

ceph-csi-bot commented 2 weeks ago

/test ci/centos/mini-e2e/k8s-1.28