rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library
https://docs.rapids.ai/api/cugraph/stable/
Apache License 2.0

pylibcugraph needs to check all error codes from the C API and raise appropriate exceptions #3545

Open rlratzel opened 1 year ago

rlratzel commented 1 year ago

PR #3533 enables additional checks in the C++ library, which may result in errors being returned to pylibcugraph. pylibcugraph does not currently check for these errors, so the uncaught exception from C++ causes a crash with a C++ stack trace showing the following:

E   RuntimeError: non-success value returned from cugraph_sg_graph_create_from_csr(): CUGRAPH_UNKNOWN_ERROR cuGraph failure at file=/__w/cugraph/cugraph/cpp/src/structure/create_graph_from_edgelist_impl.cuh line=907: Invalid input arguments: graph_properties.is_symmetric is true but the input edge list is not symmetric.

Instead, pylibcugraph should catch exceptions and/or check all error codes from the C API and raise appropriate exceptions.
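As a rough illustration of what "check all error codes and raise appropriate exceptions" could look like, here is a minimal pure-Python sketch. It is not the pylibcugraph implementation: the `ErrorCode` enum, its numeric values, the `CUGRAPH_SUCCESS` and `CUGRAPH_INVALID_INPUT` names, the exception mapping, and the `raise_on_error` helper are all assumptions made for illustration. Only `CUGRAPH_UNKNOWN_ERROR` and the message format are taken from the trace above.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    # Hypothetical stand-in for the C API's error-code enum. The numeric
    # values and every name except CUGRAPH_UNKNOWN_ERROR are assumptions.
    CUGRAPH_SUCCESS = 0
    CUGRAPH_UNKNOWN_ERROR = 1
    CUGRAPH_INVALID_INPUT = 2


# Hypothetical mapping from error code to Python exception type.
# Codes not listed here fall back to RuntimeError.
_EXCEPTIONS = {
    ErrorCode.CUGRAPH_UNKNOWN_ERROR: RuntimeError,
    ErrorCode.CUGRAPH_INVALID_INPUT: ValueError,
}


def raise_on_error(code, api_name, message=""):
    """Return None on success; otherwise raise the mapped exception."""
    try:
        code = ErrorCode(code)
    except ValueError:
        # The C layer returned a value this sketch does not know about.
        raise RuntimeError(
            f"unrecognized error code {code} returned from "
            f"{api_name}(): {message}"
        ) from None

    if code is ErrorCode.CUGRAPH_SUCCESS:
        return

    exc_type = _EXCEPTIONS.get(code, RuntimeError)
    raise exc_type(
        f"non-success value returned from {api_name}(): "
        f"{code.name} {message}"
    )
```

Every call into the C API would then be followed by something like `raise_on_error(code, "cugraph_sg_graph_create_from_csr", message)`, so a failure surfaces as a regular Python exception that callers can catch rather than as a crash.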

ChuckHastings commented 1 year ago

We discussed this briefly. There are, potentially, two issues here and I want to make sure we distinguish them.

  1. The PLC code should properly handle any of the error conditions that come back from the C API and be sure to propagate errors as meaningfully as possible. I believe this issue should focus on this part of the problem.
  2. The C API does not always return an error code that is easy to interpret. Specifically, there is logic in the C API that will catch any unhandled exception and propagate it up as CUGRAPH_UNKNOWN_ERROR (as in this specific example). We should start making a list of CUGRAPH_UNKNOWN_ERROR situations that come up and identify where we might want additional error checks/error return codes in the C API. Any situation that could get special handling in the Python layer should definitely be returned as a specific error to make that easier. This, I believe, should be outside of the scope of this issue.
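One possible way to start the list mentioned in point 2 is to group observed CUGRAPH_UNKNOWN_ERROR messages by the source location they report. The sketch below assumes the messages follow the `file=... line=...: detail` layout seen in the trace in this issue. The helper names `parse_failure` and `tally_unknown_errors` are hypothetical and not part of cuGraph.

```python
import re
from collections import Counter

# Matches the "file=<path> line=<n>: <detail>" portion of a failure message.
# The layout is inferred from the single trace quoted in this issue.
_LOCATION = re.compile(
    r"file=(?P<file>\S+)\s+line=(?P<line>\d+):\s*(?P<detail>.*)"
)


def parse_failure(message):
    """Return (file, line, detail) from a failure message, or None."""
    match = _LOCATION.search(message)
    if match is None:
        return None
    return match.group("file"), int(match.group("line")), match.group("detail")


def tally_unknown_errors(messages):
    """Count CUGRAPH_UNKNOWN_ERROR messages per (file, line) location."""
    counts = Counter()
    for message in messages:
        if "CUGRAPH_UNKNOWN_ERROR" not in message:
            continue  # already reported with a specific error code
        parsed = parse_failure(message)
        # Messages without a recognizable location share one bucket.
        key = parsed[:2] if parsed else ("<unparsed>", 0)
        counts[key] += 1
    return counts
```

Locations that show up frequently in the tally would be natural candidates for a dedicated error code in the C API.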