go-goracle / goracle

Go database/sql driver for connecting to Oracle Database, using the ODPI-C library
273 stars 43 forks source link

[Question] Can we access the entire error stack? #204

Closed kurt-google closed 4 years ago

kurt-google commented 5 years ago

Recently noticed that I'm not sure how/if we can get the full error stack from goracle/odpi. When attempting to start a misconfigured database the error returned by goracle appears to only be the final error in the stack but sometimes earlier messages are useful. e.g.

goracle: goracle.Conn.Startup(goracle.StartupDefault) ORA-01078: failure in processing system parameters

sqlplus: startup nomount ORA-00821: Specified value of sga_target 1344M is too small, needs to be at least 1616M ORA-01078: failure in processing system parameters

Is it possible to support multiple errors like this?

tgulacsi commented 5 years ago

It is dead simple: returns what ODPI-C's dpiContext_getError returns in dpiErrorInfo:

func (c *conn) Startup(mode StartupMode) error {
    if C.dpiConn_startupDatabase(c.dpiConn, C.dpiStartupMode(mode)) == C.DPI_FAILURE {
        return errors.Errorf("startup(%v): %w", mode, c.getError())
    }
    return nil
}

func (d *drv) getError() *OraErr {
    if d == nil || d.dpiContext == nil {
        return &OraErr{code: -12153, message: driver.ErrBadConn.Error()}
    }
    var errInfo C.dpiErrorInfo
    C.dpiContext_getError(d.dpiContext, &errInfo)
    return fromErrorInfo(errInfo)
}
func fromErrorInfo(errInfo C.dpiErrorInfo) *OraErr {
    oe := OraErr{
        code:    int(errInfo.code),
        message: strings.TrimSpace(C.GoString(errInfo.message)),
    }
    if oe.code == 0 && strings.HasPrefix(oe.message, "ORA-") &&
        len(oe.message) > 9 && oe.message[9] == ':' {
        if i, _ := strconv.Atoi(oe.message[4:9]); i > 0 {
            oe.code = i
        }
    }
    oe.message = strings.TrimPrefix(oe.message, fmt.Sprintf("ORA-%05d: ", oe.Code()))
    return &oe
}
tgulacsi commented 5 years ago

To see what ODPI sees, run your test with "DPI_DEBUG_LEVEL=28" env.

In odpi/src/dpiConn.c, you can see that dpiConn_startupDatabase calls dpiOcidbStartup, that calls dpiErrorsetFromOCI in dpiError.c, and eventually, dpiOci__errorGet is called in dpiOci.c.

That calls OCIErrorGet. https://docs.oracle.com/cd/B10501_01/appdev.920/a96584/oci16m97.htm says that "This function can be called multiple times if there are more than one diagnostic record for an error." so maybe it should be called in a cylcle to gather all errors - you should ping @cjbj and @anthony_tuininga.

anthony-tuininga commented 5 years ago

From what I understand, OCIErrorGet only ever actually returns a single error message -- in spite of it saying that it can support multiple, there are no situations where it returns multiple. But if SQL*Plus is getting the information, perhaps I can as well. I'll take a look and get back to you on this.

kurt-google commented 5 years ago

Thanks for taking a look and giving me lots to look into. The docs above notes a recordin parameter which is only ever 1 in ODPI (https://github.com/oracle/odpi/blob/master/src/dpiOci.c#L1405), perhaps this is the missing element for traversing the error stack? But i dont see how you can see the count of error records from this interface.

currently I dont have an environment to build and test using OCI directly, so im only speculating.

kurt-google commented 5 years ago

The updated OCI docs appear to be more specific on how to retrieve multiple error records

https://docs.oracle.com/database/121/LNOCI/oci17msc007.htm#LNOCI17287

anthony-tuininga commented 4 years ago

If you call OCIErrorGet() with a non-existent record you get back OCI_NO_DATA (value 100). I set up a situation like the one you suggested and I see both error messages in SQLPlus but I only see the one when calling OCIErrorGet(). If I use the value of 2 I get back OCI_NO_DATA. So it would appear that SQLPlus is doing something "special". I'll see if there is a way to get at that information by asking internally.

cjbj commented 4 years ago

An update on behalf of @anthony-tuininga who dug into this. It does turn out to be similar to a case looked at previously, which is where it was definitively shown that OCIErrorGet() only ever actually returns a single error message.

In the startup case, only the one message is available via the OCI8 interface that ODPI-C uses. SQL*Plus is using something different (aka older and not available to ODPI-C). I logged Oracle bug 30524604 to improve OCI8.

kurt-google commented 4 years ago

Thanks for the follow up!

cjbj commented 4 years ago

@kurt-google FWIW, the Oracle bug has been fixed, and some future version of the Oracle client will contain the changes. If you can't wait, you can ask Oracle Support if a backport is possible.

(@tgulacsi I think this issue can be closed - no goracle / godror change is needed)