databricks / databricks-sql-go

Golang database/sql driver for Databricks SQL.
Apache License 2.0
34 stars 37 forks source link

Error while scanning column of type `array<decimal(10,2)>` into Go type `[]float64` #213

Open esdrasbeleza opened 2 months ago

esdrasbeleza commented 2 months ago

I have a table that has a column AMOUNTS with a type array<decimal(10,2)>, which I expect to scan into a field type []float64. This is the test I wrote to reproduce it:

func (s *DatabricksRepositorySuite) Test_Float64_Conversion() {
    query := `SELECT AMOUNTS FROM components WHERE AMOUNTS IS NOT NULL AND array_size(AMOUNTS) > 0 LIMIT 1`
    var result []float64
    err := s.repository.db.QueryRow(query).Scan(&result)
    s.NoError(err)
}

When I run my query, I get this error:

sql: Scan error on column index 0, name "AMOUNTS": unsupported Scan, storing driver.Value type string into type *[]float64

Source

This error comes from the standard library in Go: the value for that field is provided by the SDK as a string, but the destination is []float64, so it fails. I also dove into Databricks' SDK source code and found the place [2] where the string is generated.

[1] Golang source - convertAssignRows [2] Databricks SDK - listValueContainer.Value

Workaround

I created a custom type for the fields I have that use arrays:

import (
    "database/sql/driver"
    "encoding/json"
    "errors"
)

type databricksArray[T any] struct {
    parsedValue []T
}

func (f *databricksArray[T]) Scan(value interface{}) error {
    switch typedValue := value.(type) {
    case []T:
        f.parsedValue = typedValue
        return nil
    case string:
        return f.parseString(typedValue)
    default:
        return errors.New("error while handling databricks type")
    }
}

func (f *databricksArray[T]) parseString(value string) error {
    // This is a workaround for Databricks returning float arrays as strings
    // Example: `[1.0, 2.0, 3.0]`
    // We need to convert it to an array
    // This should be removed if/when Databricks fixes this issue
    valueBytes := []byte(value)
    var parsedValue []T
    if err := json.Unmarshal(valueBytes, &parsedValue); err != nil {
        return err
    }
    f.parsedValue = parsedValue
    return nil
}

func (f *databricksArray[T]) Value() (driver.Value, error) {
    return f.parsedValue, nil
}

func (f *databricksArray[T]) Slice() []T {
    if f == nil {
        return nil
    }
    return f.parsedValue
}

So my field

Amount []float64 `db:"AMOUNTS"`

is now

Amount databricksArray[float64] `db:"AMOUNTS"`

and I call Amount.Slice() when I want to read its value.