snowflakedb / gosnowflake

Go Snowflake Driver
Apache License 2.0

SNOW-1660331: Should SnowflakeFileTransferOptions.getFilesToStream be exported member? #1207

Closed shinya-ml closed 1 month ago

shinya-ml commented 2 months ago

Please answer these questions before submitting your issue. In order to accurately debug the issue this information is required. Thanks!

  1. What version of GO driver are you using?

v1.11.1

  2. What operating system and processor architecture are you using?

macOS 13.2.1 arm64

  3. What version of GO are you using? Run go version in your console.

1.21.1

  4. Server version: e.g. 1.90.1. You may get the server version by running a query:

SELECT CURRENT_VERSION();

8.33.1

  5. What did you do?

    If possible, provide a recipe for reproducing the error. A complete runnable program is good.

I would like to retrieve files stored in an internal stage using the GET operation, which is supported in version 1.11.1, and write their contents to standard output. I followed the example in the gosnowflake godocs, but when using gosnowflake as an external module, I found that I cannot access the getFileToStream field of the SnowflakeFileTransferOptions struct.

package main

import (
    "bytes"
    "compress/gzip"
    "context"
    "database/sql"
    "fmt"
    "io"
    "log"
    "os"

    "github.com/snowflakedb/gosnowflake"
)

func main() {
    dsn, err := gosnowflake.DSN(&gosnowflake.Config{ /* omitted */ })
    if err != nil {
        log.Fatal(err)
    }

    db, err := sql.Open(
        "snowflake",
        dsn,
    )
    if err != nil {
        log.Fatal(err)
    }

    if err := db.Ping(); err != nil {
        log.Fatal(err)
    }
    var streamBuf bytes.Buffer
    ctx := gosnowflake.WithFileTransferOptions(context.Background(), &gosnowflake.SnowflakeFileTransferOptions{}) // <-- We cannot access getFilesToStream
    ctx = gosnowflake.WithFileGetStream(ctx, &streamBuf)
    if _, err := db.ExecContext(ctx, "GET @my_stage file://tmp/"); err != nil {
        log.Fatal(err)
    }
    br := bytes.NewReader(streamBuf.Bytes())
    gr, err := gzip.NewReader(br)
    if err != nil {
        log.Fatal(err)
    }
    defer gr.Close()
    io.Copy(os.Stdout, gr)
    fmt.Println()
}
  6. What did you expect to see?

    What should have happened and what happened instead?

I expected the content of the file obtained via the GET operation to be streamed to standard output, but streamBuf stayed empty and reading from it only yielded EOF. If getFileToStream must be set to true, then a way to set it from outside the package is required.

  7. Can you set logging to DEBUG and collect the logs?

    https://community.snowflake.com/s/article/How-to-generate-log-file-on-Snowflake-connectors

  8. What is your Snowflake account identifier, if any? (Optional)

sfc-gh-dszmolka commented 2 months ago

hi and thanks for reporting this issue, taking a look

sfc-gh-dszmolka commented 2 months ago

Thank you so much for the example you provided, it helped a lot in troubleshooting this issue. I set up a stage and uploaded a test.csv with the following content:

$ cat test.csv
id,firstname,lastname
1,John,Smith
2,Jill,Smith

Then I confirmed that SnowflakeFileTransferOptions.getFileToStream is indeed not accessible, because it is unexported :(

./main.go:39:109: unknown field getFileToStream in struct literal of type gosnowflake.SnowflakeFileTransferOptions

Fixing it in a local fork (only this single field) and then replacing vanilla gosnowflake in go.mod allowed me to proceed to the next error (2024/09/11 14:38:58 gzip: invalid header), so I had to modify the repro program a little and took out the compressed stream handling:

# cat main.go 
package main

import (
    "bytes"
//  "compress/gzip"
    "context"
    "database/sql"
    "fmt"
    "io"
    "log"
    "os"

    "github.com/snowflakedb/gosnowflake"
)

func main(){
    cfg, err := gosnowflake.GetConfigFromEnv([]*gosnowflake.ConfigParam{
        {Name: "Account", EnvName: "SNOWFLAKE_TEST_ACCOUNT", FailOnMissing: true},
        {Name: "User", EnvName: "SNOWFLAKE_TEST_USER", FailOnMissing: true},
        {Name: "Password", EnvName: "SNOWFLAKE_TEST_PASSWORD", FailOnMissing: true},
    })
    if err != nil {
        log.Fatalf("failed to read Config from environment: %v", err)
    }
    dsn, err := gosnowflake.DSN(cfg)
    if err != nil {
        log.Fatalf("failed to create DSN from Config: %v, err: %v", cfg, err)
    }

    db, err := sql.Open(
        "snowflake",
        dsn,
    )
    if err != nil {
        log.Fatal(err)
    }

    if err := db.Ping(); err != nil {
        log.Fatal(err)
    }
    var streamBuf bytes.Buffer
    ctx := gosnowflake.WithFileTransferOptions(context.Background(), &gosnowflake.SnowflakeFileTransferOptions{GetFileToStream: true}) 
    ctx = gosnowflake.WithFileGetStream(ctx, &streamBuf)
// observe local file nomenclature, it's 'file://<path>' so in case '/tmp' is the dest path, then it's 3 '/'s
    if _, err := db.ExecContext(ctx, "GET @test_db.public.mystage file:///tmp/"); err != nil {
        log.Fatal(err)
    }
    br := bytes.NewReader(streamBuf.Bytes())
    /*
    gr, err := gzip.NewReader(&streamBuf)
    if err != nil {
        log.Fatal(err)
    }
    defer gr.Close()
    */
    //io.Copy(os.Stdout, gr)
    io.Copy(os.Stdout, br)
    fmt.Println()
}

Now it gives:

# go run main.go 
id,firstname,lastname
1,John,Smith
2,Jill,Smith

Hope it helps you. About the unexported fields: we definitely need to review them, because I'm sure some of them, like GetFileToStream, need to be accessible from outside the module to serve their main purpose 😅

sfc-gh-dszmolka commented 2 months ago

PR in draft: https://github.com/snowflakedb/gosnowflake/pull/1208 (edit: now under review)

shinya-ml commented 2 months ago

Thank you for investigating the issue. I'm glad we were able to identify issues other than getFilesToStream. I look forward to the resolution of the problems.

sfc-gh-dszmolka commented 2 months ago

The change is merged and will be part of the next release cycle.

sfc-gh-dszmolka commented 1 month ago

released with gosnowflake v1.11.2, in September 2024 release cycle