slingdata-io / sling-cli

Sling is a CLI tool that extracts data from a source storage/database and loads it into a target storage/database.
https://docs.slingdata.io
GNU General Public License v3.0

Panic while exporting Trino table to local parquet file #298

Closed jhatcher1 closed 1 month ago

jhatcher1 commented 1 month ago

Hello! I'm a new user of sling, thanks for the project!

Issue Description

$ sling run --src-conn TRINO --src-stream "myschema.mytable" --tgt-object "file:///tmp/mytable.parquet" -d
2024-05-16 13:55:17 DBG Sling version: 1.2.9 (darwin arm64)
2024-05-16 13:55:17 DBG type is db-file
2024-05-16 13:55:17 DBG using source options: {"empty_as_null":false,"null_if":"NULL","datetime_format":"AUTO","max_decimals":-1}
2024-05-16 13:55:17 DBG using target options: {"header":true,"compression":"AUTO","concurrency":7,"datetime_format":"auto","delimiter":",","max_decimals":-1,"use_bulk":true,"add_new_columns":true,"column_casing":"source"}
2024-05-16 13:55:17 INF connecting to source database (trino)
2024-05-16 13:55:17 DBG opened "trino" connection (conn-trino-CaP)
2024-05-16 13:55:17 INF reading from source database
2024-05-16 13:55:17 DBG select * from "myschema"."mytable"
2024-05-16 13:55:17 INF writing to target file system (file)
2024-05-16 13:55:17 DBG writing to file:///tmp/mytable.parquet [fileRowLimit=0 fileBytesLimit=0 compression=AUTO concurrency=7 useBufferedStream=false fileFormat=parquet]
runtime: bad pointer in frame github.com/flarco/g/csv.(*Writer).Write at 0x14001931e80: 0x1
fatal error: invalid pointer found on stack

...

Same stack trace as above

flarco commented 1 month ago

Thanks for reporting. These are hard to debug because I can't reproduce them. If you run with the env var SLING_PROCESS_BW=false, I think it should work. The error occurs when sling tries to determine how many bytes have been written (for some specific row).

flarco commented 1 month ago

Closing this. Feel free to re-open.

jhatcher1 commented 1 month ago

Running sling with SLING_PROCESS_BW=false worked for me. If I can figure out how to reproduce the issue, I'll reopen this with more info. Thanks!
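
For anyone hitting the same panic, the workaround confirmed in this thread can be sketched roughly as follows. The variable name and value come from the maintainer's comment, and the command is the one from the issue description; the connection name TRINO, the table, and the target path are the reporter's and would need to be replaced. The `command -v` guard is only there so the snippet is safe to paste on a machine without sling installed.

```shell
# Turn off sling's bytes-written accounting for this shell session.
# (Env var name and value as suggested by the maintainer in this thread.)
export SLING_PROCESS_BW=false

# Re-run the original export. TRINO, myschema.mytable and the target path
# are the reporter's values; substitute your own.
if command -v sling >/dev/null 2>&1; then
  sling run --src-conn TRINO --src-stream "myschema.mytable" \
    --tgt-object "file:///tmp/mytable.parquet" -d \
    || echo "sling run exited with an error"
else
  echo "sling not found on PATH; skipping run"
fi
```

Setting the variable inline for a single invocation (`SLING_PROCESS_BW=false sling run ...`) should behave the same way without changing the rest of the session.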