cube-js / cube

Importing from Redshift to cubestore throws error "Internal: panic" & "Unsupported value". #3517

Closed xylesoft closed 2 years ago

xylesoft commented 3 years ago

Describe the bug Importing data from Redshift into cubestore always throws an internal panic along with various "Unsupported value" errors. It does not seem to be limited to a single data type. I've recorded:

In a Slack discussion, @paveltiunov suggested the issue possibly relates to a Redshift driver bug.

To Reproduce Steps to reproduce the behavior:

  1. Build a transformation in Redshift.
  2. Create Schema for said transformation.

    cube(`RsMerchantDetails`, {
    title: 'Redshift Merchant Details',
    
    sqlAlias: 'rmd',
    
    sql: `SELECT * FROM <database>.rs_merchant_details`,
    
    preAggregations: {
        main: {
            type: `rollup`,
            dimensions: [
                CUBE.programActive,
                CUBE.programId,
                CUBE.networkType,
                CUBE.countryCode,
                CUBE.urlPattern,
                CUBE.merchantId,
                CUBE.currencyId,
                CUBE.merchantName,
                CUBE.merchantHomepage,
                CUBE.createdDate,
                CUBE.merchantFeaturedRateId,
                CUBE.beginningDate,
                CUBE.expiredDate,
                CUBE.exclusiveRate,
                CUBE.featuredRate,
            ],
            indexes: {
                merchantIndex: {
                    columns: [CUBE.programActive, CUBE.merchantId]
                },
            },
            measures: [],
            external: true,
        }
    },
    
    measures: {
    },
    
    dimensions: {
        programActive: {
            sql: `program_active`,
            type: `string`,
        },
        programId: {
            sql: `program_id`,
            type: `number`,
        },
        networkType: {
            sql: `network_type`,
            type: `string`,
        },
        countryCode: {
            sql: `country_code`,
            type: `string`,
        },
        urlPattern: {
            sql: `url_pattern`,
            type: `string`,
        },
        merchantId: {
            sql: `merchant_id`,
            type: `number`,
        },
        currencyId: {
            sql: `currency_id`,
            type: `number`,
        },
        merchantName: {
            sql: `merchant_name`,
            type: `string`,
        },
        merchantHomepage: {
            sql: `merchant_homepage`,
            type: `string`,
        },
        createdDate: {
            sql: `created_date`,
            type: `time`,
        },
        merchantFeaturedRateId: {
            sql: `merchant_featured_rate_id`,
            type: `number`,
        },
        beginningDate: {
            sql: `beginning_date`,
            type: `time`,
        },
        expiredDate: {
            sql: `expired_date`,
            type: `time`,
        },
        exclusiveRate: {
            sql: `exclusive_rate`,
            type: `number`,
        },
        featuredRate: {
            sql: `featured_rate`,
            type: `string`,
        },
    },
    
    dataSource: `default`
    });
  3. Execute pre-aggregation (or wait for cube.js to run it).
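
For step 3, one way to trigger the build on demand rather than waiting for the refresh worker is to send a query that matches the rollup. The sketch below only assembles the request URL for Cube's REST `load` endpoint. The host, port and token are assumptions and not taken from this report, and whether the query actually kicks off a build depends on how the deployment is configured.

```javascript
// Assumed values: adjust the host, port and token to your deployment.
const API_URL = "http://localhost:4000/cubejs-api/v1/load"; // assumption
const API_TOKEN = "<your-api-token>"; // placeholder, not a real token

// Encode a Cube query object into the `query` URL parameter.
function buildLoadUrl(baseUrl, query) {
  const url = new URL(baseUrl);
  url.searchParams.set("query", JSON.stringify(query));
  return url.toString();
}

// Dimensions taken from the `main` rollup of the schema above.
const query = {
  dimensions: [
    "RsMerchantDetails.programActive",
    "RsMerchantDetails.merchantId",
  ],
};

const requestUrl = buildLoadUrl(API_URL, query);
console.log(requestUrl);

// Uncomment to send the request (Node.js 18+ ships a global fetch):
// fetch(requestUrl, { headers: { Authorization: API_TOKEN } })
//   .then((res) => res.json())
//   .then(console.log);
```

The request itself is left commented out so the snippet runs without a Cube instance.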

Expected behavior To export Redshift tables into cubestore without failing.

Version:

Additional context

Cubestore log output:

cubestore_worker_1_1     | thread 'tokio-runtime-worker' panicked at 'Unsupported value: String("no")', cubestore/src/table/parquet.rs:668:42
cubestore_worker_1_1     | stack backtrace:
cubestore_worker_1_1     |    0:     0x55b0b4f375a0 - std::backtrace_rs::backtrace::libunwind::trace::h4e80f45db05e9df0
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/../../backtrace/src/backtrace/libunwind.rs:90:5
cubestore_worker_1_1     |    1:     0x55b0b4f375a0 - std::backtrace_rs::backtrace::trace_unsynchronized::h9db62a3d3587f6c3
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
cubestore_worker_1_1     |    2:     0x55b0b4f375a0 - std::sys_common::backtrace::_print_fmt::h2fafbe52c9090591
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/sys_common/backtrace.rs:67:5
cubestore_worker_1_1     |    3:     0x55b0b4f375a0 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h7d41ae04358c516d
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/sys_common/backtrace.rs:46:22
cubestore_worker_1_1     |    4:     0x55b0b4f5fbbc - core::fmt::write::h968e6400cbc51d6a
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/core/src/fmt/mod.rs:1112:17
cubestore_worker_1_1     |    5:     0x55b0b4f30315 - std::io::Write::write_fmt::h2eaa97c6f8c9e829
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/io/mod.rs:1642:15
cubestore_worker_1_1     |    6:     0x55b0b4f39d8b - std::sys_common::backtrace::_print::h601a6bc581f240d5
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/sys_common/backtrace.rs:49:5
cubestore_worker_1_1     |    7:     0x55b0b4f39d8b - std::sys_common::backtrace::print::h8b96e15ba64b8e64
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/sys_common/backtrace.rs:36:9
cubestore_worker_1_1     |    8:     0x55b0b4f39d8b - std::panicking::default_hook::{{closure}}::h94fa9d2f08c20620
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/panicking.rs:208:50
cubestore_worker_1_1     |    9:     0x55b0b4f39861 - std::panicking::default_hook::h6d463f5a2e9d9986
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/panicking.rs:225:9
cubestore_worker_1_1     |   10:     0x55b0b4f3a454 - std::panicking::rust_panic_with_hook::h1e953652a338573e
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/panicking.rs:622:17
cubestore_worker_1_1     |   11:     0x55b0b4f39f37 - std::panicking::begin_panic_handler::{{closure}}::h78bf2fec525c238a
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/panicking.rs:519:13
cubestore_worker_1_1     |   12:     0x55b0b4f37a9c - std::sys_common::backtrace::__rust_end_short_backtrace::h651eb6282e4ea6b2
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/sys_common/backtrace.rs:141:18
cubestore_worker_1_1     |   13:     0x55b0b4f39e99 - rust_begin_unwind
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/panicking.rs:515:5
cubestore_worker_1_1     |   14:     0x55b0b33e210b - std::panicking::begin_panic_fmt::hce1b46bb3d49d05a
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/panicking.rs:457:5
cubestore_worker_1_1     |   15:     0x55b0b39b10b2 - <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold::h281e9028322f10c3
cubestore_worker_1_1     |   16:     0x55b0b388b66a - <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter::hefcf4f51e755b190
cubestore_worker_1_1     |   17:     0x55b0b3b8e9d0 - cubestore::table::parquet::RowParquetWriter::write_buffer::h9a7cc3e1c5e68407
cubestore_worker_1_1     |   18:     0x55b0b3b8cfd8 - cubestore::table::parquet::SplitRowParquetWriter::write_rows::h9d29d155d1c79fdd
cubestore_worker_1_1     |   19:     0x55b0b3b87ad0 - <cubestore::table::parquet::ParquetTableStore as cubestore::table::TableStore>::merge_rows::h3852c7e8d4f55ba7
cubestore_worker_1_1     |   20:     0x55b0b3a35bdb - <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll::h3c3c69c0513692d3
cubestore_worker_1_1     |   21:     0x55b0b376a737 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::hca4ade8552fe2e42
cubestore_worker_1_1     |   22:     0x55b0b3c29462 - tokio::runtime::task::harness::Harness<T,S>::poll::h1c23498341cc8d29
cubestore_worker_1_1     |   23:     0x55b0b48ec7f9 - tokio::runtime::blocking::pool::Inner::run::h72d5c84972262512
cubestore_worker_1_1     |   24:     0x55b0b48e2f94 - std::sys_common::backtrace::__rust_begin_short_backtrace::hc4157b0d1cb360e1
cubestore_worker_1_1     |   25:     0x55b0b49045ad - core::ops::function::FnOnce::call_once{{vtable.shim}}::hf4fca979670f7824
cubestore_worker_1_1     |   26:     0x55b0b4f413d7 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h412fa2ab3f64d2a1
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/alloc/src/boxed.rs:1575:9
cubestore_worker_1_1     |   27:     0x55b0b4f413d7 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h03e575a07095ff99
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/alloc/src/boxed.rs:1575:9
cubestore_worker_1_1     |   28:     0x55b0b4f413d7 - std::sys::unix::thread::Thread::new::thread_start::hf6d2591b9ad70cfb
cubestore_worker_1_1     |                                at /rustc/71b8742bbcbed2cd908dbc031d6552d8b528c037/library/std/src/sys/unix/thread.rs:72:17
cubestore_worker_1_1     |   29:     0x7fe26e06ffa3 - start_thread
cubestore_worker_1_1     |   30:     0x7fe26de184cf - clone
cubestore_worker_1_1     |   31:                0x0 - <unknown>
cubestore_router_1       | 2021-09-13 10:57:49,062 ERROR [cubestore::http] <pid:1> Error processing HTTP command: User: Create table failed: Internal: panic
cubestore_router_1       |
cubestore_worker_1_1     | 2021-09-13 10:57:49,062 ERROR [cubestore::cluster] <pid:1> Running job error (3.6895535s): IdRow { id: 90, row: Job { row_reference: Table(Tables, 5), job_type: TableImportCSV("https://xxxx-redshift-cube.s3.eu-central-1.amazonaws.com/XXXXXX/0002_part_00.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=XXXXXXXXX%2F20210913%2Feu-central-1%2Fs3%2Faws4_request&X-Amz-Date=20210913T105745Z&X-Amz-Expires=3600&X-Amz-Signature=9f432f25cb5568937a7d48bb3499e04bc1705a7184108bbf4c027e7dab73ca9d&X-Amz-SignedHeaders=host&x-id=GetObject"), last_heart_beat: 2021-09-13T10:57:49.061964422Z, status: Error("Internal: panic") } }
cube_refresh_worker_1_1  | {"message":"Error while querying","processingId":"44","queueSize":1,"duration":13963,"queryKey":[["CREATE TABLE cubetemp.dpc_main_blocked_projects AS SELECT\n      \"dpc\".country_code \"dpc__country_code\", \"dpmfr\".beginning_date \"dpmfr__beginning_date\", \"dpmfr\".exclusive_rate \"dpmfr__exclusive_rate\", \"dpmfr\".expired_date \"dpmfr__expired_date\", \"dpmfr\".featured_rate \"dpmfr__featured_rate\", \"dpmfr\".merchant_featured_rate_id \"dpmfr__merchant_featured_rate_id\", \"dpmup\".pattern \"dpmup__pattern\", \"dpm\".merchant_homepage \"dpm__merchant_homepage\", \"dpm\".merchant_id \"dpm__merchant_id\", \"dpm\".merchant_name \"dpm__merchant_name\", \"dpn\".type \"dpn__network_type\", \"dpp\".active \"dpp__active\", \"dpp\".program_id \"dpp__program_id\", \"pbm\".project_id \"pbm__project_id\"\n    FROM\n      publisher.dp_merchant_url_patterns AS \"dpmup\"\nLEFT JOIN publisher.dp_countries AS \"dpc\" ON \"dpmup\".country_id = \"dpc\".country_id\nLEFT JOIN publisher.dp_merchants AS \"dpm\" ON \"dpmup\".merchant_id = \"dpm\".merchant_id\nLEFT JOIN publisher.dp_merchant_featured_rates AS \"dpmfr\" ON \"dpm\".merchant_id = \"dpmfr\".merchant_id\nLEFT JOIN publisher.dp_programs AS \"dpp\" ON \"dpm\".merchant_id = \"dpp\".merchant_id\nLEFT JOIN publisher.dp_project_blocked_merchants AS \"pbm\" ON \"dpm\".merchant_id = \"pbm\".merchant_id\nLEFT JOIN publisher.dp_networks AS \"dpn\" ON \"dpp\".network_id = \"dpn\".network_id  GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14",[]],[[{"refresh_key":"151067"}]]],"queuePrefix":"SQL_PRE_AGGREGATIONS_STANDALONE_default","requestId":"scheduler-b98bd105-dbc7-4604-96f2-f36d75205594","timeInQueue":1,"preAggregationId":"DpCountries.main_blocked_projects","newVersionEntry":{"table_name":"cubetemp.dpc_main_blocked_projects","structure_version":"hc2uqtdo","content_version":"fgzabjgr","last_updated_at":1631530655098,"naming_version":2},"error":"Error: Error during create table: CREATE TABLE 
cubetemp.dpc_main_blocked_projects_fgzabjgr_hc2uqtdo_1gjubkv (`pbm__project_id` bigint, `dpp__program_id` bigint, `dpm__merchant_id` bigint, `dpmfr__merchant_featured_rate_id` bigint, `dpmfr__expired_date` bigint, `dpmfr__beginning_date` bigint, `dpp__active` varchar(255), `dpn__network_type` varchar(255), `dpm__merchant_name` varchar(255), `dpm__merchant_homepage` varchar(255), `dpmup__pattern` varchar(255), `dpmfr__featured_rate` varchar(255), `dpc__country_code` varchar(255), `dpmfr__exclusive_rate` smallint)  LOCATION ?, ?, ?, ?: User: Create table failed: Internal: panic\n    at WebSocket.<anonymous> (/cube/node_modules/@cubejs-backend/cubestore-driver/src/WebSocketConnection.ts:87:30)\n    at WebSocket.emit (events.js:314:20)\n    at Receiver.receiverOnMessage (/cube/node_modules/ws/lib/websocket.js:978:20)\n    at Receiver.emit (events.js:314:20)\n    at Receiver.dataMessage (/cube/node_modules/ws/lib/receiver.js:502:14)\n    at Receiver.getData (/cube/node_modules/ws/lib/receiver.js:435:17)\n    at Receiver.startLoop (/cube/node_modules/ws/lib/receiver.js:143:22)\n    at Receiver._write (/cube/node_modules/ws/lib/receiver.js:78:10)\n    at doWrite (_stream_writable.js:403:12)\n    at writeOrBuffer (_stream_writable.js:387:5)\n    at Receiver.Writable.write (_stream_writable.js:318:11)\n    at Socket.socketOnData (/cube/node_modules/ws/lib/websocket.js:1072:35)\n    at Socket.emit (events.js:314:20)\n    at addChunk (_stream_readable.js:297:12)\n    at readableAddChunk (_stream_readable.js:272:9)\n    at Socket.Readable.push (_stream_readable.js:213:10)\n    at TCP.onStreamRead (internal/stream_base_commons.js:188:23)"}
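
The panic `Unsupported value: String("no")` is raised while cubestore writes Parquet rows, so one plausible reading is that a string reached a column declared with a numeric type. A minimal sketch for checking an exported CSV row against the declared column types follows. `findTypeMismatches` is a hypothetical helper, not part of Cube or cubestore, and the sample row is made up for illustration; only the column names and types are copied from the `CREATE TABLE` statement in the log.

```javascript
// Hypothetical helper: report values that do not fit the declared column type.
// Only integer types are checked, since those appear in the CREATE TABLE above.
const INTEGER_TYPES = new Set(["bigint", "int", "smallint"]);

function findTypeMismatches(columns, row) {
  const mismatches = [];
  columns.forEach((column, index) => {
    const value = row[index];
    // Treat empty or missing fields as NULL and skip them.
    if (value === undefined || value === "") return;
    if (INTEGER_TYPES.has(column.type) && !/^-?\d+$/.test(value)) {
      mismatches.push({ index, name: column.name, type: column.type, value });
    }
  });
  return mismatches;
}

// Subset of the columns from the CREATE TABLE statement in the log.
const columns = [
  { name: "pbm__project_id", type: "bigint" },
  { name: "dpp__program_id", type: "bigint" },
  { name: "dpp__active", type: "varchar(255)" },
  { name: "dpmfr__exclusive_rate", type: "smallint" },
];

// Made-up CSV fields: the last one puts a string into a smallint column.
const row = ["17", "42", "yes", "no"];

console.log(findTypeMismatches(columns, row));
```

Running this over a few rows of the unloaded `.gz` parts (after decompressing and splitting them into fields) would show which column receives the offending value, and whether the CSV column order matches the table definition.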

Router's MySQL interface (the entry in the following response belongs to a different pre-aggregation; the merchant details one never appears):

$ mysql -h 172.19.0.5 --user=cubestore -pcubestore
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 5.1.10-alpha-msql-proxy
Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show tables\G
*************************** 1. row ***************************
           id: 1
   table_name: dpcl_url_project_clicks20210901_2owbwgnw_srtg0pif_1gjuaor
    schema_id: 1
      columns: [{"name":"dpcl__click_date_day","column_type":"Timestamp","column_index":0},{"name":"dpcl__click_count","column_type":"Int","column_index":1},{"name":"dpcd__url_original_id","column_type":"Int","column_index":2},{"name":"dpcl__click_date","column_type":"String","column_index":3},{"name":"dpcl__project_id","column_type":"Int","column_index":4},{"name":"dpcl__exclude_urls","column_type":"Boolean","column_index":5},{"name":"dpcl__exclude_networks","column_type":"Boolean","column_index":6}]
    locations: ["https://xxxx-redshift-cube.s3.eu-central-1.amazonaws.com/XXXXXXXXXX/0000_part_00.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=XXXXXXXXX%2F20210913%2Feu-central-1%2Fs3%2Faws4_request&X-Amz-Date=20210913T104602Z&X-Amz-Expires=3600&X-Amz-Signature=XXXXXXXXXXXXX&X-Amz-SignedHeaders=host&x-id=GetObject", "https://xxxx-redshift-cube.s3.eu-central-1.amazonaws.com/9b6ce14b3d6cbb9befdf/0001_part_00.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=XXXXXXXXX%2F20210913%2Feu-central-1%2Fs3%2Faws4_request&X-Amz-Date=20210913T104602Z&X-Amz-Expires=3600&X-Amz-Signature=a0316c4c0ef3543f93f7193f5d10aeff341e6d36bcb58e389a3f62159e172346&X-Amz-SignedHeaders=host&x-id=GetObject", "https://xxxx-redshift-cube.s3.eu-central-1.amazonaws.com/9b6ce14b3d6cbb9befdf/0002_part_00.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=XXXXXXXXX%2F20210913%2Feu-central-1%2Fs3%2Faws4_request&X-Amz-Date=20210913T104602Z&X-Amz-Expires=3600&X-Amz-Signature=a8567bf171aefc40d9939c318205eb6527797d5818967ef670a309dcb3c8024d&X-Amz-SignedHeaders=host&x-id=GetObject", "https://xxxx-redshift-cube.s3.eu-central-1.amazonaws.com/9b6ce14b3d6cbb9befdf/0003_part_00.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=XXXXXXXXX%2F20210913%2Feu-central-1%2Fs3%2Faws4_request&X-Amz-Date=20210913T104602Z&X-Amz-Expires=3600&X-Amz-Signature=f474dd19f040f94556b10b5e7f360abd010250d08b46bbb0275dcb20b635e88a&X-Amz-SignedHeaders=host&x-id=GetObject"]
import_format: CSV
     has_data: true
     is_ready: true
   created_at: 2021-09-13 10:46:02.925633473 UTC
1 row in set (0.00 sec)

paveltiunov commented 2 years ago

Fixed in the latest version