starburstdata / metabase-driver

Starburst Metabase driver
Apache License 2.0
63 stars 10 forks source link

Driver fails with java.lang.NullPointerException on timestamp field with nulls #82

Closed RomantsovArtur closed 1 year ago

RomantsovArtur commented 1 year ago

Hi, thank you for providing the driver to the community

I was trying it with Metabase v0.45.3 with Trino version 407 and I'm getting NullPointer on querying table with asterisk

{:type "native", :native {:query "select * from iceberg.raw.model_events\nwhere model_id = '556e8534-c549-4af8-843a-81bc9bc67802'\nand created_date >= date '2022-04-01'\nand created_date < date '2022-04-03'", :template-tags {}}, :database 33, :parameters [], :middleware {:js-int-to-string? true, :add-default-userland-constraints? true}}, :status :failed, :class java.lang.NullPointerException, :stacktrace ["clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:97)" "--> driver.implementation.execute$fn__81803$fn__81804.invoke(execute.clj:128)" "driver.sql_jdbc.execute$row_thunk$row_thunk_STAR___56080.invoke(execute.clj:456)" "query_processor.reducible$reducible_rows$reify__45528.reduce(reducible.clj:149)" "query_processor.context.default$default_reducef$fn__39686.invoke(default.clj:57)" "query_processor.context.default$default_reducef.invokeStatic(default.clj:56)" "query_processor.context.default$default_reducef.invoke(default.clj:48)" "query_processor.context$reducef.invokeStatic(context.clj:69)" "query_processor.context$reducef.invoke(context.clj:62)" "query_processor.context.default$default_runf$respond_STAR___39690.invoke(default.clj:68)" "driver.sql_jdbc.execute$execute_reducible_query.invokeStatic(execute.clj:506)" "driver.sql_jdbc.execute$execute_reducible_query.invoke(execute.clj:485)" "driver.sql_jdbc.execute$execute_reducible_query.invokeStatic(execute.clj:493)" "driver.sql_jdbc.execute$execute_reducible_query.invoke(execute.clj:485)" "driver.sql_jdbc$fn__88601.invokeStatic(sql_jdbc.clj:58)" "driver.sql_jdbc$fn__88601.invoke(sql_jdbc.clj:56)" "query_processor.context$executef.invokeStatic(context.clj:59)" "query_processor.context$executef.invoke(context.clj:48)" "query_processor.context.default$default_runf.invokeStatic(default.clj:67)" "query_processor.context.default$default_runf.invoke(default.clj:65)" "query_processor.context$runf.invokeStatic(context.clj:45)" "query_processor.context$runf.invoke(context.clj:39)" "query_processor.reducible$identity_qp.invokeStatic(reducible.clj:12)" "query_processor.reducible$identity_qp.invoke(reducible.clj:9)" "query_processor.middleware.cache$maybe_return_cached_results$maybe_return_cached_results_STAR___53600.invoke(cache.clj:220)" "query_processor.middleware.permissions$check_query_permissions$fn__49259.invoke(permissions.clj:109)" "query_processor.middleware.mbql_to_native$mbql__GT_native$fn__52545.invoke(mbql_to_native.clj:23)" "query_processor$fn__55076$combined_post_process__55081$combined_post_process_STAR___55082.invoke(query_processor.clj:212)" "query_processor$fn__55076$combined_pre_process__55077$combined_pre_process_STAR___55078.invoke(query_processor.clj:209)" "query_processor.middleware.resolve_database_and_driver$resolve_database_and_driver$fn__53521$fn__53526.invoke(resolve_database_and_driver.clj:35)" "driver$do_with_driver.invokeStatic(driver.clj:76)" "driver$do_with_driver.invoke(driver.clj:72)" "query_processor.middleware.resolve_database_and_driver$resolve_database_and_driver$fn__53521.invoke(resolve_database_and_driver.clj:34)" "query_processor.middleware.fetch_source_query$resolve_card_id_source_tables$fn__49525.invoke(fetch_source_query.clj:314)" "query_processor.middleware.store$initialize_store$fn__49715$fn__49716.invoke(store.clj:11)" "query_processor.store$do_with_store.invokeStatic(store.clj:45)" "query_processor.store$do_with_store.invoke(store.clj:39)" "query_processor.middleware.store$initialize_store$fn__49715.invoke(store.clj:10)" "query_processor.middleware.normalize_query$normalize$fn__53793.invoke(normalize_query.clj:22)" "query_processor.middleware.constraints$add_default_userland_constraints$fn__50803.invoke(constraints.clj:53)" "query_processor.middleware.process_userland_query$process_userland_query$fn__53732.invoke(process_userland_query.clj:145)" "query_processor.middleware.catch_exceptions$catch_exceptions$fn__54104.invoke(catch_exceptions.clj:167)" "query_processor.reducible$async_qp$qp_STAR___45514$thunk__45516.invoke(reducible.clj:100)" "query_processor.reducible$async_qp$qp_STAR___45514.invoke(reducible.clj:106)" "query_processor.reducible$async_qp$qp_STAR___45514.invoke(reducible.clj:91)" "query_processor.reducible$sync_qp$qp_STAR___45525.doInvoke(reducible.clj:126)" "query_processor$process_userland_query.invokeStatic(query_processor.clj:331)" "query_processor$process_userland_query.doInvoke(query_processor.clj:327)" "query_processor$fn__55123$process_query_and_save_execution_BANG___55132$fn__55135.invoke(query_processor.clj:342)" "query_processor$fn__55123$process_query_and_save_execution_BANG___55132.invoke(query_processor.clj:335)" "query_processor$fn__55167$process_query_and_save_with_max_results_constraints_BANG___55176$fn__55179.invoke(query_processor.clj:354)" "query_processor$fn__55167$process_query_and_save_with_max_results_constraints_BANG___55176.invoke(query_processor.clj:347)" "api.dataset$run_query_async$fn__70529.invoke(dataset.clj:69)" "query_processor.streaming$streaming_response_STAR_$fn__40610$fn__40611.invoke(streaming.clj:162)" "query_processor.streaming$streaming_response_STAR_$fn__40610.invoke(streaming.clj:161)" "async.streaming_response$do_f_STAR_.invokeStatic(streaming_response.clj:65)" "async.streaming_response$do_f_STAR_.invoke(streaming_response.clj:63)" "async.streaming_response$do_f_async$task__20021.invoke(streaming_response.clj:84)"], :card_id nil, :context :ad-hoc, :error nil, :row_count 0, :running_time 0, :data {:rows [], :cols []}}

It works well if columns are specified

lpoulain commented 1 year ago

Hi, is this created through a SQL question or through another way? I am not able to reproduce the issue on my end.

RomantsovArtur commented 1 year ago

image

image

image

RomantsovArtur commented 1 year ago

New sql question. An interesting point that it succeed on the cluster, but driver fails to fetch the result for some reason and fails with NPE

lpoulain commented 1 year ago

Could it be that one particular field is making the connector choke, rather than the SELECT * per say?

RomantsovArtur commented 1 year ago

Let's see.

Table configs:

   etl_job_id timestamp(6) with time zone,
   id varchar COMMENT '',
   action_event_id varchar,
   created_date timestamp(6) with time zone,
   model_type varchar,
   model_id varchar,
   event_type varchar,
   origin_date timestamp(6) with time zone,
   model_data varchar,
   new_users_state varchar,
   kyc_status varchar,
   new_country varchar,
   new_city varchar,
   new_region varchar,
   new_postcode varchar,
   new_streetline1 varchar,
   new_streetline2 varchar,
   users_email varchar,
   tickets_assignee varchar,
   tickets_previous_step varchar,
   tickets_new_step varchar,
   new_company_id varchar,
   old_company_id varchar,
   old_country varchar,
   old_city varchar,
   old_region varchar,
   old_postcode varchar,
   old_streetline1 varchar,
   old_streetline2 varchar,
   amount double,
   old_state varchar,
   new_restriction_level varchar,
   old_restriction_level varchar,
   reason varchar,
   new_firstname varchar,
   new_lastname varchar,
   old_firstname varchar,
   old_lastname varchar,
   data_id varchar,
   event_scope varchar,
   verification_type varchar,
   status varchar,
   proof_type varchar,
   proof_state varchar,
   reason_code varchar,
   reason_description varchar,
   birth_date date,
   event_offset bigint,
   new_status varchar,
   old_status varchar,
   request_type varchar,
   request_state varchar,
   business_info_url varchar,
   when_date timestamp(6) with time zone,
   owner_id varchar,
   edd_portal_queue varchar,
   user_settings_changed_category varchar,
   user_settings_changed_category_push_state varchar,
   user_settings_changed_category_email_state varchar,
   pocket_id varchar,
   module_name varchar,
   bank_country varchar,
   branch_code varchar,
   account_number varchar,
   name varchar,
   passed_type varchar,
   gambling_block_enabled varchar

works:

select etl_job_id, id, action_event_id, created_date, model_type, model_id, event_type, origin_date, model_data, new_users_state, kyc_status, new_country, new_city, new_region, new_postcode, new_streetline1, new_streetline2, users_email, tickets_assignee, tickets_previous_step, tickets_new_step, new_company_id, old_company_id, old_country, old_city, old_region, old_postcode, old_streetline1, old_streetline2, amount, old_state, new_restriction_level, old_restriction_level, reason, new_firstname, new_lastname, old_firstname, old_lastname, data_id, event_scope, verification_type, status, proof_type, proof_state, reason_code, reason_description, birth_date, event_offset, new_status, old_status, request_type, request_state, business_info_url from iceberg.raw.model_events limit 10

If I add one more field like when_date (next in the list), it fails

select etl_job_id, id, action_event_id, created_date, model_type, model_id, event_type, origin_date, model_data, new_users_state, kyc_status, new_country, new_city, new_region, new_postcode, new_streetline1, new_streetline2, users_email, tickets_assignee, tickets_previous_step, tickets_new_step, new_company_id, old_company_id, old_country, old_city, old_region, old_postcode, old_streetline1, old_streetline2, amount, old_state, new_restriction_level, old_restriction_level, reason, new_firstname, new_lastname, old_firstname, old_lastname, data_id, event_scope, verification_type, status, proof_type, proof_state, reason_code, reason_description, birth_date, event_offset, new_status, old_status, request_type, request_state, business_info_url, when_date from iceberg.raw.model_events limit 10

yeah, it fails on when_date 🤔 select when_date from iceberg.raw.model_events limit 10 But it can query created_date, which is the same type without any issue 🤔

lpoulain commented 1 year ago

Could it be that the when_date field has a null value for some rows but not the created_date?

RomantsovArtur commented 1 year ago

image

RomantsovArtur commented 1 year ago

yeah, I can see another stacktrace, yep it does have null values

lpoulain commented 1 year ago

I verified, and this bug should have been fixed in the latest version of our connector (2.0.0). Could you try this version?

RomantsovArtur commented 1 year ago

image

image

I'm afraid I'm using the lastest version

lpoulain commented 1 year ago

Ah, I was able to reproduce it. A null date doesn't seem to be an issue, but a null timestamp causes a crash. Looking into it.

RomantsovArtur commented 1 year ago

Amazing! Thank you a lot for the help 🚀

lpoulain commented 1 year ago

The connector version 2.0.1 has just been built, and I verified it works with a null timestamp with time zone (regular timestamps worked properly before)

RomantsovArtur commented 1 year ago

I've tried and it solved the issue on my side too! Thanks for the fix 🍾