os-climate / os_c_data_commons

Repository for Data Commons platform architecture overview, as well as developer and user documentation
Apache License 2.0

SQL Error when querying any table under osc_datacommons_dev and osc_datacommons_prod #104

Closed caldeirav closed 2 years ago

caldeirav commented 2 years ago

In CloudBeaver:

SELECT * FROM osc_datacommons_dev.wri_test.gppd_new

returns:

SQL Error [16777235]: Query failed (#20211126_093210_00069_wtvpf): Unable to create input format org.apache.hadoop.mapred.FileInputFormat

@erikerlandson @MichaelTiemannOSC

MichaelTiemannOSC commented 2 years ago

I did some back-end cleanup of things that I was pretty sure I had created. I don't remember exactly, but it's possible I did something that made wri_test go away. If so, my apologies! But I think I was careful to clean up only my own obsolete sandboxes. There is no Trino wri_test directory, only wri_gppd, wri_gppd_md and wri.

caldeirav commented 2 years ago

I re-created the table structure from scratch, without any DROP TABLE, and I no longer seem to get the problem. So I will close this issue as we change the code to avoid this situation.
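For anyone hitting the same error, here is a minimal sketch of one way to create a table without a preceding DROP TABLE, using the `IF NOT EXISTS` clause that Trino's CREATE TABLE accepts. The thread does not say what code change was actually made, so this is only an illustration. The helper name `create_table_sql` and the column names are hypothetical.

```python
from typing import Dict


def create_table_sql(catalog: str, schema: str, table: str, columns: Dict[str, str]) -> str:
    """Build an idempotent CREATE TABLE statement (no DROP TABLE beforehand)."""
    # Render each column as "<name> <type>", preserving insertion order.
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {catalog}.{schema}.{table} ({cols})"


# Column names and types below are hypothetical, for illustration only.
ddl = create_table_sql(
    "osc_datacommons_dev",
    "wri_test",
    "gppd_new",
    {"name": "varchar", "capacity_mw": "double"},
)
print(ddl)
```

The resulting string would then be passed to whatever Trino client is in use. Existing tables are left untouched, so the statement can be re-run safely.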

Note: I am currently using the following for my testing:

catalogname = 'osc_datacommons_iceberg_dev'
schemaname = 'wri_new'
tablename = 'gppd'
meta_schema_name = 'metastore_iceberg'
meta_table_name_dataset = 'meta_tables_iceberg'
meta_table_name_fields = 'meta_fields_iceberg'
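As a rough sketch, the settings above can be combined into fully qualified Trino names of the form catalog.schema.table. The helper `qualified` is hypothetical, and placing the metadata schema in the same catalog as the data is an assumption the comment does not confirm.

```python
catalogname = 'osc_datacommons_iceberg_dev'
schemaname = 'wri_new'
tablename = 'gppd'
meta_schema_name = 'metastore_iceberg'
meta_table_name_dataset = 'meta_tables_iceberg'
meta_table_name_fields = 'meta_fields_iceberg'


def qualified(catalog: str, schema: str, table: str) -> str:
    """Join the three parts into a catalog.schema.table name."""
    return f"{catalog}.{schema}.{table}"


# The data table itself.
data_table = qualified(catalogname, schemaname, tablename)

# Assumption: the metadata tables live in the same catalog as the data.
meta_dataset_table = qualified(catalogname, meta_schema_name, meta_table_name_dataset)
meta_fields_table = qualified(catalogname, meta_schema_name, meta_table_name_fields)

print(f"SELECT * FROM {data_table}")
```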

The meta store can be shared, but its structure should remain as it is. Note that in my structure I use the dataset_key as the table name (so I want table names to be unique across the whole data store in order to avoid issues). It is therefore used to link table-level and field-level dataset information, and once linked, I can use the join for metadata display in the front-end layer.
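The join described above might look roughly like the following sketch. To keep it runnable it uses an in-memory SQLite database as a stand-in for Trino. Only `dataset_key` and the two metadata table names come from the comment. The other columns (`title`, `field_name`, `description`) and all row values are hypothetical placeholders.

```python
import sqlite3

# In-memory SQLite stands in for the Trino metastore schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meta_tables_iceberg (dataset_key TEXT PRIMARY KEY, title TEXT);
CREATE TABLE meta_fields_iceberg (dataset_key TEXT, field_name TEXT, description TEXT);
INSERT INTO meta_tables_iceberg VALUES ('gppd', 'example dataset title');
INSERT INTO meta_fields_iceberg VALUES ('gppd', 'field_a', 'example description A');
INSERT INTO meta_fields_iceberg VALUES ('gppd', 'field_b', 'example description B');
""")

# dataset_key doubles as the table name, so it links table-level and
# field-level metadata for display.
rows = conn.execute("""
SELECT t.dataset_key, t.title, f.field_name, f.description
FROM meta_tables_iceberg AS t
JOIN meta_fields_iceberg AS f ON f.dataset_key = t.dataset_key
ORDER BY f.field_name
""").fetchall()

for row in rows:
    print(row)
```

Because the key is the table name, a duplicate table name in another schema would attach the wrong field metadata, which is why uniqueness across the whole data store matters here.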