thoughtspot / cs_tools

Scale your ThoughtSpot adoption with tools created by the ThoughtSpot Solutions Consulting organization.
https://thoughtspot.github.io/cs_tools/

Archiver Expects The `archiver_report` Table To Exist. #81

Closed jonathandurantalkiatry closed 1 year ago

jonathandurantalkiatry commented 1 year ago

First Stop

Platform Configuration

cs-tools-info-2023-11-14

Description

When running the archiver tool using the snowflake syncer, I get a KeyError due to the archiver_report table not existing in the schema I've provided.

I took a quick look at cs_tools/sync/snowflake/syncer.py and noticed that the metadata object created with `self.metadata = sa.MetaData(schema=self.schema_)` (line 78) is later used in the dump method to look up the archiver_report table: `t = self.metadata.tables[f"{self.schema_}.{table}"]` (line 128).

I may be wrong, but it looks like there is an expectation of the archiver_report table to already exist in the metadata object. If that is the case, is there another command I should be running to create these tables ahead of executing the identify command?

LOG FILE

[DEBUG - 2023-11-14 12:30:10,135] [cs_tools.api._client - _client.request 50] >> POST to V1: callosum/v1/tspublic/v1/session/login with keywords {'data': {'username': 'local_admin'}}
[DEBUG - 2023-11-14 12:30:10,609] [cs_tools.api._client - _client.request 76] << HTTP: 204
[DEBUG - 2023-11-14 12:30:10,609] [cs_tools.api._client - _client.request 50] >> GET to V1: callosum/v1/tspublic/v1/session/info with keywords {}
[DEBUG - 2023-11-14 12:30:10,792] [cs_tools.api._client - _client.request 76] << HTTP: 200
[DEBUG - 2023-11-14 12:30:10,817] [cs_tools.thoughtspot - thoughtspot.login 182] execution context...

    [CS TOOLS COMMAND]
    cs_tools tools archiver identify --syncer snowflake:///Users/REDACTED/snowflake.toml --dry-run

    [PLATFORM DETAILS]
    system: Darwin (detail: macOS-14.0-arm64-arm-64bit)
    python: 3.9.6
    ran at: 2023-11-14 12:30:10-0500
    cs_tools: v1.4.13

    [THOUGHTSPOT]
    cluster id: b87f1a9f-38e0-11ec-aba0-6d838
    cluster: talkiatry
    url: https://talkiatry.thoughtspot.cloud
    timezone: Timezone('UTC')
    branch: cloud
    version: 9.7.0

    [LOGGED IN USER]
    user_id: b89c7d08-4c5a-4863-ae3d-abe998094f4b
    username: REDACTED
    display_name: REDACTED
    privileges: ['ADMINISTRATION', 'AUTHORING', 'USERDATAUPLOADING', 'DATADOWNLOADING', 'DATAMANAGEMENT', 'SHAREWITHALL', 'JOBSCHEDULING', 'A3ANALYSIS', 'EXPERIMENTALFEATUREPRIVILEGE', 'DEVELOPER']

[INFO - 2023-11-14 12:30:10,818] [cs_tools.cli.dependencies.syncer - syncer.enter 49] registering syncer: snowflake
[DEBUG - 2023-11-14 12:30:10,827] [cs_tools.sync.register - register.load_syncer 95] manifest digest:

{'name': 'snowflake', 'syncer_class': 'Snowflake', 'requirements': ['setuptools', 'wheel', 'pyarrow == 10.0.1', 'cryptography == 40.0.2', 'snowflake-sqlalchemy == 1.4.6'], 'pip_args': [[], [], [], [], ['--no-use-pep517']]}

[DEBUG - 2023-11-14 12:30:10,827] [cs_tools.sync.register - register.ensure_dependencies 44] processing requirement: setuptools
[DEBUG - 2023-11-14 12:30:10,828] [cs_tools.sync.register - register.ensure_dependencies 48] requirement satisfied, no install necessary
[DEBUG - 2023-11-14 12:30:10,828] [cs_tools.sync.register - register.ensure_dependencies 44] processing requirement: wheel
[DEBUG - 2023-11-14 12:30:10,829] [cs_tools.sync.register - register.ensure_dependencies 48] requirement satisfied, no install necessary
[DEBUG - 2023-11-14 12:30:10,829] [cs_tools.sync.register - register.ensure_dependencies 44] processing requirement: pyarrow == 10.0.1
[DEBUG - 2023-11-14 12:30:10,829] [cs_tools.sync.register - register.ensure_dependencies 48] requirement satisfied, no install necessary
[DEBUG - 2023-11-14 12:30:10,830] [cs_tools.sync.register - register.ensure_dependencies 44] processing requirement: cryptography == 40.0.2
[DEBUG - 2023-11-14 12:30:10,830] [cs_tools.sync.register - register.ensure_dependencies 48] requirement satisfied, no install necessary
[DEBUG - 2023-11-14 12:30:10,830] [cs_tools.sync.register - register.ensure_dependencies 44] processing requirement: snowflake-sqlalchemy == 1.4.6
[DEBUG - 2023-11-14 12:30:10,831] [cs_tools.sync.register - register.ensure_dependencies 48] requirement satisfied, no install necessary
[DEBUG - 2023-11-14 12:30:11,042] [snowflake.connector.ssl_wrap_socket - ssl_wrap_socket.inject_into_urllib3 44] Injecting ssl_wrap_socket_with_ocsp
[DEBUG - 2023-11-14 12:30:11,042] [snowflake.connector.auth._auth - _auth. 91] cache directory: /REDACTED/Library/Caches/Snowflake
[DEBUG - 2023-11-14 12:30:11,058] [snowflake.connector.cursor - cursor. 87] Failed to import pyarrow. Cannot use pandas fetch API
[DEBUG - 2023-11-14 12:30:11,183] [cs_tools.cli.dependencies.syncer - syncer.enter 60] initializing syncer: <class 'cs_tools_snowflake_syncer.Snowflake'>
[DEBUG - 2023-11-14 12:30:11,898] [cs_tools.api._client - _client.request 50] >> GET to V1: callosum/v1/tspublic/v1/metadata/list with keywords {'params': {'type': 'LOGICAL_TABLE', 'subtypes': '["WORKSHEET"]', 'category': 'ALL', 'sort': 'CREATED', 'sortascending': True, 'offset': -1, 'pattern': 'TS: BI Server', 'showhidden': False, 'auto_created': False}}
[DEBUG - 2023-11-14 12:30:11,935] [cs_tools.api._client - _client.request 76] << HTTP: 200
[DEBUG - 2023-11-14 12:30:11,935] [cs_tools.api.middlewares.search - search.call 192] executing search on guid eaab6de7-c556-468c-8b4b-ff6d78dd3ecf

[user action] != [user action].answer_unsaved [user action].{null} [answer book guid] != [answer book guid].{null} [timestamp].'last 3650 days' [timestamp].'today' [answer book guid]

[DEBUG - 2023-11-14 12:30:11,936] [cs_tools.api._client - _client.request 50] >> POST to V1: callosum/v1/tspublic/v1/searchdata with keywords {'params': {'query_string': "[user action] != [user action].answer_unsaved [user action].{null} [answer book guid] != [answer book guid].{null} [timestamp].'last 3650 days' [timestamp].'today' [answer book guid]", 'data_source_guid': 'eaab6de7-c556-468c-8b4b-ff6d78dd3ecf', 'batchsize': -1, 'pagenumber': -1, 'offset': 0, 'formattpe': 'COMPACT'}}
[DEBUG - 2023-11-14 12:30:13,045] [cs_tools.api._client - _client.request 76] << HTTP: 200
[DEBUG - 2023-11-14 12:30:13,057] [cs_tools.api._client - _client.request 50] >> GET to V1: callosum/v1/tspublic/v1/metadata/details with keywords {'params': {'type': 'LOGICAL_TABLE', 'id': '["eaab6de7-c556-468c-8b4b-ff6d78dd3ecf"]', 'showhidden': False, 'dropquestiondetails': False, 'version': -1}}
[DEBUG - 2023-11-14 12:30:13,104] [cs_tools.api._client - _client.request 76] << HTTP: 200
[DEBUG - 2023-11-14 12:30:13,159] [cs_tools.api._client - _client.request 50] >> GET to V1: callosum/v1/tspublic/v1/metadata/list with keywords {'params': {'type': 'QUESTION_ANSWER_BOOK', 'category': <MetadataCategory.all: 'ALL'>, 'sort': 'DEFAULT', 'offset': 0, 'batchsize': 500, 'showhidden': False, 'auto_created': False}}
[DEBUG - 2023-11-14 12:30:13,216] [cs_tools.api._client - _client.request 76] << HTTP: 200
[DEBUG - 2023-11-14 12:30:13,221] [cs_tools.api._client - _client.request 50] >> GET to V1: callosum/v1/tspublic/v1/metadata/list with keywords {'params': {'type': 'PINBOARD_ANSWER_BOOK', 'category': <MetadataCategory.all: 'ALL'>, 'sort': 'DEFAULT', 'offset': 0, 'batchsize': 500, 'showhidden': False, 'auto_created': False}}
[DEBUG - 2023-11-14 12:30:13,271] [cs_tools.api._client - _client.request 76] << HTTP: 200
[DEBUG - 2023-11-14 12:30:13,377] [cs_tools.api._client - _client.request 50] >> POST to V1: callosum/v1/tspublic/v1/session/logout with keywords {}
[DEBUG - 2023-11-14 12:30:13,412] [cs_tools.api._client - _client.request 76] << HTTP: 204
[DEBUG - 2023-11-14 12:30:13,426] [cs_tools.cli.main - main.run 169] whoopsie, something went wrong!
Traceback (most recent call last):
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/cs_tools/cli/main.py", line 144, in run
    return_code = app(standalone_mode=False)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/typer/main.py", line 328, in __call__
    raise e
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/typer/main.py", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/typer/core.py", line 778, in main
    return _main(
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/typer/core.py", line 216, in _main
    rv = self.invoke(ctx)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/cs_tools/cli/ux.py", line 38, in invoke
    r = ctx.invoke(self.callback, ctx.params)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params) # type: ignore
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/cs_tools/cli/tools/archiver/app.py", line 216, in identify
    this_task.skip()
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/cs_tools/cli/layout.py", line 113, in __exit__
    raise exc
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/cs_tools/cli/tools/archiver/app.py", line 214, in identify
    syncer.dump("archiver_report", data=to_archive)
  File "/REDACTED/cs_tools/.cs_tools/lib/python3.9/site-packages/cs_tools/sync/snowflake/syncer.py", line 128, in dump
    t = self.metadata.tables[f"{self.schema_}.{table}"]
KeyError: 'REDACTED.archiver_report'

boonhapus commented 1 year ago

Thanks for the bit of discovery @jonathandurantalkiatry , that was very helpful!

I think this is because I don't register the model in the SQLAlchemy/sqlmodel metadata before trying to write to it, the way Searchable's setup does.

Once this happens, the table should get created automagically when the syncer gets registered.

Both the CLI dependency injection and Syncer interfaces are going to change in our next version (v1.5.x) before EOY, in a fairly large housekeeping update. Would it be OK if I provided you the DDL to create this table for now instead?
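
To illustrate the point about registration, here is a minimal sketch of how a `MetaData.tables` lookup behaves before and after a table is declared against the metadata object. It is not the cs_tools code: the schema name `analytics` is a placeholder, and the column types are assumptions chosen to mirror the DDL in this thread.

```python
import sqlalchemy as sa

SCHEMA = "analytics"  # hypothetical schema name, stands in for the syncer's schema_
metadata = sa.MetaData(schema=SCHEMA)
key = f"{SCHEMA}.archiver_report"

# Nothing has been declared against this MetaData yet, so the lookup fails
# with the same kind of KeyError seen in the traceback.
try:
    metadata.tables[key]
    found_before = True
except KeyError:
    found_before = False

# Declaring a Table against the MetaData registers it under "<schema>.<name>".
# Column types here are assumptions, not the cs_tools model definition.
archiver_report = sa.Table(
    "archiver_report",
    metadata,
    sa.Column("type", sa.String, primary_key=True),
    sa.Column("guid", sa.String, primary_key=True),
    sa.Column("modified", sa.DateTime, primary_key=True),
    sa.Column("author_guid", sa.String),
    sa.Column("author", sa.String),
    sa.Column("name", sa.String),
)

found_after = key in metadata.tables
print(found_before, found_after)
```

Once a table is registered like this, calling `metadata.create_all(engine)` with a real engine would emit the `CREATE TABLE` statement for any table that does not exist yet; no engine is created in this sketch.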

CREATE TABLE archiver_report (
        type VARCHAR NOT NULL,
        guid VARCHAR NOT NULL,
        modified DATETIME NOT NULL,
        author_guid VARCHAR,
        author VARCHAR,
        name VARCHAR,
        PRIMARY KEY (type, guid, modified)
)

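As a quick sanity check of the table definition, the sketch below runs the same columns through Python's built-in `sqlite3` module. SQLite is only a stand-in for Snowflake here, and the `IF NOT EXISTS` clause is an addition (not part of the DDL as posted) so the statement can be re-run safely.

```python
import sqlite3

# Same columns as the DDL in this thread, with a comma after every column.
# IF NOT EXISTS is an assumption added for idempotency.
DDL = """
CREATE TABLE IF NOT EXISTS archiver_report (
    type VARCHAR NOT NULL,
    guid VARCHAR NOT NULL,
    modified DATETIME NOT NULL,
    author_guid VARCHAR,
    author VARCHAR,
    name VARCHAR,
    PRIMARY KEY (type, guid, modified)
)
"""

conn = sqlite3.connect(":memory:")  # in-memory database, stand-in for Snowflake
conn.execute(DDL)
conn.execute(DDL)  # running it a second time is a no-op

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk);
# pk is the 1-based position within the primary key, or 0 if not part of it.
info = conn.execute("PRAGMA table_info(archiver_report)").fetchall()
columns = [row[1] for row in info]
pk_columns = [row[1] for row in sorted(info, key=lambda r: r[5]) if row[5] > 0]

print(columns)
print(pk_columns)
```
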
jonathandurantalkiatry commented 1 year ago

Sounds good to me. Thanks for the quick response. Will go ahead and create this table as suggested and will keep an eye out for the next version.

boonhapus commented 1 year ago

This should happen automagically in the next release, but it's a patch solution so others don't encounter it. Feel free to re-open if this doesn't sort you out.