SHOW CHANGE STREAMS statement

apstndb commented 2 years ago

refs https://github.com/cloudspannerecosystem/spanner-cli/issues/135#issuecomment-1150717216

Show watched tables and columns
Show FOR ALL or empty streams also
Show options(currently only retention_period)

Proposed layout

+----------------------------------------------------------------+----------------------+
| CHANGE_STREAM                                                  | OPTIONS              |
+----------------------------------------------------------------+----------------------+
| Empty                                                          |                      |
| EverythingStream FOR ALL                                       |                      |
| NamesAndAlbums FOR Singers(FirstName, LastName), Albums, Songs | retention_period=36h |
| SingerAlbumStream FOR Singers, Albums                          |                      |
+----------------------------------------------------------------+----------------------+

Semantics

SELECT CHANGE_STREAM_NAME || IF(`ALL`, " FOR ALL", IFNULL(" FOR " || (
  SELECT STRING_AGG(TABLE_NAME || IF(ALL_COLUMNS, "", FORMAT("(%s)", (
    SELECT STRING_AGG(COLUMN_NAME, ", ")
    FROM INFORMATION_SCHEMA.CHANGE_STREAM_COLUMNS CSC
    WHERE (CST.CHANGE_STREAM_NAME, CST.TABLE_NAME) = (CSC.CHANGE_STREAM_NAME, CSC.TABLE_NAME)
  ))), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_TABLES CST
  WHERE CS.CHANGE_STREAM_NAME = CST.CHANGE_STREAM_NAME
), "")) AS CHANGE_STREAM,
IFNULL((
  SELECT STRING_AGG(FORMAT("%s=%s", OPTION_NAME, OPTION_VALUE), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_OPTIONS CSO
  WHERE CS.CHANGE_STREAM_NAME = CSO.CHANGE_STREAM_NAME
), "") AS OPTIONS
FROM INFORMATION_SCHEMA.CHANGE_STREAMS CS

References

yfuruyama commented 2 years ago

Thanks for the suggested layout.

Another layout representation would be just using the response of the GetDatabaseDdlResponse. With this way we don't need to change our layout even if a new notation/syntax is added to the Change Stream in the future.

apstndb commented 2 years ago

Equivalent DDLs

$ gcloud spanner databases ddl describe --project=$PROJECT_ID --instance=$INSTANCE_ID $DATABASE_ID | pcregrep -M 'CREATE CHANGE STREAM [^;]*;'
CREATE CHANGE STREAM Empty;
CREATE CHANGE STREAM EverythingStream
  FOR ALL;
CREATE CHANGE STREAM NamesAndAlbums
  FOR Singers(FirstName, LastName), Albums, Songs OPTIONS (
  retention_period = '36h'
);
CREATE CHANGE STREAM SingerAlbumStream
  FOR Singers, Albums;

yfuruyama commented 2 years ago

Thanks! That's the one. What do you think?

apstndb commented 2 years ago

spanner-cli already have the MySQL-styleSHOW CREATE TABLE <table_name> pseudo-statement. What about generic SHOW CREATE <object_type> <object_name> to show the DDL?

apstndb commented 2 years ago

The generic SHOW CREATE can also support indexes and views.

spanner> SHOW CREATE INDEX SongsBySongName\G
*************************** 1. row ***************************
Name: SongsBySongName
 DDL: CREATE INDEX SongsBySongName ON Songs(SongName)
1 rows in set (0.47 sec)

spanner> SHOW CREATE CHANGE STREAM EverythingStream\G
*************************** 1. row ***************************
Name: EverythingStream
 DDL: CREATE CHANGE STREAM EverythingStream
  FOR ALL
1 rows in set (0.37 sec)

spanner> SHOW CREATE VIEW SingerNames\G
*************************** 1. row ***************************
Name: SingerNames
 DDL: CREATE VIEW SingerNames SQL SECURITY INVOKER AS SELECT
  Singers.SingerId AS SingerId,
  Singers.FirstName || ' ' || Singers.LastName AS Name
FROM Singers
1 rows in set (0.32 sec)

(We should consider about UNIQUE and NULL_FILTERED)

apstndb commented 2 years ago

Prototype https://github.com/cloudspannerecosystem/spanner-cli/compare/master...apstndb:support-generic-show-create

yfuruyama commented 2 years ago

That's a good idea! +1 👍

apstndb commented 2 years ago

Implemented SHOW statements are fallen into three categories.

List all target-typed schema objects.
- SHOW TABLES
- SHOW DATABASES(It is higher level objects than DDLs of database.)
Show details of the target-typed schema objects filtered by parent name.
- SHOW INDEX FROM <table>
- SHOW COLUMNS FROM <table>
Show the DDL for the specified object.
- SHOW CREATE TABLE

3 can be generalized as discussed but I think 1(SHOW CHANGE STREAMS) is also needed.

I think it can be generalized as the sugar for SELECT <known name column or *> FROM INFORMATION_SCHEMA.<snake_cased_object_type>. (We should consider about _CATALOG and _SCHEMA columns like TABLE_SCHEMA of the INFORMATION_SCHEMA.TABLES)

yfuruyama commented 2 years ago

Thanks for summarizing that. I totally agree with the summarized categories.

The user story might be running SHOW CHANGE STREAMS at first to list the change streams, then running SHOW CREATE CHANGE STREAM to get the details of the change stream:

spanner> SHOW CHANGE STREAMS;
+-------------------+
| Change_Streams    |
+-------------------+
| EverythingStream  |
| NamesAndAlbums    |
| SingerAlbumStream |
+-------------------+

spanner> SHOW CREATE CHANGE STREAM EverythingStream;
+------------------+-----------------------------------------------+
| Change_Stream    | Create_Change_Stream                          |
+------------------+-----------------------------------------------+
| EverythingStream | CREATE CHANGE STREAM EverythingStream FOR ALL |
+------------------+-----------------------------------------------+

I think it can be generalized as the sugar for SELECT <known name column or *> FROM INFORMATION_SCHEMA..

I'm not sure if we can assemble the proper CREATE xxx statement in a general way from INFORMATION_SCHEMA database. Perhaps it would be easier to scrape from the DDL as comment: https://github.com/cloudspannerecosystem/spanner-cli/issues/137#issuecomment-1150799021

apstndb commented 2 years ago

I'm not sure if we can assemble the proper CREATE xxx statement in a general way from INFORMATION_SCHEMA database. Perhaps it would be easier to scrape from the DDL as comment: https://github.com/cloudspannerecosystem/spanner-cli/issues/137#issuecomment-1150799021

I agree it is impossible to assemble DDLs from INFORMATION_SCHEMA in a general way and it is possible using projects.instances.databases.getDdl. We should use projects.instances.databases.getDdl for the SHOW CREATE statement.

As for SHOW <schema_types>, we can simply query INFORMATION_SCHEMA even if the scheme object type is unknown.

spanner> SHOW CHANGE STREAMS;
Warning: `SHOW CHANGE STREAMS` has not yet natively supported.
Fallback to `SELECT * FROM INFORMATION_SCHEMA.CHANGE_STREAMS` (Subject to change in future versions)
+-----------------------+----------------------+--------------------+-------+
| CHANGE_STREAM_CATALOG | CHANGE_STREAM_SCHEMA | CHANGE_STREAM_NAME | ALL   |
+-----------------------+----------------------+--------------------+-------+
|                       |                      | Empty              | false |
|                       |                      | EverythingStream   | true  |
|                       |                      | NamesAndAlbums     | false |
|                       |                      | SingerAlbumStream  | false |
+-----------------------+----------------------+--------------------+-------+

In thought experimental, most of SHOW <object_type_plural> can be implemented by the same shape of query.

SELECT <prefix>_NAME FROM INFORMATION_SCHEMA.<object_type_plural_snake> WHERE (<parent_prefix>_CATALOG, <parent_prefix>_SCHEMA) = ("", "")

SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "")
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS WHERE (CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA) = ("", "")
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "")
SELECT CONSTRAINT_NAME FROM INFORMATION_SCHEMA.CHECK_CONSTRAINTS WHERE (CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA) = ("", "")
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "")
SELECT CHANGE_STREAM_NAME FROM INFORMATION_SCHEMA.CHANGE_STREAMS WHERE (CHANGE_STREAM_CATALOG, CHANGE_STREAM_SCHEMA) = ("", "")

Note: INFORMATION_SCHEMA.INDEXES will not be usable without filtering out PRIMARY KEY. _CATALOG _SCHEMA is usually empty.

yfuruyama commented 2 years ago

As for SHOW , we can simply query INFORMATION_SCHEMA even if the scheme object type is unknown.

Sounds good.

In thought experimental, most of SHOW can be implemented by the same shape of query. SELECT _NAME FROM INFORMATION_SCHEMA. WHERE (_CATALOG, _SCHEMA) = ("", "")

How can we get the parent_prefix from the given object_type in a general way? For example: SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "").

apstndb commented 2 years ago

How can we get the parent_prefix from the given object_type in a general way? For example: SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS WHERE (TABLE_CATALOG, TABLE_SCHEMA) = ("", "").

I don't think it can possible to get the parent_prefix in a general way. It is only for thinking the consistent behavior of SHOW < object_type> statements.

yfuruyama commented 2 years ago

Yeah perhaps we have to implement each SHOW <object> statement separately so that we can use an appropriate table/column from INFORMATION_SCHEMA.

cloudspannerecosystem / spanner-cli