crate / cratedb-toolkit

CrateDB Toolkit, an SDK for CrateDB and CrateDB Cloud.
https://cratedb-toolkit.readthedocs.io/
GNU Affero General Public License v3.0

Documentation: Improve docs for `ctk load table` #221

Open amotl opened 2 months ago

amotl commented 2 months ago

> Just a small thing that may be worth mentioning in the docs: the schema translation part takes the name of the MongoDB collection for the `CREATE TABLE` statement, but the data load part takes the table name from the `CRATEDB_SQLALCHEMY_URL`, so these two have to match for `ctk load table` to work.

Originally posted by @hlcianfagna in https://github.com/crate/cratedb-toolkit/pull/216#pullrequestreview-2232840386

amotl commented 2 months ago

Hi Hernan. Thanks for your suggestion. I am sure there are anomalies, but I can't spot them, probably because of operational blindness. Your description doesn't tell me where to apply an improvement, or what to change, at least not in enough detail to make it actionable for me.

I know that migr8 splits the procedure into two phases, and that `ctk load table` bundles them into one. Given that, I don't know where to draw the distinction or how to improve the documentation. Apologies.

Can I humbly ask you to submit a corresponding suggestion on how and where to improve the documentation, either by commenting on the patch GH-216 or by submitting a separate one? Thanks a stack!

hlcianfagna commented 3 weeks ago

Hi, apologies for the delay in coming back to you on this. This is for the case where we want the table in CrateDB to have a different name from the collection in MongoDB. A real-world example involved the need to filter what was pulled from MongoDB, which was done with a view on the MongoDB side, but the final desired state in CrateDB was a table with the collection's original name, not the view's name.

When I tested this, it looked like `ctk load table` would only work if those two names matched. That is not obvious, because the fact that we can define a target table name in the `CRATEDB_SQLALCHEMY_URL` suggests we can use any name for the destination table. So we could mention this where we introduce `CRATEDB_SQLALCHEMY_URL` in the documentation, or we could think about amending the `CREATE TABLE` statement to use the table name from the connection string.
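
To make the mismatch concrete, here is a minimal sketch of a pre-flight check that could be run before invoking the loader. It assumes that both the MongoDB source address and the `CRATEDB_SQLALCHEMY_URL` carry the collection or table name as the last path segment. The helper names and the example addresses are hypothetical and are not part of ctk.

```python
from urllib.parse import urlsplit


def last_path_segment(url: str) -> str:
    """Return the final non-empty path segment of a URL, or '' if there is none."""
    segments = [part for part in urlsplit(url).path.split("/") if part]
    return segments[-1] if segments else ""


def names_match(source_url: str, target_url: str) -> bool:
    """Check whether the source collection name equals the target table name."""
    return last_path_segment(source_url) == last_path_segment(target_url)


# Hypothetical addresses: a MongoDB view as the source, the desired table as the target.
# Assumption: the name is the last path segment in both URLs.
source = "mongodb://localhost:27017/testdrive/orders_view"
target = "crate://crate@localhost:4200/testdrive/orders"

if not names_match(source, target):
    print(
        f"Collection {last_path_segment(source)!r} and table "
        f"{last_path_segment(target)!r} differ; per the report above, "
        "`ctk load table` may not work in this case."
    )
```

With matching names on both sides, for example `.../testdrive/demo` in each address, the check passes silently.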

amotl commented 2 weeks ago

> when I tested this it looked like ctk load table would only work if those 2 matched, which is not obvious as the fact we can define a target table name in the CRATEDB_SQLALCHEMY_URL suggests we can use any name for the destination table

Ah, I see. Thanks for clarifying. Yes, in theory, the target table should be defined by `CRATEDB_SQLALCHEMY_URL`. If it is not, that is certainly a bug that should be fixed. Thanks for the report!