It looks like we should support a direct DoPut call for simple ETL/ELT purposes, with the ability to make it more complex later (CommandStatementUpdate).
BulkUpsert supports Apache Arrow as a data source format, so passing the data itself is quite easy. Implementing GetFlightInfo for it and turning the insert into a stream could be more troublesome.
So inserting a single portion is a good first task. It has several extensions:
implement insert stream (insert multiple batches in one connection)
implement client sharding (one GetFlightInfo + parallel multiple DoPut into several nodes)
support complex SQL statements over the inserted data (e.g. insert with type conversion or a filter)
We have the BulkUpsert method for inserting large portions of data: https://ydb.tech/en/docs/reference/ydb-sdk/recipes/bulk_upsert/
We also want to support the Apache Arrow Flight interface for this use case.