Open raphaelauv opened 2 weeks ago
@raphaelauv is the idea that this is useful for instances where you want to ensure that only a single file is created? As opposed to CTAS, where multiple files may be created, and you may not want to register this file with a metastore?
UNLOAD is for exporting data "out of PrestoDB", not for creating a new table.
@raphaelauv understood. But I suppose the question is what you define as "out of PrestoDB". If you configured a connector that writes to S3, such as a Hive or Iceberg connector, and ran a CTAS into a table in that connector, it would create Parquet files in S3, i.e. out of PrestoDB. A table is just an abstraction over those files, and you're free to read the files directly. So the question is: what would UNLOAD do for you that CTAS into an unpartitioned table doesn't already do?
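For reference, a minimal sketch of the CTAS route described above. It assumes a Hive catalog named `hive` with S3 access already configured and the built-in `tpch` catalog as a data source; the schema name `export`, the table name `orders_export`, and the bucket `s3://my-bucket/exports/` are hypothetical placeholders.

```sql
-- Hypothetical schema whose tables are stored under an S3 prefix.
CREATE SCHEMA hive.export
WITH (location = 's3://my-bucket/exports/');

-- CTAS into an unpartitioned table: the query result lands as Parquet
-- files under the schema location. The table is also registered in the
-- metastore, and the result may be split across several files.
CREATE TABLE hive.export.orders_export
WITH (format = 'PARQUET')
AS
SELECT orderkey, custkey, totalprice
FROM tpch.tiny.orders;
```

The number of output files depends on how many writers the engine uses, which is one practical difference from an export statement that targets a single destination.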
"Out of PrestoDB" means usable by any application code that is not capable of connecting to PrestoDB, or is forbidden from doing so for any technical or organisational reason.
Expected Behavior or Use Case
Like what is possible with UNLOAD in AWS Athena and Amazon Redshift (Redshift reference: https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html)
Write the result of a query to a destination like S3 in PARQUET / CSV / JSONL ...
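As a point of comparison, this is roughly what the statement looks like in Athena today. It is Athena's syntax, not something PrestoDB supports, and whether PrestoDB would adopt the same shape is an open question; the table `my_db.orders` and the bucket path are hypothetical.

```sql
-- Athena-style UNLOAD (not valid in PrestoDB as of this issue):
-- writes the query result to the given S3 prefix as Parquet files,
-- without creating or registering a table.
UNLOAD (SELECT orderkey, totalprice FROM my_db.orders)
TO 's3://my-bucket/unload/orders/'
WITH (format = 'PARQUET');
```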
Context
It would be great to write the result of a query directly from PrestoDB to an external destination.