Optimized unload query to 'select count(*) from table' instead of 'select *' or 'select 1' in case we have an empty requiredColumns

ActianCorp / spark-vector

Repository for the Spark-Vector connector

Apache License 2.0

20 stars, 9 forks

Optimized unload query to 'select count(*) from table' instead of 'select *' or 'select 1' in case we have an empty requiredColumns #53

Closed. cbarca closed this 8 years ago

cbarca commented 8 years ago

This is similar to [1]. It seemed a better approach than "select 1 from table where ...", which would have sent num_table_rows tuples of <1> over the network.

[1] https://github.com/databricks/spark-redshift/blob/master/src/main/scala/com/databricks/spark/redshift/RedshiftRelation.scala
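To illustrate the idea in the comment above, here is a minimal, language-neutral sketch in Python of how the unload query could be chosen. The function name `build_unload_query` and its parameters are hypothetical and are not taken from the connector's code; only the `requiredColumns` concept and the three query shapes come from this PR.

```python
from typing import Sequence


def build_unload_query(table: str,
                       required_columns: Sequence[str],
                       where_clause: str = "") -> str:
    """Hypothetical helper (not the connector's actual API).

    Builds the query sent to the database for an unload.
    """
    if required_columns:
        # Normal case: project only the columns Spark asked for.
        projection = ", ".join(required_columns)
    else:
        # Empty requiredColumns (e.g. Spark only needs a row count):
        # let the database aggregate, so a single tuple crosses the
        # network instead of one tuple per table row.
        projection = "count(*)"

    query = f"select {projection} from {table}"
    if where_clause:
        query += f" where {where_clause}"
    return query


print(build_unload_query("my_table", []))
print(build_unload_query("my_table", ["a", "b"], "a > 1"))
```

With this shape, the caller receives a single count back and would then have to produce that many empty rows itself, rather than receiving them from the database.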

and-costea commented 8 years ago

ship
