ActianCorp / spark-vector

Repository for the Spark-Vector connector
Apache License 2.0

Optimized unload query to use 'select count(*) from table' instead of 'select *' or 'select 1' when requiredColumns is empty #53

Closed by cbarca 8 years ago

cbarca commented 8 years ago

This is similar to [1]. It seemed a better approach than "select 1 from table where ...", which would have sent num_table_rows tuples, each holding only the constant 1, over the network.

[1] https://github.com/databricks/spark-redshift/blob/master/src/main/scala/com/databricks/spark/redshift/RedshiftRelation.scala
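To illustrate the idea, here is a minimal sketch of how the projection could be chosen. It is not the connector's actual code: `UnloadQuery`, `selectStatement`, `emptyRows`, and their parameters are hypothetical names, and the sketch assumes the caller runs the count query and then emits that many empty rows locally.

```scala
// Hypothetical helper for illustration only; names are not from spark-vector.
object UnloadQuery {

  /** Builds the SELECT statement for a scan.
    * With no required columns, only the row count is requested,
    * so a single value crosses the network instead of one tuple per row.
    */
  def selectStatement(table: String,
                      requiredColumns: Array[String],
                      whereClause: String = ""): String = {
    val projection =
      if (requiredColumns.isEmpty) "count(*)"
      else requiredColumns.mkString(", ")
    val where = if (whereClause.isEmpty) "" else s" where $whereClause"
    s"select $projection from $table$where"
  }

  /** Produces `count` empty rows locally, standing in for the rows
    * that would otherwise have been unloaded from the database.
    */
  def emptyRows(count: Long): Iterator[Seq[Any]] =
    (1L to count).iterator.map(_ => Seq.empty[Any])
}
```

With this sketch, `UnloadQuery.selectStatement("t", Array.empty)` yields `select count(*) from t`, and the returned count can be passed to `emptyRows` to rebuild the result on the Spark side.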

and-costea commented 8 years ago

ship