Here's a first proposal. It is relatively vague, so questions on underspecified
details are appreciated!
It should be possible to export datasets or dataverses:
"export" "dataset" QualifiedName "using" AdapterName Configuration
exports
- the data of a dataset in a format that's compatible with the load
statement and
- the AQL statements needed to re-create the dataset and its indexes
"export" "dataverse" "using" AdapterName Configuration
exports
- all datasets of a dataverse and
- the AQL statements needed to re-create all datasets, indexes, and
functions
The adapters should accept the same configuration options as those used for
loading and for external tables.
The "path" should point to a directory that contains separate files for
- the AQL statements (create.aql) and
- each dataset (<dataset-name>.<format-suffix>)
and an output format should be specified for HDFS.
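To make this concrete, here is a sketch of a possible round trip under this
proposal (the dataset name, directory, and parameter values are made up for
illustration, and the export statement is proposed syntax, not implemented AQL):

  export dataset TinySocial.FacebookUsers
  using localfs (("path"="localhost:///tmp/export"),("format"="adm"));

The directory /tmp/export would then contain create.aql and FacebookUsers.adm,
and the data file should be loadable again with the existing load statement:

  load dataset TinySocial.FacebookUsers
  using localfs (("path"="localhost:///tmp/export/FacebookUsers.adm"),("format"="adm"));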
Initial questions:
1) Is the ADM format good enough to be round-trippable?
2) Do we need a more efficient binary export format?
3) How do we handle the evolution of, e.g., the ADM format?
Do we introduce a version identifier?
Original comment by westm...@gmail.com
on 5 Aug 2013 at 7:00
Here's a second proposal (extended with feedback from Mike):
It should be possible to export datasets or dataverses:
ExportArtifact ::= "sample"? "data"
| "schema"
| "configuration"
ExportArtifacts ::= ExportArtifact ("," ExportArtifact)*
"export" "dataset" QualifiedName ("with" ExportArtifacts)?
"using" AdapterName Configuration
should export a dataset and
"export" "dataverse" Identifier? ("with" ExportArtifacts)?
"using" AdapterName Configuration
should export all datasets of a dataverse. If no dataverse Identifier is given,
the current default dataverse should be exported.
When exporting datasets, it should be possible to export the full data, a sample
of the data, the AQL statements needed to re-create the dataset and its
indexes, or the system configuration. When exporting a dataverse with the
schemas, the AQL statements needed to re-create the functions in the dataverse
should be exported as well. If no artifacts to export are given, the full data
should be exported.
The data of a dataset should be exported in a format that's compatible with the
load statement.
The adapters should accept the same configuration options as those used for
loading and for external tables. The "path" should point to a directory that
contains separate files for
- the AQL statements (create.aql),
- each dataset (<dataset-name>.<format-suffix>),
- the cluster configuration, and
- the system configuration.
For export to HDFS, an output format should be specified as well.
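A hypothetical instance of this second proposal, exporting a sample of the data
plus the schemas of a whole dataverse to HDFS (the "hdfs" and "path" keys mirror
the existing HDFS adapter configuration; "output-format" and all values are
assumptions for illustration):

  export dataverse TinySocial with sample data, schema
  using hdfs (("hdfs"="hdfs://127.0.0.1:9000"),
              ("path"="/tmp/export/TinySocial"),
              ("output-format"="adm"));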
Question:
Should we export the configuration with a dataset/dataverse or should this be a
separate export? (The scope of the configuration seems to be a bit bigger than
the scope of the datasets/dataverses.)
Original comment by westm...@gmail.com
on 14 Aug 2013 at 8:43
Looks good - except where is the "path" that you are mentioning at the
end? (Is my memory mis-firing or is there a puzzle piece missing in the
syntax? :-))
Agreed about the config scope - but - seems okay like this, i.e., this
is a way to ask for that info to go along for the ride with an export of
something else. If you ONLY want to export the config, I guess you
could just export the current default DV with that as the only option as
the least-characters way to get that info - which seems like it works. :)
Original comment by dtab...@gmail.com
on 14 Aug 2013 at 7:33
The "path" is one of the parameters that the current adapter configurations
get. So it is in the syntax, but not in a way that is visible to the human eye.
I agree that it would have been really helpful to write that down :)
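In other words, under the proposed grammar the "path" hides inside the
Configuration term, e.g. (values hypothetical):

  ... using localfs (("path"="localhost:///tmp/export"),("format"="adm"))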
Original comment by westm...@gmail.com
on 15 Aug 2013 at 2:29
We should replace "with" with "including".
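With that change, the productions from the second proposal would read:

  "export" "dataset" QualifiedName ("including" ExportArtifacts)?
      "using" AdapterName Configuration
  "export" "dataverse" Identifier? ("including" ExportArtifacts)?
      "using" AdapterName Configuration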
Original comment by westm...@gmail.com
on 16 Aug 2013 at 9:42
status: no work has happened here
Original comment by westm...@gmail.com
on 30 Jul 2014 at 4:32
Original issue reported on code.google.com by
vinay...@gmail.com
on 2 Aug 2013 at 7:07