sdmx-twg / vtl

This repository is used for maintaining the SDMX-VTL specification

Specification of these operators for Uploading a datafile into a dataset. #160

Closed stratosn closed 1 year ago

stratosn commented 7 years ago
reporter issue reference document (UM/RM/EBNF) page line
MC-50 RM 45 Get, put, eval, 45-50

Issue Description

These operators are not well defined: they are implementation-dependent and have many options that are not justified. VTL already knows which datasets are temporary, namely those created by the assignment statement :=. All other datasets are persistent, so there is no need to define a special operator to get them. The syntax used to denote a persistent dataset will, of course, be implementation-dependent in any case.

Proposed Solution

eval: this operator should be removed because it is not well defined and is implementation-dependent (i.e. it will behave differently on different systems). As an alternative, eval could be kept in VTL only if it is limited to executing an external program (not returning a dataset).

get: use get to load data from an external file. In general the format of the file is implementation-dependent, but we can define a standard format: each line of the file contains fields (separated by a separator) and each field contains the value of a dataset component. The first line contains the names of the components.

put: use put to write data to an external file, using the same standard format described for get. Because the file generated by put has the same structure as the file accepted by get, get can load data from a file produced by put.

update: define a new operator to update a dataset (temporary or persistent). The update operator behaves similarly to the put operator: it stores some data in a persistent dataset.
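The standard file format proposed above for get and put can be sketched in Python as follows. This is only an illustration under stated assumptions, not VTL syntax: the function signatures, the component names and the ";" separator are hypothetical, and every value is read back as a string because the proposed format carries no type information.

```python
import csv
import io


def put(rows, components, stream, separator=","):
    """Write a dataset: the first line holds the component names,
    each following line holds one value per component."""
    writer = csv.writer(stream, delimiter=separator, lineterminator="\n")
    writer.writerow(components)
    for row in rows:
        writer.writerow([row[name] for name in components])


def get(stream, separator=","):
    """Read a dataset in the format written by put.
    Returns (components, rows); all values come back as strings."""
    reader = csv.reader(stream, delimiter=separator)
    components = next(reader)  # first line: names of the components
    rows = [dict(zip(components, fields)) for fields in reader]
    return components, rows


# Hypothetical dataset used only to show the round trip.
components = ["ref_area", "time_period", "obs_value"]
dataset = [
    {"ref_area": "IT", "time_period": "2016", "obs_value": "1.5"},
    {"ref_area": "FR", "time_period": "2016", "obs_value": "2.0"},
]

buffer = io.StringIO()
put(dataset, components, buffer, separator=";")
buffer.seek(0)
loaded_components, loaded_rows = get(buffer, separator=";")
```

Because both functions share one layout, whatever put writes can be read back by get, which is the property the proposal relies on.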

bellomarini commented 7 years ago

eval specifies the target language. Its execution is of course implementation-dependent. I agree to remove the output Dataset: eval should only return an indication of the execution outcome (SUCCESS, FAIL). It is therefore the responsibility of the eval implementation to persist the outcomes. This approach has limitations, but I agree it is safer.
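A minimal sketch of an eval restricted in this way, assuming the external program is started as a subprocess and that exit code 0 means success. The function name eval_external and the mapping from exit codes to SUCCESS/FAIL are assumptions made for illustration; the thread does not define them.

```python
import subprocess
import sys


def eval_external(command, parameters=()):
    """Run an external program and report only the outcome, never a dataset.
    Persisting any results is left to the program itself."""
    try:
        completed = subprocess.run(
            [*command, *parameters], capture_output=True, check=False
        )
    except OSError:
        return "FAIL"  # the program could not be started at all
    return "SUCCESS" if completed.returncode == 0 else "FAIL"


# The current Python interpreter stands in for an arbitrary external program.
ok = eval_external([sys.executable, "-c", "import sys; sys.exit(0)"])
failed = eval_external([sys.executable, "-c", "import sys; sys.exit(3)"])
```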

get loads data from any archive. We should introduce the concept of an archive: a configuration-dependent data structure that maps logical Datasets to their physical representation (CSV, tables, etc.).
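One possible reading of the archive concept, sketched in Python. Everything here is hypothetical: the ArchiveEntry fields, the two kinds "csv" and "table", and the use of SQLite for the table case are assumptions chosen to keep the example self-contained.

```python
import csv
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveEntry:
    kind: str        # "csv" or "table" (hypothetical kinds)
    location: str    # path of the CSV file or of the SQLite database
    table: str = ""  # table name, used only when kind == "table"


def get(dataset_name, archive):
    """Resolve a logical dataset name through the archive and load its rows."""
    entry = archive[dataset_name]
    if entry.kind == "csv":
        with open(entry.location, newline="") as handle:
            return list(csv.DictReader(handle))
    if entry.kind == "table":
        connection = sqlite3.connect(entry.location)
        try:
            connection.row_factory = sqlite3.Row
            # The table name comes from trusted configuration, not user input.
            cursor = connection.execute(f'SELECT * FROM "{entry.table}"')
            return [dict(row) for row in cursor]
        finally:
            connection.close()
    raise ValueError(f"unsupported archive kind: {entry.kind}")
```

With such a mapping the VTL script only names the logical Dataset, and the configuration decides whether it is read from a file or from a database table.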

update: agree.

egreising commented 7 years ago

I agree with the proposed restriction to eval. However, I would suggest adding a new parameter that defines a scope delimiter. The comma that separates the parameters may also appear in the embedded code in the target language, so without a delimiter it is not possible to know where the script ends.

The syntax should be:

eval (Constant language, [{script=}Constant script | Constant programPath], {,{params=}ConstantList<?> parameterList} , {delimiter=}Constant delimiter)

Parameters

language : string
script : string
programPath : string
parameterList : list
delimiter : string

language - the programming language of the script.
script - the code of the script, written in language.
programPath - a path to a script file.
parameterList - the list of input parameters for the script.
delimiter - a string of any length that signals the start and end of the script, e.g. ##>
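The problem the delimiter solves can be illustrated with a small Python sketch. The argument string below is a hypothetical concrete form of an eval call, and split_on_delimiter is an illustrative helper; neither is part of the proposed syntax.

```python
def split_on_delimiter(arguments, delimiter):
    """Return (before, script, after), where script is the text enclosed
    between the first two occurrences of delimiter."""
    before, opening, rest = arguments.partition(delimiter)
    if not opening:
        raise ValueError("opening delimiter not found")
    script, closing, after = rest.partition(delimiter)
    if not closing:
        raise ValueError("closing delimiter not found")
    return before, script, after


# Hypothetical argument list of an eval call with an embedded R script.
call = '"R", ##>x <- c(1, 2, 3); mean(x)##>, params=(10, 20)'

# Splitting on commas cuts the script apart, because the script itself
# contains commas.
naive = call.split(",")

# Using the delimiter recovers the script as one unit.
before, script, after = split_on_delimiter(call, "##>")
```

The delimiter only works if it never occurs inside the script, which is presumably why it is a free-form string of any length.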

capacma commented 7 years ago

I agree with @bellomarini that the get operator needs extra information. For example, under the current definition of get it is not possible to tell whether the dataset is stored in a file or in a database table.

capacma commented 7 years ago

Proposal about get when it is used to load a data file: add a parameter "archive" that specifies the data file (implementation-dependent).

linardian commented 1 year ago

Refers to an old version of the documentation.