Closed capacma closed 11 months ago
The membership operator, when applied to a measure or attribute component, returns a dataset that has the same identifiers as the input dataset and the measure selected by the membership. Assuming that the identifiers of ds1 are SEX and REF_AREA, the result of:
ds2:=ds1.OBS_VALUE_2015
where
ds1
SEX | REF_AREA | OBS_VALUE_2014 | OBS_VALUE_2015 |
---|---|---|---|
M | IT | 146 | 152 |
F | IT | 160 | 180 |
is:
ds2
SEX | REF_AREA | OBS_VALUE_2015 |
---|---|---|
M | IT | 152 |
F | IT | 180 |
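The behaviour above can be sketched in Python. This is only an illustrative model, not VTL: the list-of-dicts representation and the function name `membership_on_measure` are assumptions made for the example.

```python
# Hypothetical in-memory model of ds1: a list of row dicts plus a list
# naming which columns are identifiers (assumption, not a VTL structure).
DS1_IDENTIFIERS = ["SEX", "REF_AREA"]
DS1_ROWS = [
    {"SEX": "M", "REF_AREA": "IT", "OBS_VALUE_2014": 146, "OBS_VALUE_2015": 152},
    {"SEX": "F", "REF_AREA": "IT", "OBS_VALUE_2014": 160, "OBS_VALUE_2015": 180},
]


def membership_on_measure(rows, identifiers, measure):
    """Keep every identifier column and only the selected measure."""
    keep = identifiers + [measure]
    return [{name: row[name] for name in keep} for row in rows]


# Models ds2 := ds1.OBS_VALUE_2015
ds2 = membership_on_measure(DS1_ROWS, DS1_IDENTIFIERS, "OBS_VALUE_2015")
```

Here `ds2` keeps SEX and REF_AREA on every row and drops OBS_VALUE_2014, matching the table above.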
What is not specified, however, is the resulting dataset when an identifier is selected. For example, what is the result of ds1.SEX, where ds1 is the dataset used in the example above?
There are two different proposals for it:
The first proposal retrieves:
ds2
SEX | REF_AREA | STRING_VAR |
---|---|---|
M | IT | M |
F | IT | F |
The second proposal retrieves:
ds2
SEX | REF_AREA | CONDITION |
---|---|---|
M | IT | true |
F | IT | true |
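To make the difference between the two proposals concrete, here is a small Python sketch of both. It is an assumption-laden model: the function names are hypothetical, and the measure names STRING_VAR and CONDITION are simply taken from the tables above.

```python
# Hypothetical model of ds1 (only the columns needed for the example).
IDENTIFIERS = ["SEX", "REF_AREA"]
ROWS = [
    {"SEX": "M", "REF_AREA": "IT", "OBS_VALUE_2015": 152},
    {"SEX": "F", "REF_AREA": "IT", "OBS_VALUE_2015": 180},
]


def membership_proposal_1(rows, identifiers, selected, measure_name="STRING_VAR"):
    """First proposal: copy the selected identifier's value into a new measure."""
    return [
        {**{i: row[i] for i in identifiers}, measure_name: row[selected]}
        for row in rows
    ]


def membership_proposal_2(rows, identifiers, selected, measure_name="CONDITION"):
    """Second proposal: add a boolean measure that is true on every data point.

    `selected` does not affect the values here; it is kept only so that
    both functions share the same signature.
    """
    return [
        {**{i: row[i] for i in identifiers}, measure_name: True}
        for row in rows
    ]


first = membership_proposal_1(ROWS, IDENTIFIERS, "SEX")
second = membership_proposal_2(ROWS, IDENTIFIERS, "SEX")
```

In both cases all identifiers of ds1 survive; only the content of the extra measure differs.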
In both cases, when the "membership on identifiers" is passed to a VTL operator, the operator must have the following characteristics:
Under these conditions the result of:
ds2:=ds1.SEX="M"
will be, in both cases:
SEX | REF_AREA | CONDITION |
---|---|---|
M | IT | true |
F | IT | false |
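The end result of the comparison can be sketched in Python as follows. This models only the outcome shown in the table, not how either proposal gets there internally; the function name `compare_identifier` is hypothetical.

```python
# Hypothetical model of ds1 (only the identifier columns matter here).
IDENTIFIERS = ["SEX", "REF_AREA"]
ROWS = [
    {"SEX": "M", "REF_AREA": "IT"},
    {"SEX": "F", "REF_AREA": "IT"},
]


def compare_identifier(rows, identifiers, selected, scalar, measure_name="CONDITION"):
    """Keep all identifiers and add a boolean measure: selected == scalar."""
    return [
        {**{i: row[i] for i in identifiers}, measure_name: row[selected] == scalar}
        for row in rows
    ]


# Models ds2 := ds1.SEX = "M"
ds2 = compare_identifier(ROWS, IDENTIFIERS, "SEX", "M")
```

The boolean measure is true only on the data point whose SEX is "M", and no data point is dropped.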
The first proposal seems more useful to me, because it gives us a new measure that can be used more clearly (it contains the values of the identifier). My only concern is this: if the result of ds1.SEX is a new dataset having all the identifiers and a measure named CONDITION (or whatever), then to check whether the new measure is equal to "M" we should write:
ds_r:= [ds1.SEX].CONDITION="M"
or we should say that ds1.SEX is equivalent (only for identifiers) to [ds1.SEX].CONDITION
Regarding the other issues raised by Maurizio, I can try to answer as follows:
Regarding point 1 raised by @vignola: if we decide that the membership operator applied to a dimension creates a new measure (containing the value of the identifier), I think the expression ds1.SEX="M" is correct, because the left side ds1.SEX has only one measure and can be combined with any dataset having one measure (or with "M", which is a scalar value). In addition, the measure that is created is not visible in the result of ds1.SEX="M", so the name chosen for it is less relevant here.
Refers to old version of documentation
Issue Description
In the current version of VTL it is not possible to write a simple check such as check ( ds . ref_area = "IT" ), because the membership operator does not allow an identifier component to be specified.
Proposed Solution
Allow an identifier component to be specified in the membership operator. The returned dataset has the identifier components of the input dataset and a measure whose name is "computed" and whose value is the value of the identifier component. Note: we cannot simply promote the specified identifier component to a measure, because this could create duplicate data points that cannot be eliminated (otherwise the information on the data points is lost).
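The duplication problem mentioned in the note can be illustrated with a short Python sketch. It reuses the ds1 example from the discussion above; the helper `duplicated_keys` and the dict-based model are assumptions made for illustration, and the measure name "computed" is taken literally from the proposal.

```python
from collections import Counter

# Hypothetical model of ds1 with identifiers SEX and REF_AREA.
IDENTIFIERS = ["SEX", "REF_AREA"]
ROWS = [
    {"SEX": "M", "REF_AREA": "IT", "OBS_VALUE_2015": 152},
    {"SEX": "F", "REF_AREA": "IT", "OBS_VALUE_2015": 180},
]


def duplicated_keys(rows, identifiers):
    """Return the identifier tuples that occur on more than one row."""
    counts = Counter(tuple(row[i] for i in identifiers) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Naive promotion: SEX stops being an identifier and becomes a measure,
# so REF_AREA is the only identifier left.
promoted_identifiers = [i for i in IDENTIFIERS if i != "SEX"]
promoted_rows = [{"REF_AREA": r["REF_AREA"], "SEX": r["SEX"]} for r in ROWS]

# Proposed behaviour: keep every identifier and copy SEX into a new measure.
kept_rows = [
    {"SEX": r["SEX"], "REF_AREA": r["REF_AREA"], "computed": r["SEX"]}
    for r in ROWS
]

naive_duplicates = duplicated_keys(promoted_rows, promoted_identifiers)
proposed_duplicates = duplicated_keys(kept_rows, IDENTIFIERS)
```

With naive promotion both data points share the key ("IT",), so they can no longer be told apart by identifiers alone; keeping all identifiers avoids this.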