Closed capacma closed 1 year ago
Comments by Maurizio clauses.docx get.docx join.docx
Second version by Luigi join_20170911.docx
A comment on the "calc" operator.
The user must be able to specify not only the name of the component to be added to the dataset, but also its data type. If I write:
ds [ calc attribute obs_status := "P" ]
then the data type of obs_status is "string". Now suppose that I want the type to be "obs_status": how do I do this? Maybe with a syntax like the following:
ds [ calc attribute obs_status obs_status := "P" ]
Note that this feature applies both to the calc in the join and the normal calc clause.
Changes discussed during the teleconference on 13.09.2017
Maurizio, I think the one you posted is an old document by Luigi. The new one is dated 19 September. Bye, Laura
----- Original message ----- From: "Maurizio" notifications@github.com To: "vtl-sdmx-task-force/sdmx-vtl" sdmx-vtl@noreply.github.com Cc: "Laura Vignola" vignola@istat.it, "Assign" assign@noreply.github.com Sent: Wednesday, 20 September 2017 8:33:19 Subject: Re: [vtl-sdmx-task-force/sdmx-vtl] get join clauses (#363)
Doc by Luigi revised + comments by Maurizio join_20170913.docx
OK, thanks Laura, the correct document is attached below.
Document revised by Luigi + comments by Maurizio: join_20170919 LB MC.docx
Question by Laura: if I apply the join to only one Data Set, I don't need to do any join. Would it not be possible to omit the join type in the case of a single dataset?
Maurizio: In theory the join operators have 2 or more operand datasets (by definition). In VTL 1.1 the type of join was optional, therefore we could apply the operator to 1 dataset (the syntax was "[ds ] { clauses }" ). But in the new syntax it is a bit strange to say "inner_join ( ds )" and I think that even "join(ds)" is not much better.
What we could do is to say that the join operators can have 2 or more operand datasets (thus ruling out a single operand) and allow the "apply" in the external clauses (external clause: a clause that is not in a join operator). In this way we keep all the features.
Indeed, as Vincenzo said last time, we need the join operator to be applicable to a single dataset too, because this allows us to define the unary operators that are in the standard library. I also found a very old document provided by Luigi in which all the operators of the standard library were defined using the join expression. Below I report an example:
CREATE FUNCTION upper (Dataset<?+, MeasureComponent
So I think it is not good to exclude the case of a join with only one operand. Another possible idea is to make the specification of the join type optional, with the inner join as the default (of course, in the case of only one operand, whatever the join type is, the result is always the same input dataset).
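To make the single-operand argument concrete, here is a minimal Python sketch (not VTL, and not taken from Luigi's document). It assumes datasets are lists of dicts and that `inner_join` and `upper` are hypothetical helpers written only for this illustration. It shows that a join over one operand returns the input unchanged, which is what lets a unary operator be layered on top of it.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]
Dataset = List[Row]


def inner_join(datasets: Sequence[Dataset], ids: Sequence[str]) -> Dataset:
    """Hypothetical inner join of one or more datasets on shared identifiers."""
    if not datasets:
        raise ValueError("at least one operand dataset is required")
    result = [dict(row) for row in datasets[0]]
    for other in datasets[1:]:
        # Index the next operand by its identifier values.
        index = {tuple(row[i] for i in ids): row for row in other}
        joined = []
        for row in result:
            match = index.get(tuple(row[i] for i in ids))
            if match is not None:
                # Simplification: same-named components of later operands win.
                joined.append({**row, **match})
        result = joined
    return result


def upper(ds: Dataset, ids: Sequence[str]) -> Dataset:
    """Unary operator expressed as a one-operand join plus a per-row step."""
    out = []
    for row in inner_join([ds], ids):
        out.append({
            name: value.upper() if isinstance(value, str) and name not in ids else value
            for name, value in row.items()
        })
    return out
```

With one operand the loop over the remaining datasets never runs, so the join type makes no difference to the result.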
Clarification about the calc operator (in a join): the component calculated by calc automatically overrides (i.e. drops) all components of the operand datasets that have the same name. Example (d1 and d2 have 1 numerical measure m1 and no attributes): inner_join ( d1, d2 calc m1 := d1#m1 + d2#m1 ). In this example m1 is dropped automatically from both d1 and d2. Otherwise the user would have to write the drop explicitly.
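The intended behaviour can be sketched in Python (a model of the semantics, not VTL syntax). The function name `inner_join_calc`, the list-of-dicts representation and the fixed aliases `d1` and `d2` are assumptions made for this illustration only.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]


def inner_join_calc(d1: List[Row], d2: List[Row], ids: Sequence[str],
                    name: str, expr: Callable[[Row], object]) -> List[Row]:
    """Inner join of d1 and d2, then calc `name` := expr(aliased components)."""
    index = {tuple(r[i] for i in ids): r for r in d2}
    out = []
    for r1 in d1:
        r2 = index.get(tuple(r1[i] for i in ids))
        if r2 is None:
            continue  # inner join: unmatched rows are discarded
        # Components are visible to the expression as "alias#component".
        aliased = {f"d1#{k}": v for k, v in r1.items() if k not in ids}
        aliased.update({f"d2#{k}": v for k, v in r2.items() if k not in ids})
        row = {i: r1[i] for i in ids}
        for qualified, value in aliased.items():
            component = qualified.split("#", 1)[1]
            # Implicit drop: operand components named like the calculated one
            # are not carried over to the result.
            if component != name:
                row[component] = value
        row[name] = expr(aliased)
        out.append(row)
    return out
```

For instance, calling it with `name="m1"` and `expr=lambda c: c["d1#m1"] + c["d2#m1"]` yields rows that contain a single `m1`, with no explicit drop needed.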
Clarification about the apply operator. In the example: apply d1 + d2 + d3
Does VTL require that all measures of d1 are numeric?
Does VTL require that all measures of d1 are also in d2 and d3 (i.e. d1, d2, d3 have the same measures)?
Note that for the "normal" operators we said that A + B raises an error when A and B have non-numerical measures.
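The two questions can be phrased as checks. The Python sketch below does not decide what VTL should do; it only shows what each requirement would test, assuming datasets are lists of dicts. The helper names `non_numeric_measures` and `check_apply_operands` are hypothetical.

```python
from numbers import Number
from typing import Dict, List, Sequence, Set

Row = Dict[str, object]


def measure_names(ds: List[Row], ids: Sequence[str]) -> Set[str]:
    """Names of all non-identifier components found in the dataset."""
    return {name for row in ds for name in row if name not in ids}


def non_numeric_measures(ds: List[Row], ids: Sequence[str]) -> Set[str]:
    """Measures holding at least one non-numeric value (bool is not numeric here)."""
    bad = set()
    for row in ds:
        for name, value in row.items():
            if name in ids:
                continue
            if isinstance(value, bool) or not isinstance(value, Number):
                bad.add(name)
    return bad


def check_apply_operands(datasets: Sequence[List[Row]], ids: Sequence[str]) -> dict:
    """Report on both candidate requirements without raising an error."""
    if not datasets:
        raise ValueError("at least one operand dataset is required")
    names = [measure_names(ds, ids) for ds in datasets]
    report = {
        # Second question: do all operands have the same measures?
        "same_measures": all(n == names[0] for n in names[1:]),
        # First question: which operands have non-numeric measures?
        "non_numeric": {},
    }
    for position, ds in enumerate(datasets):
        bad = non_numeric_measures(ds, ids)
        if bad:
            report["non_numeric"][position] = sorted(bad)
    return report
```

A strict reading would raise an error whenever `same_measures` is false or `non_numeric` is non-empty, matching the behaviour agreed for A + B.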
@capacma , I agree with the clarification about the calc in a join. Maybe the "automatic alias removal" part should be improved.
In particular, now I wrote that the "alias#compname" is automatically renamed into "compname" and the user must take care of avoiding duplicate names.
I suggest that we add: "in the presence of conflicts between components with the same name, after the automatic alias removal, calculated components prevail over the others." This would consistently give the behavior you mention, since it would amount to an implicit drop.
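As a rough illustration of the suggested rule, the Python sketch below resolves final component names at the structure level. The function `resolve_components` and its return format are hypothetical and only model the wording above: aliases are stripped, calculated components win over operand components with the same name, and any remaining duplicate is left for the user to fix.

```python
from typing import Dict, Sequence


def resolve_components(operand_components: Sequence[str],
                       calculated: Sequence[str]) -> Dict[str, str]:
    """Map each final component name to its source.

    operand_components: qualified names such as "d1#m1".
    calculated: names of the components produced by calc.
    """
    result: Dict[str, str] = {}
    for qualified in operand_components:
        _alias, _sep, name = qualified.partition("#")
        if name in calculated:
            continue  # implicit drop: the calculated component prevails
        if name in result:
            # Not covered by the rule: the user must avoid this duplicate.
            raise ValueError(f"duplicate component name after alias removal: {name}")
        result[name] = qualified
    for name in calculated:
        result[name] = "calc"
    return result
```

With operands `d1#m1`, `d2#m1` and a calculated `m1`, only the calculated `m1` survives; without the calc, the same operands raise an error because the duplicate is not resolved automatically.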
Comments discussed on 20 September 2017
Document on the clauses by Luigi +comments by Maurizio: clauses_20171002.docx
Additional comment: what is the syntax to combine two or more clauses? The same as agreed for the join?
On the pivot/unpivot: it would be useful to define elem_list as optional in the syntax: [pivot { elem_list } to dim , msr ]. If elem_list is missing, then VTL uses the values of dim found in the ds.
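A small Python sketch of the proposed default, under the assumption that a dataset is a list of dicts and that `ids` lists the identifiers other than `dim`. The function `pivot` here is a hypothetical model of the clause, not the VTL operator itself.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, object]


def pivot(ds: List[Row], ids: Sequence[str], dim: str, msr: str,
          elem_list: Optional[Sequence[object]] = None) -> List[Row]:
    """Turn the values of `dim` into columns holding the values of `msr`."""
    if elem_list is None:
        # Proposed default: use the values of dim found in the dataset,
        # in order of first appearance.
        elem_list = []
        for row in ds:
            if row[dim] not in elem_list:
                elem_list.append(row[dim])
    groups: Dict[tuple, Dict[object, object]] = {}
    for row in ds:
        key = tuple(row[i] for i in ids)
        cells = groups.setdefault(key, {})
        if row[dim] in elem_list:
            cells[row[dim]] = row[msr]
    out = []
    for key, cells in groups.items():
        new_row: Row = dict(zip(ids, key))
        for elem in elem_list:
            new_row[elem] = cells.get(elem)  # None when no observation exists
        out.append(new_row)
    return out
```

Passing an explicit `elem_list` restricts the output columns to those values, while omitting it produces one column per value of `dim` present in the data.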
Further comment on the calc:
Further comment on calc/subspace:
Last version by Luigi clauses_20171011.docx
Notes on obsolete version of documentation
Issue Description
get, join, clauses
Proposed Solution
Proposal by Luigi on get, join, clauses clauses.docx join.docx