Closed sosna closed 3 years ago
The impact on availability queries must still be assessed. The 2 queries must be aligned.
We must support the possibility to check whether there are data for a particular code (say, a country code), regardless of the underlying DSD(s).
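Data queries need to be improved for SDMX 3.0. In a nutshell, it is proposed to add support for multiple keys, extend the context of data retrieval, add a cube-based data retrieval and harmonize separators, as detailed below.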
Add support for multiple keys
Currently, the key parameter only supports one (possibly partial) key. The + operator can be used to supply more than one value for any dimension. This works well if the Cartesian product of the dimensions where the + operator has been used represents what you want. If not, supplying multiple partial keys would work better, but this is currently not supported.
For example, let’s imagine a fictive inflation DSD made of the following dimensions: FREQ, REF_AREA, INFLATION_ITEM, SOURCE, TYPE. Let’s imagine that there are several values for TYPE, such as the index (INX), the weights (INW), the annual rate of change (ANR) and the contribution to growth (CTG). Let’s say that source A supplies data for all 4 types and source B only for 2 (ANR & CTG). You want INX and ANR from source A and CTG from source B.
The current API would allow a query like the following:
M...A+B.ANR+INX+CTG
However, this is not what we want, as it would return ANR and CTG data for both sources A & B, i.e. also CTG from source A and ANR from source B. It is therefore proposed to allow using a comma to separate (possibly partial) keys:
M...A.ANR,M...A.INX,M...B.CTG
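The difference between the two syntaxes can be sketched in a few lines of Python. This is only an illustration of the selection logic, not part of the proposal: the function names are hypothetical, a wildcard position is simply kept as an empty string, and the keys are not checked against actual data (so INX for source B appears in the product even though source B does not supply it).

```python
from __future__ import annotations

from itertools import product


def expand_key(key: str) -> set[tuple[str, ...]]:
    """Expand one (possibly partial) key into the combinations it selects.

    Dimensions are separated by "." and values within a dimension by "+".
    An empty position is a wildcard and is kept as "" in this sketch.
    """
    dimensions = [position.split("+") for position in key.split(".")]
    return set(product(*dimensions))


def expand_keys(keys: str) -> set[tuple[str, ...]]:
    """Expand a comma-separated list of keys (the proposed syntax)."""
    selected: set[tuple[str, ...]] = set()
    for key in keys.split(","):
        selected |= expand_key(key)
    return selected


# Current syntax: the Cartesian product of sources and types.
cartesian = expand_key("M...A+B.ANR+INX+CTG")
# Proposed syntax: exactly the three combinations that are wanted.
multiple = expand_keys("M...A.ANR,M...A.INX,M...B.CTG")

# Combinations selected by the first query but not asked for.
unwanted = cartesian - multiple
```

Here `unwanted` contains, among others, the ANR series of source B and the CTG series of source A, which is exactly the over-selection described above.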
Extend the context of data retrieval
The first path parameter of the current data queries holds a reference to the dataflow of the data to be returned. It must resolve to one single artefact.
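It is proposed to modify this parameter to also accept other types of context than the dataflow (datastructure and provisionagreement). By doing this, we can retrieve data regardless of the dataflow through which they are published, for example all data structured according to a given DSD.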
It is proposed to use the same path parameters as for the structure and schema queries, thereby aligning all query types.
To retrieve all data structured according to the latest version of the ECB_EXR1 DSD maintained by the ECB, the following query could be performed:
https://ws-entry-point/data/datastructure/ECB/ECB_EXR1/latest
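As a rough sketch of how a client could assemble such paths, assuming the context is one of dataflow, datastructure, provisionagreement or all (the latter taken from the examples further down). The function name, the default values and the validation are assumptions made for illustration only.

```python
# Context types mentioned in this proposal; "all" appears in the examples.
CONTEXTS = {"dataflow", "datastructure", "provisionagreement", "all"}


def data_url(entry_point, context="dataflow", agency="all",
             resource="all", version="latest"):
    """Build a data query path: /data/{context}/{agency}/{resource}/{version}."""
    if context not in CONTEXTS:
        raise ValueError(f"Unsupported context: {context}")
    return "/".join(
        [entry_point.rstrip("/"), "data", context, agency, resource, version]
    )


# All data structured according to the latest version of the ECB_EXR1 DSD.
url = data_url("https://ws-entry-point", "datastructure", "ECB", "ECB_EXR1")
```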
Add a “cube-based” data retrieval, in addition to the current “key-based” one
The current API requires knowledge of the series key. This can be tedious in some cases, even more so in case of DSDs with many dimensions. It is therefore proposed to add support for a cube-based filtering mechanism.
To reuse the fictive DSD mentioned above, in case you want to retrieve all public inflation data about Switzerland (i.e. neither confidential, nor restricted), the following would be enough:
https://ws-entry-point/data/dataflow/ESTAT/ICP?c[REF_AREA]=CH&c[CONF_STATUS]=F
This mechanism could support multiple values. For example, in case you want to retrieve all public inflation data about Switzerland and Germany, the following would be enough:
https://ws-entry-point/data/dataflow/ESTAT/ICP?c[REF_AREA]=CH,DE&c[CONF_STATUS]=F
Furthermore, support for operators could be introduced. For example, to retrieve all inflation data about Switzerland and Germany for reporting periods in 2018 or later, the following would be enough:
https://ws-entry-point/data/dataflow/ESTAT/ICP?c[REF_AREA]=CH,DE&c[TIME_PERIOD]=ge:2018
When the operator is not specified and there is only one value, it would default to meaning “equal to”. When the operator is not specified and there are multiple values, the values would be combined with “or”.
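The defaulting rules can be illustrated with a small parser for the value of a c[...] parameter. This is a minimal sketch under several assumptions: only the two operators used in the examples of this proposal (ge and sw) are recognized, "eq" is a hypothetical label for the default, and "greater than or equal to" is evaluated as a plain string comparison, which is enough for four-digit years but not for every period format.

```python
# Only the operators appearing in this proposal's examples (assumption).
KNOWN_OPERATORS = {"ge", "sw"}


def parse_filter(raw):
    """Split a filter such as "ge:2018" or "CH,DE" into (operator, values).

    Without an explicit operator, "eq" is used; multiple values mean "or".
    """
    operator, separator, rest = raw.partition(":")
    if separator and operator in KNOWN_OPERATORS:
        return operator, rest.split(",")
    return "eq", raw.split(",")


def matches(observed, raw):
    """Tell whether a single component value satisfies the filter."""
    operator, values = parse_filter(raw)
    if operator == "ge":
        # String comparison: an assumption that holds for values like "2018".
        return any(observed >= value for value in values)
    if operator == "sw":
        return any(observed.startswith(value) for value in values)
    return observed in values  # "eq" with one value, "or" with several
```

For instance, `matches("DE", "CH,DE")` and `matches("2019", "ge:2018")` both hold, while `matches("2017", "ge:2018")` does not.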
It is proposed to start with the following operators:
If cube-based filters are introduced, the startPeriod and endPeriod query parameters are no longer needed and may be removed from the API.
The previous key-based query mechanism would continue to be supported by introducing a new special parameter for key-based queries:
https://ws-entry-point/data/dataflow/ESTAT/ICP?key=M...A.ANR,M...A.INX,M...B.CTG
Obviously, it should be possible to combine both key-based and cube-based filtering mechanisms:
https://ws-entry-point/data/dataflow/ESTAT/ICP?key=M...A.ANR,M...A.INX,M...B.CTG&c[OBS_STATUS]=N
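A client-side helper combining both mechanisms might look like the following. It is only a sketch: the function name and signature are hypothetical, and no percent-encoding is applied so that the result stays comparable with the URLs shown in this proposal (a real client may have to encode some characters).

```python
def build_data_query(base_url, keys=(), filters=None):
    """Append key-based and cube-based filters to a data query URL.

    keys    -- iterable of (possibly partial) series keys, joined with ","
    filters -- mapping of component ID to filter expression, e.g. "ge:2018"
    """
    params = []
    if keys:
        params.append("key=" + ",".join(keys))
    for component, expression in (filters or {}).items():
        params.append(f"c[{component}]={expression}")
    return base_url + ("?" + "&".join(params) if params else "")


query = build_data_query(
    "https://ws-entry-point/data/dataflow/ESTAT/ICP",
    keys=["M...A.ANR", "M...A.INX", "M...B.CTG"],
    filters={"OBS_STATUS": "N"},
)
```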
Harmonizing separators
Although potentially painful, we propose to take the opportunity offered by SDMX 3.0 to harmonize how separators are used.
The proposal is to use:
- a comma (,) for or statements;
- a plus sign (+) for and statements.
For example, the following filter
c[OBS_STATUS]=B+M,A
would mean that:
- OBS_STATUS is an attribute that supports multiple values;
- OBS_STATUS is either the single value A or the combination of B and M.
Examples
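As an illustration of the separator rules proposed above, the filter value B+M,A can be read as a list of alternatives (comma), each of which is a set of values that must all be present (plus). The sketch below assumes that "the combination of B and M" means the attribute carries at least both values; an exact-match reading would be an equally plausible interpretation. Function names are hypothetical.

```python
def parse_values(raw):
    """Parse "B+M,A" into alternatives: [{"B", "M"}, {"A"}]."""
    return [frozenset(alternative.split("+")) for alternative in raw.split(",")]


def satisfies(observed, raw):
    """Check a multi-valued attribute against a filter value.

    observed -- the set of values reported for the attribute
    raw      -- the filter value, with "," meaning or and "+" meaning and
    """
    observed = set(observed)
    # At least one alternative must be fully contained in the observed values.
    return any(required <= observed for required in parse_values(raw))
```

With this reading, `satisfies({"A"}, "B+M,A")` and `satisfies({"B", "M"}, "B+M,A")` hold, whereas `satisfies({"B"}, "B+M,A")` does not.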
Key-based vs. cube-based vs. combined queries:
https://ws-entry-point/data/dataflow/ECB/EXR/latest?key=M.CHF..,M.GBP..
https://ws-entry-point/data/dataflow/ECB/EXR/latest?c[FREQ]=M&c[CURRENCY]=CHF,GBP
https://ws-entry-point/data/dataflow/ECB/EXR/latest?key=M.CHF..,M.GBP..&c[OBS_STATUS]=N
Querying across all structures
https://ws-entry-point/data/all/all/all/latest?c[REF_AREA]=CH,GR,UK
Using operators
Retrieve all inflation data for the food and non-alcoholic beverages category (codes starting with 01) for Germany (DE), from 2015 onwards.
https://ws-entry-point/data/dataflow/ESTAT/ICP?c[REF_AREA]=DE&c[ICP_ITEM]=sw:01&c[TIME_PERIOD]=ge:2015