The VTL Engine is a Python library for validating and running VTL (Validation and Transformation Language) scripts.
The VTL Engine requires Python 3.10 or higher.
To install the VTL Engine on any Operating System, you can use pip:
pip install vtlengine
Note: it is recommended to install the VTL Engine in a virtual environment.
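As a quick sanity check before installing, the following sketch uses only the standard library to confirm that the interpreter meets the 3.10 requirement and to report whether it is running inside a virtual environment. The function names are illustrative and are not part of the VTL Engine.

```python
import sys

MIN_VERSION = (3, 10)  # minimum Python version stated above


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given version tuple satisfies the minimum."""
    return tuple(version_info[:2]) >= minimum


def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python version OK:", meets_minimum())
    print("Inside a virtual environment:", in_virtual_environment())
```

If the first check prints False, pip may refuse to install the package or install an incompatible release, so it is worth running this with the same interpreter that pip will use.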
The VTL Engine API implements two basic methods:
- semantic_analysis, which validates the correctness of a VTL script and calculates the data structures of the datasets it generates.
- run, which executes a VTL script with input datapoints.
Any action with VTL requires the following elements as input:
- The VTL script.
- The data structures of the input datasets.
- The external routines. Use None if external routines are not applicable to the VTL script.
- The value domains. Use None if value domains are not applicable to the VTL script.
The semantic_analysis method validates the correctness of a VTL script and calculates the data structures of the datasets generated by the script itself (that calculation is a prerequisite for the semantic analysis).
from vtlengine import semantic_analysis

script = """
DS_A := DS_1 * 10;
"""

data_structures = {
    'datasets': [
        {'name': 'DS_1',
         'DataStructure': [
             {'name': 'Id_1',
              'type': 'Integer',
              'role': 'Identifier',
              'nullable': False},
             {'name': 'Me_1',
              'type': 'Number',
              'role': 'Measure',
              'nullable': True}
         ]
         }
    ]
}

sa_result = semantic_analysis(script=script, data_structures=data_structures)
print(sa_result)
Returns:
{'DS_A': Dataset(name='DS_A', components={'Id_1': Component(name='Id_1', data_type=<class 'vtlengine.DataTypes.Integer'>, role=<Role.IDENTIFIER: 'Identifier'>, nullable=False), 'Me_1': Component(name='Me_1', data_type=<class 'vtlengine.DataTypes.Number'>, role=<Role.MEASURE: 'Measure'>, nullable=True)}, data=None)}
The second example is identical to the first, except that Me_1 is of the String data type instead of Number, so it cannot be multiplied by 10.
from vtlengine import semantic_analysis

script = """
DS_A := DS_1 * 10;
"""

data_structures = {
    'datasets': [
        {'name': 'DS_1',
         'DataStructure': [
             {'name': 'Id_1',
              'type': 'Integer',
              'role': 'Identifier',
              'nullable': False},
             {'name': 'Me_1',
              'type': 'String',
              'role': 'Measure',
              'nullable': True}
         ]
         }
    ]
}

sa_result = semantic_analysis(script=script, data_structures=data_structures)
print(sa_result)
This raises the following error:
raise SemanticError(code="1-1-1-2",
vtlengine.Exceptions.SemanticError: ('Invalid implicit cast from String and Integer to Number.', '1-1-1-2')
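Both examples build the data_structures dictionary by hand, which becomes repetitive as the number of components grows. The sketch below shows one way to generate the same dictionary from a compact description. The helper make_data_structures is hypothetical and not part of the VTL Engine; it only assumes the dictionary layout shown in the examples above.

```python
def make_data_structures(datasets):
    """Build the 'datasets' dictionary from a compact description.

    `datasets` maps a dataset name to a list of
    (component name, type, role, nullable) tuples.
    This is a hypothetical helper, not a VTL Engine function.
    """
    return {
        "datasets": [
            {
                "name": dataset_name,
                "DataStructure": [
                    {"name": name, "type": type_, "role": role, "nullable": nullable}
                    for name, type_, role, nullable in components
                ],
            }
            for dataset_name, components in datasets.items()
        ]
    }


# Same structure as the first example: an Integer identifier
# and a nullable Number measure.
data_structures = make_data_structures(
    {
        "DS_1": [
            ("Id_1", "Integer", "Identifier", False),
            ("Me_1", "Number", "Measure", True),
        ]
    }
)
print(data_structures)
```

Changing "Number" to "String" in the tuple for Me_1 reproduces the input of the second example.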
The run method executes a VTL script with input datapoints.
It returns a dictionary with all the generated Datasets. When the output parameter is set, the engine writes the result of the computation to the output folder; otherwise, it includes the data in the dictionary of the computed datasets.
Two validations are performed before running, and either can raise an error:
- Semantic analysis: equivalent to calling the semantic_analysis method.
- Data load analysis: a basic check of the data structure (names and types).
from vtlengine import run
import pandas as pd

script = """
DS_A := DS_1 * 10;
"""

data_structures = {
    'datasets': [
        {'name': 'DS_1',
         'DataStructure': [
             {'name': 'Id_1',
              'type': 'Integer',
              'role': 'Identifier',
              'nullable': False},
             {'name': 'Me_1',
              'type': 'Number',
              'role': 'Measure',
              'nullable': True}
         ]
         }
    ]
}

data_df = pd.DataFrame(
    {"Id_1": [1, 2, 3],
     "Me_1": [10, 20, 30]})

datapoints = {"DS_1": data_df}

run_result = run(script=script, data_structures=data_structures,
                 datapoints=datapoints)
print(run_result)
Returns:
{'DS_A': Dataset(name='DS_A', components={'Id_1': Component(name='Id_1', data_type=<class 'vtlengine.DataTypes.Integer'>, role=<Role.IDENTIFIER: 'Identifier'>, nullable=False), 'Me_1': Component(name='Me_1', data_type=<class 'vtlengine.DataTypes.Number'>, role=<Role.MEASURE: 'Measure'>, nullable=True)}, data=   Id_1   Me_1
0     1  100.0
1     2  200.0
2     3  300.0)}
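To make the result above easier to read, here is a plain-Python sketch of what DS_A := DS_1 * 10 does to the datapoints in this example: identifier values are kept and measure values are multiplied by the scalar. It only mirrors the output shown above and is not how the engine is implemented. The helper multiply_measures is hypothetical, and its handling of missing values is an assumption.

```python
def multiply_measures(rows, measures, factor):
    """Multiply the measure columns of each row by a scalar factor.

    Hypothetical illustration only; not part of the VTL Engine.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for measure in measures:
            value = row[measure]
            # Assumption: a missing measure stays missing
            # (Me_1 is declared nullable in the example).
            new_row[measure] = None if value is None else float(value) * factor
        result.append(new_row)
    return result


# Same datapoints as the DataFrame passed to run in the example.
ds_1 = [
    {"Id_1": 1, "Me_1": 10},
    {"Id_1": 2, "Me_1": 20},
    {"Id_1": 3, "Me_1": 30},
]
ds_a = multiply_measures(ds_1, measures=["Me_1"], factor=10)
print(ds_a)
```

The measure values come out as floats here to match the Number type of Me_1 in the engine output, while Id_1 is left untouched because it is an identifier.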
For more information on usage, please refer to the API documentation.