Functional dependencies are a generalization of unique constraints.
They are very much needed for virtualization when dealing with denormalized data, which is almost the de facto norm when data comes from files (e.g. JSON files, CSV extracts, Excel).
To give an example, consider a table describing people and their addresses. It has a unique constraint over `person_id`.
With functional dependencies, we can further declare that `city_id` determines `city_name` and `region_id`, that `region_id` determines `region_name` and `country_id`, and so on. These dependencies are not unique constraints, because the same determinant values are repeated in many rows.
Functional dependencies are also typically transitive (e.g. `city_id` ultimately determines `country_id`). It is important to let the processor compute the transitive closure, as specifying it manually can be very cumbersome and error-prone.
This feature is supported in Ontop lenses: https://ontop-vkg.org/guide/advanced/lenses.html#otherfunctionaldependency
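To illustrate what computing the transitive closure involves, here is a minimal Python sketch of the textbook attribute-closure algorithm applied to the address example. It is not how Ontop implements it; the function name `attribute_closure` and the `DEPENDENCIES` list are hypothetical and only mirror the columns mentioned above.

```python
def attribute_closure(attributes, dependencies):
    """Return every attribute determined by `attributes`.

    `dependencies` is a list of (determinants, dependents) pairs.
    This is the standard fixpoint algorithm, shown for illustration only.
    """
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinants, dependents in dependencies:
            # A dependency fires once all of its determinants are known.
            if set(determinants) <= closure and not set(dependents) <= closure:
                closure |= set(dependents)
                changed = True
    return closure


# Hypothetical declarations for the person/address table.
DEPENDENCIES = [
    (("city_id",), ("city_name", "region_id")),
    (("region_id",), ("region_name", "country_id")),
]

closure = attribute_closure({"city_id"}, DEPENDENCIES)
print(sorted(closure))
# ['city_id', 'city_name', 'country_id', 'region_id', 'region_name']
```

Only the two direct dependencies are declared; the fact that `city_id` determines `country_id` is derived, which is the part that becomes tedious to maintain by hand as the number of columns grows.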
At Ontopic, we use them a lot in projects. In particular, they are key for dealing with JSON-like data structures in large datasets available in platforms like BigQuery or SparkSQL. Without them, virtualization wouldn't have been feasible over these large datasets.
Functional dependencies enable some self-inner-join and self-left-join optimizations.
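As a rough illustration of why a self-join can become redundant, the following Python sketch uses an in-memory SQLite table (hypothetical data, reduced to three columns) in which `city_id` determines `city_name`. Assuming set semantics (`DISTINCT`), a self-join on `city_id` returns the same rows as a plain scan of the table. This only demonstrates the equivalence on sample data; it is not a description of Ontop's actual optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_address (
        person_id INTEGER PRIMARY KEY,
        city_id   INTEGER,
        city_name TEXT
    );
    -- city_id -> city_name holds in this sample data
    INSERT INTO person_address VALUES
        (1, 10, 'Bolzano'),
        (2, 10, 'Bolzano'),
        (3, 20, 'Trento');
""")

# The kind of self-join that can appear when mappings are unfolded.
with_self_join = """
    SELECT DISTINCT a.person_id, b.city_name
    FROM person_address a
    JOIN person_address b ON a.city_id = b.city_id
"""

# Equivalent query once the dependency city_id -> city_name is known.
without_self_join = """
    SELECT DISTINCT person_id, city_name
    FROM person_address
    WHERE city_id IS NOT NULL
"""

rows_joined = sorted(conn.execute(with_self_join).fetchall())
rows_simple = sorted(conn.execute(without_self_join).fetchall())
print(rows_joined == rows_simple)
```

Without the declared dependency, a processor cannot assume that `b.city_name` equals `a.city_name` whenever the `city_id` values match, so it has to keep the join.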