Quaffel closed this 2 years ago
Improved Wikidata's database for the following entries:
SPARQL query for the Wikidata Query Service to retrieve all units used in all items that we consider cocktails:
SELECT DISTINCT ?unitLabel ?ingredientLabel ?cocktailLabel
WHERE
{
  # Retrieves all entities that are at least one of the following:
  #  - an instance of "cocktail"
  #  - a subclass of "cocktail"
  #  - an instance of a subclass of "cocktail"
  ?cocktail wdt:P31?/wdt:P279* wd:Q134768.
  # Filter out classes (instances of Q16889133; "classes")
  FILTER NOT EXISTS {
    ?cocktail wdt:P31 wd:Q16889133
  }
  ?cocktail p:P186 ?ingredientStatement.
  ?ingredientStatement ps:P186 ?ingredient;
                       pqv:P1114/wikibase:quantityAmount ?ingredientAmount;
                       pqv:P1114/wikibase:quantityUnit ?unit.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
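To inspect the results outside the query editor, the JSON returned for a query like the one above can be grouped by unit label. This is only a minimal Python sketch, assuming the standard SPARQL JSON results layout (`results.bindings`, with each variable carrying a `value`). The function name `collect_units` and the sample payload are made up for illustration and are not actual query output.

```python
import json


def collect_units(payload):
    """Group (cocktail, ingredient) pairs by the label of the unit they use."""
    by_unit = {}
    for row in payload["results"]["bindings"]:
        unit = row["unitLabel"]["value"]
        pair = (row["cocktailLabel"]["value"], row["ingredientLabel"]["value"])
        by_unit.setdefault(unit, []).append(pair)
    return by_unit


# Illustrative payload only; these rows are NOT real query results.
sample = json.loads("""
{"results": {"bindings": [
  {"unitLabel": {"value": "centilitre"},
   "ingredientLabel": {"value": "gin"},
   "cocktailLabel": {"value": "Example Cocktail A"}},
  {"unitLabel": {"value": "1"},
   "ingredientLabel": {"value": "strawberry"},
   "cocktailLabel": {"value": "Example Cocktail B"}}
]}}
""")

units = collect_units(sample)
for unit in sorted(units):
    print(unit, units[unit])
```

Grouping by unit makes it easy to spot which cocktails still rely on the unit "1" for a given ingredient.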
SPARQL query to retrieve all ingredients that have no specified unit (i.e., Q199 ("1")):
SELECT DISTINCT ?unitLabel ?ingredientLabel ?cocktailLabel
WHERE
{
  ?cocktail wdt:P31?/wdt:P279* wd:Q134768.
  FILTER NOT EXISTS {
    ?cocktail wdt:P31 wd:Q16889133
  }
  ?cocktail p:P186 ?ingredientStatement.
  ?ingredientStatement ps:P186 ?ingredient;
                       pqv:P1114/wikibase:quantityUnit ?unit.
  VALUES ?unit { wd:Q199 }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Now that no cocktail uses "wholes" to describe the amount of a fluid ingredient, we should stop converting wholes to volumes and display them as wholes instead. Treating "whole ingredients" as not contributing to the overall fluid volume is a simplification (ice cubes melt and sugar cubes dissolve), but it should be good enough for our purposes.
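A rough Python sketch of what this could look like follows. It is not the project's actual code: the names `format_amount`, `fluid_volume_ml`, and `ML_PER_UNIT` are hypothetical, and the conversion table is keyed by placeholder unit names rather than the item IDs the app presumably uses. Only Q199 as the "wholes" unit is taken from this issue.

```python
WHOLE_UNIT = "Q199"  # Wikidata item for the unit "1" ("wholes")

# Hypothetical conversion table; keys are placeholder names, not Wikidata IDs.
ML_PER_UNIT = {"millilitre": 1.0, "centilitre": 10.0, "litre": 1000.0}


def format_amount(amount, unit, ingredient):
    """Render one ingredient line; wholes stay counts, volumes become ml."""
    if unit == WHOLE_UNIT:
        # Wholes are not volumes, so show them unconverted ("1 lemon slice").
        return f"{amount:g} {ingredient}"
    millilitres = amount * ML_PER_UNIT[unit]
    return f"{millilitres:g}ml {ingredient}"


def fluid_volume_ml(ingredients):
    """Sum the volume of fluid ingredients; wholes are assumed to add nothing."""
    return sum(
        amount * ML_PER_UNIT[unit]
        for amount, unit, _ingredient in ingredients
        if unit != WHOLE_UNIT
    )


recipe = [(4, "centilitre", "gin"), (1, WHOLE_UNIT, "lemon slice")]
for amount, unit, ingredient in recipe:
    print(format_amount(amount, unit, ingredient))
print(fluid_volume_ml(recipe))
```

Keeping the "wholes" check in one place means the display and the total-volume calculation cannot disagree about which ingredients count as fluid.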
When fetching data from Wikidata, we normalize the volumes ourselves. This also applies to the unit with the symbol "1", which represents "wholes", as in "1 banana" or "1 lemon slice". In Wikidata, this unit (Q199 ("1")) is used whenever no unit is specified.
This leads to some interesting issues:
DrinkCards. This leads to rather interesting amounts such as "89ml strawberry".
I performed a small analysis to figure out how significant this issue is. A query on all of the ingredients that make use of this unit yields the following results:
Entries with a (x2) indicator were returned by the query twice instead of just once. It is unclear to me under which circumstances duplicates can occur.
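Repeated rows can be collapsed into a single annotated entry when post-processing the results. The sketch below is one possible way to do that in Python; it does not explain why the duplicates occur. The function name `mark_duplicates` and the sample rows are invented for illustration.

```python
from collections import Counter


def mark_duplicates(rows):
    """Collapse repeated (cocktail, ingredient) rows into one annotated entry."""
    counts = Counter(rows)  # keeps first-seen order of the rows
    entries = []
    for (cocktail, ingredient), n in counts.items():
        entry = f"{cocktail}: {ingredient}"
        if n > 1:
            entry += f" (x{n})"
        entries.append(entry)
    return entries


# Made-up rows, not real query results.
rows = [
    ("Example Cocktail A", "strawberry"),
    ("Example Cocktail A", "strawberry"),
    ("Example Cocktail B", "banana"),
]
for entry in mark_duplicates(rows):
    print(entry)
```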