FoodOntology / foodon

The core repository for the FOODON food ontology project. This holds the key classes of the ontology; larger files and the results of text-mining projects will be stored in other repos.
Creative Commons Attribution 4.0 International

FoodOn synonym usage #221

Open esdeboer opened 2 years ago

esdeboer commented 2 years ago

I want to use the FoodOn ontology to make an intelligent recipe search engine.

For this I want to use the synonyms to match ingredients to FoodOn classes, and I'm wondering what the logic behind them is: should a synonym be on the highest class for the term, or on the class that's most used in everyday speech (which might differ between locales)?

With herbs, the synonyms are used on both levels. For example, the garlic synonym is on garlic bulb (whole, raw), not on garlic food product. For dill it is on dill food product, although the word is commonly used for dill leaf (raw), which has the synonyms dill spice (which probably belongs on dill seed) and dill weed, but not dill. And with cloves, the clove synonym is on both clove (whole) and clove (whole, dried), while cloves food product has cloves as a synonym.

We have "cow milk raw" with the synonym milk, I guess because milk is usually associated with cow's milk (at least in the EU/US). But in a consumer context, milk probably does not mean raw milk, but rather cow milk (pasteurized).

'cow milk (semi-skimmed)' does not have a synonym; it should probably have the synonym semi-skimmed milk, to be consistent with the milk synonym above. However, we could also argue that we don't know which type of milk is meant, so the synonym should go on a generic milk (semi-skimmed) class, which does not exist.
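
To make the matching problem concrete, here is a minimal sketch of a label/synonym lookup, assuming the ontology has already been loaded into a plain dictionary. The class names are taken from the examples above, but the `EX:` identifiers, the `CLASSES` structure, and the helper names are hypothetical and are not real FoodOn IDs or any FoodOn API.

```python
from collections import defaultdict

# Hypothetical extract: class names come from the examples in this issue,
# the EX: identifiers are placeholders, not real FoodOn IDs.
CLASSES = {
    "EX:1": {"label": "clove (whole)", "synonyms": ["clove"]},
    "EX:2": {"label": "clove (whole, dried)", "synonyms": ["clove"]},
    "EX:3": {"label": "cloves food product", "synonyms": ["cloves"]},
    "EX:4": {"label": "cow milk raw", "synonyms": ["milk"]},
}


def normalize(term):
    """Lowercase and collapse whitespace so 'Clove ' matches 'clove'."""
    return " ".join(term.lower().split())


def build_index(classes):
    """Map every normalized label and synonym to the set of classes using it."""
    index = defaultdict(set)
    for class_id, entry in classes.items():
        for name in [entry["label"], *entry["synonyms"]]:
            index[normalize(name)].add(class_id)
    return index


def match(ingredient, index):
    """Return all candidate class IDs for an ingredient string, sorted."""
    return sorted(index.get(normalize(ingredient), set()))


index = build_index(CLASSES)
print(match("Clove", index))  # more than one candidate: the synonym is ambiguous
print(match("milk", index))   # one candidate, but it is the raw-milk class
```

With data shaped like this, a recipe ingredient such as "clove" resolves to two classes and "milk" resolves only to the raw class, which is exactly the inconsistency that makes the synonym placement matter for search.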

maweber-bia commented 2 years ago

I agree with you that the labels need to be rearranged in places for consistency; this is a work in progress.

The examples cited will help establish some rules of good practice for harmonizing the labels, especially on when to use parentheses versus comma-separated terms in the labels.

@ddooley: on a more conceptual level, we have to distinguish between part/whole and raw/processed materials to make the difference between those concepts clear, which I am sure you are already thinking about!

ddooley commented 1 year ago

Apologies for the delay in responding here! Yes, the synonymy is complicated and needs a term-by-term review. Adding to the situation is that we often bring in synonyms automatically from resources like NCBITaxon. We are going to be grappling with broad/narrow vs exact synonymy in an upcoming FoodOn curation call. Suffice to say, the difference in what a word refers to depending on its context (a farmer growing cauliflower in a field vs a home chef getting some from the fridge) requires guidelines for the use of broad vs exact synonyms. Ideally the resulting synonymy is of use in text mining applications. More on this soon.
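
As an illustration of how broad vs exact synonymy could be used in a text mining application, here is a rough sketch that ranks candidate classes by synonym scope. The scope names mirror the exact/broad/narrow/related synonym distinction used in OBO-style ontologies; the cauliflower classes, their `EX:` identifiers, the scope assignments, and the ranking order are all assumptions made up for this example, not FoodOn content or an agreed guideline.

```python
# Assumed preference order: a primary label beats an exact synonym,
# which beats narrow, broad, and related synonyms (a design choice, not a rule).
SCOPE_RANK = {"label": 0, "exact": 1, "narrow": 2, "broad": 3, "related": 4}

# Hypothetical rows of (class_id, name, scope); IDs and scopes are invented.
ROWS = [
    ("EX:10", "cauliflower plant", "label"),
    ("EX:10", "cauliflower", "broad"),
    ("EX:11", "cauliflower head (raw)", "label"),
    ("EX:11", "cauliflower", "exact"),
]


def resolve(term, rows=ROWS):
    """Return (class_id, scope) pairs matching term, best scope first."""
    key = " ".join(term.lower().split())
    hits = [(cid, scope) for cid, name, scope in rows if name.lower() == key]
    # Sort by scope rank, then by class ID so the output is deterministic.
    return sorted(hits, key=lambda hit: (SCOPE_RANK[hit[1]], hit[0]))


for class_id, scope in resolve("Cauliflower"):
    print(class_id, scope)
```

Under these assumed scopes, a recipe-oriented application would pick the class with the exact synonym first and could still fall back to the broad match, while an agricultural application might choose a different ranking.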