predict-idlab / pyRDF2Vec

šŸ Python Implementation and Extension of RDF2Vec
https://pyrdf2vec.readthedocs.io/en/latest/
MIT License
246 stars, 51 forks

Mutag example has property chain to non-literal. What is the proper way to construct literal predicate chains from an ontology? #250

Open timduval-unifylogic opened 8 months ago

timduval-unifylogic commented 8 months ago

ā“ What is the proper way to construct literal predicate chains from an ontology?

I am attempting to create embeddings for an ontology.

I looked through the issues to see if this was already answered. Maybe I overlooked something, but I'm a little confused about how to construct the predicate chains properly, using the mutag example as a reference.

Here is an excerpt from the provided rdflib example where the literal predicate chains are defined. I noticed that inBond's range is not a literal, which left me unsure how to construct the chains correctly from an ontology.

# "samples/mutag/mutag.owl",
skip_predicates={"http://dl-learner.org/carcinogenesis#isMutagenic"},
literals=[
    [
        "http://dl-learner.org/carcinogenesis#hasBond", # domain: Compound , range: Bond
        "http://dl-learner.org/carcinogenesis#inBond",  # domain: Bond, range: Atom  <-- this is not a literal? I'm confused?
    ],
    [
        "http://dl-learner.org/carcinogenesis#hasAtom", # domain: Compound, range: Atom
        "http://dl-learner.org/carcinogenesis#charge",  # domain: Atom, range: xsd:double
    ],
],

Is there any documentation/guidance on how to construct these? Where do I begin the walk of the properties (building the chain) down to the eventual properties whose range is an XSD primitive (e.g. xsd:double)?

Does this mean that there could be more than two items in each literal predicate chain entry, as in:

literals=[
    [
        "http://example.com#prop1",    # domain: ClassA, range: ClassB
        "http://example.com#prop2",    # domain: ClassB, range: ClassC
        "http://example.com#prop3",    # domain: ClassC, range: xsd:string 
    ],
]

I have a rather large ontology and I want to make sure I am doing it correctly. Any help is greatly appreciated!!

GillesVandewiele commented 7 months ago

Hi Tim,

The code to extract literals is defined here: https://github.com/IBCNServices/pyRDF2Vec/blob/940ef534cd44698dfb625a0f55a47b781a8dacae/pyrdf2vec/graphs/kg.py#L330

In a nutshell, you provide a list of entities [e_1, ..., e_n] and a list of predicate walks (e.g. one predicate walk could be [pred1, pred2]).

We then look for all walks of the form e_i -> pred1 -> * -> pred2 -> literal for every entity in your list, where * is a wildcard that can be anything. If multiple such walks are found, a list of literals is returned and you'll have to aggregate it yourself (mean/max/...). The result is a list of n lists (n being the number of entities) with variable lengths.
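
The lookup described above can be sketched in plain Python. This is only an illustration on a toy graph, not pyRDF2Vec's actual implementation: the triples, the `ex:` names, and the helpers `build_index`, `follow_chain`, `extract_literals`, and `aggregate` are all hypothetical, and the sketch assumes each step of a chain simply follows every edge with the matching predicate.

```python
from collections import defaultdict
from statistics import mean

# Toy (subject, predicate, object) triples; every name here is hypothetical.
TRIPLES = [
    ("ex:compound1", "ex:hasAtom", "ex:atom1"),
    ("ex:compound1", "ex:hasAtom", "ex:atom2"),
    ("ex:atom1", "ex:charge", 0.25),
    ("ex:atom2", "ex:charge", -0.75),
    ("ex:compound2", "ex:hasAtom", "ex:atom3"),
    ("ex:atom3", "ex:charge", 0.5),
    ("ex:compound3", "ex:hasBond", "ex:bond1"),
]


def build_index(triples):
    """Map (subject, predicate) to the list of objects reachable in one hop."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index


def follow_chain(index, entity, chain):
    """Return every value reached from `entity` by following `chain` in order.

    The intermediate nodes act as the wildcard: any node reached through
    the previous predicate is accepted.
    """
    frontier = [entity]
    for predicate in chain:
        frontier = [
            obj
            for node in frontier
            for obj in index.get((node, predicate), [])
        ]
    return frontier


def extract_literals(triples, entities, chains):
    """result[i][j] holds the values found for entities[i] via chains[j]."""
    index = build_index(triples)
    return [
        [follow_chain(index, entity, chain) for chain in chains]
        for entity in entities
    ]


def aggregate(values, reducer=mean):
    """Collapse a variable-length list of literals; None if nothing was found."""
    return reducer(values) if values else None


entities = ["ex:compound1", "ex:compound2", "ex:compound3"]
chains = [["ex:hasAtom", "ex:charge"]]

literals = extract_literals(TRIPLES, entities, chains)
# literals == [[[0.25, -0.75]], [[0.5]], [[]]]

mean_charge = [aggregate(per_chain[0]) for per_chain in literals]
# mean_charge == [-0.25, 0.5, None]

max_charge = [aggregate(per_chain[0], max) for per_chain in literals]
# max_charge == [0.25, 0.5, None]
```

In this sketch a chain may hold any number of predicates, since each one just advances the frontier by one hop; whether the library treats longer chains the same way is best checked against the linked `kg.py`.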

Hope that clears it up, if not, feel free to ask!

timduval-unifylogic commented 7 months ago

This is very helpful. Thanks!

