Glaciohound / LERP

Official Repository for ICLR 2023 Paper: Logical Entity Representation in Knowledge-Graphs for Differentiable Rule Learning
MIT License

How to interpret generated rules #3

Closed by Aqudi 1 year ago

Aqudi commented 1 year ago

Hi, I tried to generate rules after training with the LERP training script you uploaded. I got the following results. Can you tell me how to interpret them?

Also, after training on WN18 and WN18RR datasets, how can I check the trained rules?

script:

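# Relation names passed to both interpreter functions.
relation_names = [*data.relation_to_number.keys()]

# j: index of the target relation (set outside this excerpt).
for i in range(option.rank):
    print(learner.lerp.interpret_rule(j, i, relation_names))

for i in range(option.width):
    print(learner.lerp.interpret_lerp(-1, i, relation_names))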

generated rules:

1 brother 
(1.6279151624268816e-15, 'X(AND aunt^T)--daughter--father^T(AND Not(son^T))') 
(2.116390064830938e-20, 'X(AND uncle)--brother^T(AND daughter^T)--nephew(AND sister)') 
(9.526430003514366e-17, 'X(AND Not(aunt^T))--wife--wife^T(AND Not(mother))--sister^T') 
(1.3220911569078453e-05, 'X--daughter^T--nephew') 
(1.6444326225356957e-15, 'X(AND Not(mother))--nephew^T(AND uncle--niece)') 
(1.6161735402420163e-05, 'X--uncle--daughter') 
(4.633622290229241e-19, 'X(AND Not(wife))--father(AND Not(nephew))(AND Not(son))') 
(6.584387202135389e-11, 'X(AND And(son, husband))') 
(4.159390922901551e-23, 'X(AND And(mother, uncle^T))--mother(AND Not(uncle))(AND Not(son^T))--sister^T') 
(5.01396240336089e-14, 'X(AND Not(son^T))(AND sister--nephew^T)') 

If you can tell me how to extract the trained rules and interpret them, it will be very helpful for me to find the rules in the dataset.

Glaciohound commented 1 year ago

Hi. Thanks for your interest in our work!

Congrats, it seems you are using the interpret_rule function correctly. Each line output by interpret_rule is a compact representation of a learned rule. As we visualized in Figure 7 of the paper, each rule is a relation chain with logical functions attached to each node in the chain.

Therefore, in each line you can see a "backbone" formed by --, which represents that "relation chain". Let's take the 4th line as an example, since it has a relatively high score. There, X--daughter^T--nephew is such a relation chain, with X being the first variable in the chain. A relation marked with ^T is a reversed relation. More precisely, this line's chain can be written as $\text{daughter}(y, x) \land \text{nephew}(z, y)$. So if $y$ is the daughter of $x$ and nephew of $z$, then $z$ is the brother of $x$.
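Written out as a single implication, this is only a restatement of the sentence above, with the head relation brother taken from the `1 brother` header of the output:

$$
\text{brother}(z, x) \leftarrow \text{daughter}(y, x) \land \text{nephew}(z, y)
$$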

All (AND ...) components are logical functions applied to the node they follow. They are written recursively, in the same format as the main chain. If several appear on the same node, as in the last line, then all of those functions are applied to that node. This usually happens when a relation link in the main chain is an identity function, e.g., $\text{equal}(\cdot, \cdot)$, so that the logical functions applied to the two nodes actually refer to the same entity variable.
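To read many rules at once, it can help to separate the backbone from the node functions. Here is a minimal sketch, assuming only the string format shown in the output above; `split_top_level` and `parse_rule` are hypothetical helpers written for this illustration and are not part of the LERP repository.

```python
def split_top_level(rule, sep="--"):
    """Split on `sep` only outside parentheses, so that nested chains
    such as '(AND uncle--niece)' stay attached to their node."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(rule):
        ch = rule[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and rule.startswith(sep, i):
            parts.append(rule[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    parts.append(rule[start:])
    return parts


def parse_rule(rule):
    """Return (chain, node_functions) for one interpreted rule string.

    chain: list of (relation_name, is_reversed), one per link of the backbone.
    node_functions: raw '(AND ...)' text per node, starting with node X.
    """
    chain, node_functions = [], []
    for index, segment in enumerate(split_top_level(rule)):
        head, paren, rest = segment.partition("(")
        node_functions.append(paren + rest)  # empty string if no (AND ...)
        if index == 0:
            continue  # the first segment is the start variable X, not a relation
        is_reversed = head.endswith("^T")
        name = head[:-2] if is_reversed else head
        chain.append((name, is_reversed))
    return chain, node_functions


chain, funcs = parse_rule("X(AND Not(mother))--nephew^T(AND uncle--niece)")
print(chain)   # backbone links
print(funcs)   # functions attached to X and to the node after nephew^T
```

The node functions are returned as raw text on purpose: they follow the same recursive format, so the same splitter can be applied to their contents if a deeper breakdown is needed.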

Logical operators such as And (this is different from capitalized AND!) and Not mean what they are supposed to mean. For example, (AND Not(wife)) means a function $\not\exists z, \text{wife}(z, \cdot)$.

Please note that LERP is by design a continuous model with extensively mixed operators. This is why the confidence scores are uniformly low: the model also allocates weight to other possibilities. You might observe higher scores if the model is small and lower scores if it is large. Each interpreted rule is greedily selected, so it cannot fully represent the computation of the actual rule and is only an approximation built from the highest-weight choices. Some interpreted rules are not entirely correct (like the 4th line, where a daughter cannot actually be a nephew); the model might be mixing multiple things, which results in such an interpretation, or simply trying to accommodate randomness in the annotations.
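Since the absolute scores are spread thin, comparing them to each other is probably more informative than reading them as probabilities. Here is a minimal sketch, assuming the printed (score, rule) tuples have been collected into a list; `rank_rules` is a hypothetical helper for illustration, not part of the LERP repository.

```python
def rank_rules(rules, top_k=3):
    """Sort (score, rule_text) pairs by score, highest first, and report
    each score relative to the best one."""
    ordered = sorted(rules, key=lambda pair: pair[0], reverse=True)
    if not ordered:
        return []
    best = ordered[0][0]
    return [(score / best, score, text) for score, text in ordered[:top_k]]


# Three of the tuples from the output quoted in this issue.
rules = [
    (1.3220911569078453e-05, "X--daughter^T--nephew"),
    (1.6161735402420163e-05, "X--uncle--daughter"),
    (6.584387202135389e-11, "X(AND And(son, husband))"),
]

for relative, score, text in rank_rules(rules, top_k=2):
    print(f"{relative:.2f}  {score:.3e}  {text}")
```

Rules whose relative score is many orders of magnitude below the top one can usually be set aside when looking for the dominant patterns.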

The Bridged_LerpModel is more complex and is mainly designed to represent more complicated rules in a compact size. This is why I have not written an interpretation function for it. However, you are free to inspect its weights manually or propose a good solution for it!

Aqudi commented 1 year ago

Thank you!!! It was very helpful