**Closed** — samehkamaleldin closed this issue 8 years ago.
All of my questions have been answered after reading Section 5.1 of your EMNLP 2015 paper.
Just to be sure your questions get answered correctly, and in case anyone else stumbles upon this issue, the answers to your questions are these:

`generalization` is NELL's encoding of type relationships (i.e., encoding that Barack Obama is a person would use the edge type `generalization`).

`-#-` is the feature separator.

`ANYREL:` is a prefix denoting the type of a particular feature (other prefixes are `SOURCE:`, `TARGET:`, and `VECSIM:`).

And, in general, you can also just look at the code that generates the feature matrix (the link goes to `outputFeatureMatrix`, in case the line number changes in the future).
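To make the prefix scheme concrete, here is a minimal Python sketch (not part of the SFE codebase; the helper name is made up) that splits a `weights.tsv` feature string into its type prefix and its path:

```python
# Hypothetical helper, not from the SFE codebase: split a feature
# string into (prefix, path). Plain PRA path features carry no prefix.
KNOWN_PREFIXES = ("SOURCE:", "TARGET:", "ANYREL:", "VECSIM:")

def split_feature(feature):
    for prefix in KNOWN_PREFIXES:
        if feature.startswith(prefix):
            return prefix, feature[len(prefix):]
    return "", feature

print(split_feature("ANYREL:-_hypernym-@ANY_REL@-_hyponym-"))
# ('ANYREL:', '-_hypernym-@ANY_REL@-_hyponym-')
print(split_feature("-__part_of-"))
# ('', '-__part_of-')
```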
This is part of the `weights.tsv` file generated by training a `LogisticRegressionModel` on features from the WN18 dataset, for the relation `_has_part`:
```
-__part_of- 5.084440955701115
-__hyponym-_has_part-__hypernym- 1.7542840209412862
ANYREL:-_@ANY_REL@- 1.3409053287843857
-_hypernym-_hyponym-_hyponym- 1.0788205509200834
-__part_of-_part_of-__part_of- 1.045527586769945
ANYREL:-_hypernym-_@ANY_REL@-_hyponym- 1.0418866564174691
-_member_of_domain_topic-__part_of-__hypernym- 1.0258116342225538
-_member_of_domain_topic-_has_part-__hypernym- 1.0258116342225538
ANYREL:-_has_part-_hypernym-@ANY_REL@-__hypernym- 1.0040495743253814
ANYREL:-__hyponym-_hypernym-@ANY_REL@-_hyponym- 0.944886623292448
```
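For reference, a listing like the one above can be read back into (feature, weight) pairs with a few lines of Python. This is an illustrative sketch, assuming one feature and one weight per line separated by whitespace (the real file is tab-separated):

```python
# Sketch: parse a weights.tsv-style listing into a feature -> weight map.
lines = """\
-__part_of- 5.084440955701115
-__hyponym-_has_part-__hypernym- 1.7542840209412862
ANYREL:-_@ANY_REL@- 1.3409053287843857
""".splitlines()

weights = {}
for line in lines:
    feature, weight = line.rsplit(None, 1)  # split on the last whitespace
    weights[feature] = float(weight)

# The strongest feature for _has_part is the inverse-looking path:
print(max(weights, key=weights.get))  # -> -__part_of-
```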
I find it strange that the relation `_has_part` itself appears in the feature set, which I didn't expect. As I understand it, the feature extraction shouldn't take the target relation into account, but here it does.
And here is how I generate the data and do the feature extraction:
```scala
val SFE_SPECS =
  """
  {
    "type": "subgraphs",
    "path finder": {
      "type": "BfsPathFinder",
      "number of steps": 2
    },
    "feature extractors": [
      "PraFeatureExtractor",
      "AnyRelFeatureExtractor"
    ],
    "feature size": -1
  }
  """
val relationName = "_has_part"
val negativeExampleSelector = new PprNegativeExampleSelector(params \ "negative instances", graph, outputter)
val data_with_negatives = negativeExampleSelector.selectNegativeExamples(data, possibleSources, possibleTargets)
val sfeGenerator = new NodePairSubgraphFeatureGenerator(
  parse(SFE_SPECS),
  relationName,
  RelationMetadata.empty,
  outputter
)
val trainingMatrix = sfeGenerator.createTrainingMatrix(data_with_negatives)
```
What is `ANYREL:`? What is the type of these features and what do they represent?

Consider the relation WriterWroteBook. Say you have two facts, (JK Rowling, wrote, Harry Potter 1) and (JK Rowling, wrote, Harry Potter 2), and that you know (Harry Potter 1, sequel, Harry Potter 2). The path (JKR --wrote--> Harry Potter 1 --sequel--> Harry Potter 2) is incredibly informative for predicting that JKR wrote HP2. Similarly, the path [wrote, _sequel] is informative for predicting that JKR wrote HP1. The trouble is that if both of these are training examples, and you remove both of them when extracting features, you can't use either feature. So what I do is some fancy footwork in the code so that an edge is only excluded if it's the current training (or testing) edge; the (JKR, HP1) example can use the (JKR, HP2) edge, and vice versa. So, yes, you will see features for `_has_part` that contain `_has_part` in a longer path, but this is not cheating, because they use other known instances of `_has_part`. If the model were actually using the training or testing edge, you would see basically perfect accuracy from the classifier.
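The exclusion rule described above can be sketched in a few lines. This is an illustrative toy with made-up edges and a made-up function name, not the actual SFE code:

```python
# Sketch (not the actual SFE code) of the edge-exclusion rule: when
# extracting features for a training instance, only that instance's own
# edge is hidden, not every edge with the same relation.
known_edges = {
    ("JKR", "wrote", "HP1"),
    ("JKR", "wrote", "HP2"),
    ("HP1", "sequel", "HP2"),
}

def visible_edges(current_instance):
    """All edges usable for feature extraction on current_instance."""
    return known_edges - {current_instance}

# Extracting features for (JKR, wrote, HP1) may still walk the
# (JKR, wrote, HP2) edge, and vice versa:
assert ("JKR", "wrote", "HP2") in visible_edges(("JKR", "wrote", "HP1"))
assert ("JKR", "wrote", "HP1") not in visible_edges(("JKR", "wrote", "HP1"))
```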
Why are some features not prefixed with `ANYREL:`, even though I'm using the `AnyRelFeatureExtractor`? What is the type of the non-prefixed features? I expect that all features are paths starting from the source node but not necessarily ending at the target node, is that right?
You are using both the `AnyRelFeatureExtractor` and the `PraFeatureExtractor`. Features that have no prefix come from the `PraFeatureExtractor`, and are paths connecting the source node to the target node.
Also, it appears that `_part_of` is the inverse of `_has_part`. Is this true? If it is, you need to specify that, or your experiment will not be correct. The model will learn that when the inverse is present, it should predict an edge, and when the inverse isn't present, it won't predict an edge. Then, at test time, you'll either be cheating by using the inverse, or you'll predict nothing.

To fix this, either remove the `_part_of` relation entirely (if it really just duplicates the `_has_part` relation, there is no point in having it in the graph; it will just slow down the code and make learning harder), or specify the inverse relationship in the relation metadata. The first option is definitely preferred.
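A quick way to check whether `_part_of` merely mirrors `_has_part` is to compare the reversed edge sets. This is a toy sketch with a made-up edge list, not code from the repository:

```python
# Sketch: detect whether _part_of just mirrors _has_part in a toy edge
# list. If every _has_part edge has a reversed _part_of twin (and vice
# versa), keeping both relations adds no information.
edges = [
    ("car", "_has_part", "wheel"),
    ("wheel", "_part_of", "car"),
]

has_part = {(s, t) for s, r, t in edges if r == "_has_part"}
part_of = {(t, s) for s, r, t in edges if r == "_part_of"}

print(has_part == part_of)  # True -> _part_of duplicates _has_part
```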
Your comments answer the questions, so I think this should be closed now.
Hi @matt-gardner, I read the explanation above and it was clear, but in my weights.tsv there appears to be this `-@ANY_REL@-` symbol, either inside or at the end of a path feature. Can you explain what this symbol indicates? Here is part of the weights.tsv:

```
ANYREL:-phone_of_u1-phone_of_u1-_cert_of_u2-@ANY_REL@- 1.3428828694307187
ANYREL:-_u1_payto_u2-_u1_payto_u2-@ANY_REL@-cert_of_u1- 1.0217711921277646
ANYREL:-card_of_u1-_card_of_u1-cert_of_u2-@ANY_REL@- 0.8942525084728918
ANYREL:-_cert_of_u1-@ANY_REL@-card_of_u1-_u1_payto_u2- 0.7607560172498669
-_cert_of_u1-_cert_of_card-card_of_u1-_u1_payto_u2- 0.7607560172498669
ANYREL:-_cert_of_u1-_cert_of_card-card_of_u1-@ANY_REL@- 0.7607560172498669
ANYREL:-_cert_of_u1-_cert_of_card-@ANY_REL@-_u1_payto_u2- 0.7607560172498669
ANYREL:-@ANY_REL@-phone_of_u1-_dev_of_u1-dev_of_u1- 0.684886880419521
ANYREL:-@ANY_REL@-phone_of_u2-_cert_of_u2-cert_of_u1- 0.6784595039383237
ANYREL:-_cert_of_u2-@ANY_REL@-u1_payto_u2- 0.675731779371918
-geo_of_u1-geo_of_u1- 0.6612471349346711
ANYREL:-@ANY_REL@-geo_of_u1- 0.6612471349346711
ANYREL:-geo_of_u1-@ANY_REL@- 0.6612471349346711
ANYREL:-phone_of_u2-@ANY_REL@-card_of_u1- 0.6527712367035824
-phone_of_u2-phone_of_card-card_of_u1- 0.6527712367035824
ANYREL:-phone_of_u2-phone_of_card-@ANY_REL@- 0.6527712367035824
-_phone_of_u2-_phone_of_card-cert_of_card-_cert_of_u1- 0.64549966517749
ANYREL:-_phone_of_u2-@ANY_REL@-cert_of_card-_cert_of_u1- 0.64549966517749
```
See section 5.1 here: http://rtw.ml.cmu.edu/emnlp2015_sfe/paper.pdf.
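As a rough illustration of what Section 5.1 describes: an ANYREL feature appears to be a normal path feature with one relation replaced by the `@ANY_REL@` placeholder, so each concrete path yields one abstracted variant per position. A hedged Python sketch (the helper name is made up, and this is not the actual generation code):

```python
# Sketch: abstract a concrete PRA path by substituting @ANY_REL@ for
# each relation in turn, producing one ANYREL: feature per position.
def anyrel_variants(path):
    """path is a list of relation names from a PRA path feature."""
    variants = []
    for i in range(len(path)):
        abstracted = path[:i] + ["@ANY_REL@"] + path[i + 1:]
        variants.append("ANYREL:-" + "-".join(abstracted) + "-")
    return variants

print(anyrel_variants(["phone_of_u2", "phone_of_card"]))
# ['ANYREL:-@ANY_REL@-phone_of_card-', 'ANYREL:-phone_of_u2-@ANY_REL@-']
```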
I did manage to extract features using the configuration file approach, and they are stored in two files, `training_matrix.tsv` and `test_matrix.tsv`.

I've extracted a random observation from the `training_matrix.tsv` file for different relations, but I couldn't understand what the feature representation means. This is an example from the relation `concept:actorstarredinmovie`; the following is a single observation line that I copied into a new file and tried to decompose into a set of features to understand what they look like, but I couldn't. My questions:

- What is the difference between `generalization` and `_generalization`?
- Is the `1.0` redundant in the training observation?
- What does `-#-` mean?
- Is `ANYREL:` something that defines a new feature?
- If the following is a full valid feature (as I assume), what does it represent?