AlexKuhnle / ShapeWorld

MIT License

Size-rel in Selection Dataset #23

Closed furkanbiten closed 4 years ago

furkanbiten commented 4 years ago

Hi Alex,

I have seen some examples of size relations in the Selection dataset. However, the language JSON seems to have no key for them.

Was there a reason you left size-rel out of the Selection dataset? If not, would it be possible for you to share it, if it is not too much trouble of course?

AlexKuhnle commented 4 years ago

Hey,

Very good question. I need to have a look whether there was a reason I removed it.

AlexKuhnle commented 4 years ago

Clearly they have been removed in the last change for that file. Now the only question is why... :-)

AlexKuhnle commented 4 years ago

If you don't mind, just try adding these lines again and see whether it works. If you can't find any problems, we could add it again via PR (unless I can think of the reason for removing them).

furkanbiten commented 4 years ago

Ok, I managed to make it work but that took some digging and I am pretty sure this is not the solution at all.

First of all, when I update english.json with the shade and size selectors, the code hangs forever, and the problem was in sample_values. I think, but am not sure, that you closed off the size and shade selectors in there.

The problem, as far as I could identify, was this for loop, and not just the first part of the ifs. I tried deleting this from the if -> self.incorrect_predtype in ('size-two', 'size-max') but no luck.

So, I went into the commit history for selector.py, and when I manually change the sample_values function back to the previously committed version, everything works!

NOW, to quote you: "Now the only question is why... :-)"

AlexKuhnle commented 4 years ago

I've had a look at it, and my latest commit may have fixed the problem (plus added the constructions back to the language file). The motivation behind these lines:

If we have a size-related predtype (self.predtype in ('size-two', 'size-max')), then either subject or object should mention the corresponding shape, since size comparisons are only valid between the same shape (not predication.redundant(predicate='shape') and not scope_predication.redundant(predicate='shape')), plus we may want to prevent redundant specifications of the attribute (not self.logical_redundancy and predication.redundant(predicate='shape') and scope_predication.redundant(predicate='shape')).

The last bit is, I think, unnecessary. It says that, if we want to avoid contradictions, we don't want to talk about size if the shape attribute is blocked by the scope, meaning it is the target to be made incorrect for an incorrect scope description (not self.logical_contradiction and scope_predication.blocked(predicate='shape')). However, while I think this was possible at some point, it caused problems, so the option of making the scope of a selector incorrect was removed, and the check may no longer be necessary.

Why is it problematic? If we have "The biggest square" and turn it into "The biggest circle", but there is no circle, this is not correct English, since a statement like "the X" presupposes the existence and makes a (possibly false) statement about something on top of that. Moreover, even if we have just one circle, using "biggest" is slightly inappropriate, since again it presupposes more than one target.

I'm not entirely sure whether there was another subtlety related to the fact that every "noun phrase"/"object description" should, in principle, be possible to be made incorrect, but I don't see (anymore) why this would be a problem here. Since this last commit for selector.py happened in the last weeks of my PhD, it may have also just been due to some confused thoughts about semantics. :-D
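
For anyone reading along, here is a minimal sketch of how the three checks described above might fit together. It is not the actual selector.py code: StubPredication and skip_size_selector are hypothetical names, and combining the three checks with "or" is an assumption. Only the individual expressions and the two defaults (redundancy allowed, contradictions disallowed) come from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class StubPredication:
    """Hypothetical stand-in for a predication object; not ShapeWorld's class."""
    redundant_preds: set = field(default_factory=set)
    blocked_preds: set = field(default_factory=set)

    def redundant(self, predicate):
        # True if the attribute is already mentioned, so repeating it is redundant.
        return predicate in self.redundant_preds

    def blocked(self, predicate):
        # True if the attribute is reserved as the target to be made incorrect.
        return predicate in self.blocked_preds


def skip_size_selector(predtype, predication, scope_predication,
                       logical_redundancy=True, logical_contradiction=False):
    """Return True if a size selector should be rejected (assumed combination)."""
    if predtype not in ('size-two', 'size-max'):
        return False
    # Check 1: neither side mentions the shape, e.g. "The biggest shape is a green shape."
    no_shape = (not predication.redundant(predicate='shape')
                and not scope_predication.redundant(predicate='shape'))
    # Check 2: redundancy is disallowed and both sides mention the shape.
    double_shape = (not logical_redundancy
                    and predication.redundant(predicate='shape')
                    and scope_predication.redundant(predicate='shape'))
    # Check 3: contradictions are disallowed and the scope's shape is blocked.
    blocked_scope = (not logical_contradiction
                     and scope_predication.blocked(predicate='shape'))
    return no_shape or double_shape or blocked_scope
```

Under these assumptions, a selector where neither side mentions a shape is skipped, while one where both sides mention it passes as long as redundancy is allowed.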

furkanbiten commented 4 years ago

First off, thanks for this explanation, it helps a lot!

To summarize, you are saying that if I comment out the "if statement" (self.predtype in ('size-two', 'size-max') and not predication.redundant(predicate='shape') and not scope_predication.redundant(predicate='shape') and ....) for size selectors in these lines, then I will get sentences that compare different shapes (e.g. a circle to a rectangle) and that can also contain redundant specifications of an attribute. In particular, this won't create a known problem if you only want to produce correct sentences (I am trying to summarize what I understand).

"Since this last commit for selector.py happened in the last weeks of my PhD, it may have also just been due to some confused thoughts about semantics. :-D"

Hopefully, I will relate to this soon enough :)

PS: Will be closing the issue since its main purpose is fulfilled.

AlexKuhnle commented 4 years ago

Redundancy is allowed anyway by default, so sentences like "The bigger square is a red square." should be possible (not even sure whether this check is needed here or whether it's 'redundant' ;-). Selector sentences shouldn't produce wrong sentences like "The bigger square is a circle.", but this is not possible unless logical contradictions are allowed (false by default). So I think the only restriction which you would remove by commenting out this line is that it would then produce sentences like "The biggest shape is a green shape.". The semantics of such sentences are a bit problematic due to the conservative definition of "bigger" in SW: comparisons need to be the same shape and smaller to be true, they need to be clearly smaller w.r.t. area (by a margin) to be false, and they are undefined otherwise (that is, if an object is bigger area-wise but of a different shape). This is because, when comparing different shapes for size, it's under-defined whether we look at "diameter" or "area" (e.g. compare square and cross: the former wins in area, the latter in diameter), which is a bit messy.

In short, as far as I can tell, the only sentences this line prevents (after my change) are a few generic sentences which are better avoided, but obviously you can try it out and see what happens.
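
For anyone reading along, the conservative three-valued definition of "bigger" described above could be sketched roughly as follows. This is not ShapeWorld's implementation: Obj, is_bigger and the margin value are hypothetical. Reading "true" as "same shape and the entity's area exceeds the reference's by the margin" is an assumption about what the description means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    """Hypothetical object with a shape name and an area."""
    shape: str
    area: float


def is_bigger(entity: Obj, reference: Obj, margin: float = 0.01) -> Optional[bool]:
    """Three-valued check of "entity is bigger than reference".

    True:  same shape and clearly larger in area (assumed reading).
    False: clearly smaller in area, regardless of shape.
    None:  undefined, e.g. larger in area but of a different shape.
    """
    diff = entity.area - reference.area
    if entity.shape == reference.shape and diff > margin:
        return True
    if -diff > margin:
        return False
    return None
```

With this reading, a large square compared to a small cross comes out as undefined rather than true, which is why generic sentences like "The biggest shape is a green shape." are awkward to evaluate.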

"Hopefully, I will relate to this soon enough :)"

Good luck! :-)

furkanbiten commented 4 years ago

Sorry for the late reply.

Thanks for the explanations and the nice wish :)