AlexKuhnle / ShapeWorld


Controlling what predicates to include in the captions #20

Closed dschaehi closed 4 years ago

dschaehi commented 4 years ago

First of all, thanks for this nice package! I am currently using it as a test bed for visual reasoning models. One question I have at the moment is how to control which predicates are included in the captions. For example, I'd like to turn off the color predicate when doing relational reasoning. A hacky solution that I am using now is replacing https://github.com/AlexKuhnle/ShapeWorld/blob/3bfcfab5e745313c83720fd977c1212557c62996/shapeworld/captioners/captioner.py#L54 with

return self.sample_values(mode=mode, predication=LogicalPredication(blocked_preds=["color"]))

and then, in shapeworld/captioners/relation.py, replacing

ref_predication = predication.copy(reset=True)

with

ref_predication = predication.copy(reset=False)

But I assume there is a better way to solve this problem, e.g., by providing a child class of CaptionAgreementDataset. At the moment I don't know how, though (I have already spent quite a bit of time trying to make sense of the code...)
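To make the intent of the workaround concrete, here is a minimal toy sketch of the idea. It does not use ShapeWorld: `ToyPredication`, `allowed`, and `sample_attribute` are hypothetical names, and the behaviour of `copy(reset=True)` (forgetting the blocked predicates) is only an assumption inferred from the change described above, not a statement about how `LogicalPredication` is implemented.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyPredication:
    """Toy stand-in for a predication object; NOT ShapeWorld's LogicalPredication."""
    blocked_preds: set = field(default_factory=set)

    def copy(self, reset=False):
        # Assumption: reset=True starts from a blank state and therefore
        # forgets which predicates were blocked; reset=False keeps them.
        if reset:
            return ToyPredication()
        return ToyPredication(blocked_preds=set(self.blocked_preds))

    def allowed(self, candidates):
        # Keep only the candidate predicates that are not blocked.
        return [p for p in candidates if p not in self.blocked_preds]


def sample_attribute(predication, candidates=("shape", "color"), rng=None):
    """Pick one attribute predicate at random among the non-blocked ones."""
    if rng is None:
        rng = random.Random(0)
    return rng.choice(predication.allowed(candidates))


pred = ToyPredication(blocked_preds={"color"})
kept = pred.copy(reset=False)     # the block on "color" survives the copy
dropped = pred.copy(reset=True)   # the block on "color" is lost

print(kept.allowed(("shape", "color")))
print(dropped.allowed(("shape", "color")))
```

In this toy model, a sub-captioner that receives the `reset=True` copy could pick "color" again, which is presumably why the second change was needed for the workaround to take effect in the referring expressions.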

Could you give me some tips for this problem?

AlexKuhnle commented 4 years ago

Hi,

Thanks for the feedback, glad it's proving useful for your research. And sorry for the essentially missing code documentation... :-/

Good question, this is indeed not possible in a super-straightforward way. The cleanest solution is to stitch together a new Captioner class, which would do something like the RegularTypeCaptioner and RegularAttributeCaptioner. If you don't need this part of the sentence to be made incorrect (say, for incorrect statements you always want the relation between two objects to be incorrect), it can be simplified to look more like the EmptyTypeCaptioner. After implementing this, you can just swap the RegularTypeCaptioners in the RelationalDataset with this captioner. Some relations won't be fully compatible with this, however (like "same shape as", since if both entities are referred to by shape, this is a trivial statement).
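The compatibility caveat can be illustrated with a small self-contained sketch. It is not based on ShapeWorld's captioner classes: `Entity`, `describe`, `is_trivial`, and `caption` are hypothetical helpers, and the caption wording is made up for the example. The point is only that a comparison over the same attribute that is used to refer to both entities can be decided from the text alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    shape: str
    color: str


def describe(entity, attribute):
    """Refer to an entity by exactly one attribute ("shape" or "color")."""
    value = getattr(entity, attribute)
    return f"a {value}" if attribute == "shape" else f"a {value} shape"


def is_trivial(comparison_attribute, referring_attribute):
    # If both entities are named by the attribute being compared, the truth
    # of the caption follows from its wording, without looking at the image.
    return comparison_attribute == referring_attribute


def caption(a, b, comparison_attribute, referring_attribute):
    """Build a 'same <attribute> as' caption, rejecting trivial combinations."""
    if is_trivial(comparison_attribute, referring_attribute):
        raise ValueError("comparison would be trivial for this referring attribute")
    text = (f"{describe(a, referring_attribute)} is the same "
            f"{comparison_attribute} as {describe(b, referring_attribute)}.")
    return text[0].upper() + text[1:]


square = Entity(shape="square", color="red")
circle = Entity(shape="circle", color="red")

print(caption(square, circle, "color", "shape"))
```

Calling `caption(square, circle, "shape", "shape")` raises `ValueError` in this sketch, mirroring the "same shape as" case mentioned above.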

Does that help? If you're not sure about creating your own captioner, I can try to put it together from the existing captioner code. However, I would ideally leave the testing to you, since I don't have all this installed anymore (I finished my PhD a year ago).

dschaehi commented 4 years ago

Hi @AlexKuhnle ,

Thanks for the quick reply! If it is not too much effort for you to create the new captioner, then it would be great if you could put it together.

By the way, only after digging deeper into the codebase of ShapeWorld I realized how complex such a synthetic VQA dataset generator can be, particularly because you need to maintain the logical truths of the question-answer pairs while making the questions interesting. It is easy to underappreciate such a job.

AlexKuhnle commented 4 years ago

Okay, I will give it a go.

Yes, I certainly underestimated the work when I started. :-) It made me appreciate in how many different ways flaws may be accidentally introduced into even a synthetic abstract dataset. If you're interested in reading more about this in the context of ShapeWorld, have a look at chapter 4 (and 5.3) of my thesis.

dschaehi commented 4 years ago

Thanks. That'll be cool.

Indeed I had to read chapter 4 of your thesis while trying to understand roughly what is going on in the code (e.g., to check whether any biases are introduced in the dataset generation process). Without the chapter it would have been difficult.

AlexKuhnle commented 4 years ago

Ah, alright... somewhat hidden code documentation, admittedly. :-)

AlexKuhnle commented 4 years ago

Have a look at this branch. I added a SingleAttributeTypeCaptioner and a RelationalSingleAttributeDataset, which hopefully do what you want -- see the type_single_attribute argument of the dataset. I haven't tested this so far, but hopefully it works or only contains easy-to-fix typos. Note that the set of relations is restricted to spatial relations (is that all you're interested in?), because I think using only one attribute won't be compatible with the other relations.

dschaehi commented 4 years ago

Thanks for adding the feature so quickly. There was only one missing argument that I had to provide (see my pull request https://github.com/AlexKuhnle/ShapeWorld/pull/21#issue-481181450).

Yes, I am interested in spatial relations, as my current research is focused on learning them. As soon as I have a paper out of it, I can share it with you (and will include you in the acknowledgements).