Hi,
Thanks for the feedback, glad it's proving useful for your research. And sorry for the essentially missing code documentation... :-/
Good question, this is indeed not possible in a super-straightforward way. The cleanest solution is to stitch together a new Captioner class, which would do something like the RegularTypeCaptioner and RegularAttributeCaptioner. If you don't need this part of the sentence to be made incorrect (say, for incorrect statements you always want the relation between two objects to be incorrect), it can be simplified to look more like the EmptyTypeCaptioner. After implementing this, you can just swap the RegularTypeCaptioners in the RelationalDataset with this captioner. Some relations won't be fully compatible with this, however (like "same shape as", since if both entities are referred to by shape, this is a trivial statement).
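To make the idea concrete, here is a rough standalone sketch in plain Python of what such a combined captioner should produce when plugged into the relational dataset. This is deliberately *not* the actual captioner interface -- the names and the toy world representation are made up for illustration only:

```python
import random

# Toy stand-ins for ShapeWorld entities: each entity has a shape, a color and a position.
entities = [
    {'shape': 'square', 'color': 'red', 'x': 0.2},
    {'shape': 'circle', 'color': 'blue', 'x': 0.7},
]

def describe(entity, attribute):
    """Refer to an entity by a single attribute, as the new captioner should."""
    if attribute == 'shape':
        return entity['shape']              # e.g. "square"
    else:
        return entity['color'] + ' shape'   # e.g. "red shape"

# Pick one attribute per entity (this choice is what the new captioner would sample).
attr1 = random.choice(['shape', 'color'])
attr2 = random.choice(['shape', 'color'])

# Spatial relation between the two entities, evaluated against the toy world.
agreement = entities[0]['x'] < entities[1]['x']
caption = 'A {} is to the left of a {}.'.format(
    describe(entities[0], attr1), describe(entities[1], attr2))

print(caption, agreement)
```

The incompatibility with relations like "same shape as" then comes from the case where both entities happen to be described by their shape only, which makes the statement trivially true.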
Does that help? If you're not sure about creating your own captioner, I can try to put it together from the existing captioner code; however, I would ideally leave the testing to you, since I don't have all this installed anymore (I finished my PhD a year ago).
Hi @AlexKuhnle,
Thanks for the quick reply! If it is not too much effort for you to create the new captioner, it would be great if you could put it together.
By the way, only after digging deeper into the ShapeWorld codebase did I realize how complex such a synthetic VQA dataset generator can be, particularly because you need to maintain the logical truth of the question-answer pairs while keeping the questions interesting. It is easy to underappreciate such a job.
Okay, I will give it a go.
Yes, I certainly underestimated the work when I started. :-) It made me appreciate how many different ways flaws can accidentally be introduced into even a synthetic, abstract dataset. If you're interested in reading more about this in the context of ShapeWorld, have a look at chapter 4 (and section 5.3) of my thesis.
Thanks. That'll be cool.
Indeed, I had to read chapter 4 of your thesis while trying to understand roughly what is going on in the code (e.g., to check whether any biases are introduced in the dataset generation process). Without the chapter it would have been difficult.
Ah, alright... somewhat hidden code documentation, admittedly. :-)
Have a look at this branch. I added a SingleAttributeTypeCaptioner and a RelationalSingleAttributeDataset, which hopefully do what you want -- see the type_single_attribute argument of the dataset. I haven't tested this so far, but hopefully it works or only contains easy-to-fix typos. Note that the set of relations is restricted to spatial relations (is that all you're interested in?), because I think using only one attribute won't be compatible with the other relations.
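Roughly, I would expect it to be used like the other datasets, along these lines. This is an untested sketch: the registration name of the dataset and the exact form of the type_single_attribute argument are guesses on my part, so please check the branch:

```python
from shapeworld import Dataset

# Untested sketch: the dataset name and the argument's type are assumptions,
# check how RelationalSingleAttributeDataset is actually registered in the branch.
dataset = Dataset.create(
    dtype='agreement',
    name='relational_single_attribute',  # assumed registration name
    type_single_attribute=True           # assumed flag: refer to entities by a single attribute
)

generated = dataset.generate(n=128, mode='train', include_model=True)
print(generated['caption'][0], generated['agreement'][0])
```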
Thanks for adding the feature so quickly. There was only one missing argument that I had to provide (see my pull request https://github.com/AlexKuhnle/ShapeWorld/pull/21#issue-481181450).
Yes, I am interested in spatial relations, as my current research focuses on learning spatial relations. As soon as I have a paper out of it, I can share it with you (and will mention you in the acknowledgements).
First of all, thanks for this nice package! I am currently actively using it as a test bed for visual reasoning models. One question I have at the moment is how to control which predicates are included in the captions. For example, I'd like to turn off the color predicate when doing relational reasoning. A hacky solution that I am using now is to replace https://github.com/AlexKuhnle/ShapeWorld/blob/3bfcfab5e745313c83720fd977c1212557c62996/shapeworld/captioners/captioner.py#L54 and then to make a corresponding change in shapeworld/captioners/relation.py.
But I assume there is a better way to solve this problem, e.g., by providing a child class of CaptionAgreementDataset. At the moment I don't know how, though (I have already spent quite a bit of time trying to make sense of the code...).
Could you give me some tips for this problem?