Closed hughperkins closed 6 years ago
Oh, because I didn't read the docs: see the bottom of the section at https://github.com/AlexKuhnle/ShapeWorld#integration-into-python-code . You have to set alternatives=True (for anyone else who arrives here).
Hey @hughperkins, glad you like the project! Yes, that's right. One warning: I haven't worked much with the multiple-images-per-description mode. It should produce interesting instances but, if I remember correctly, performance of the (so far unoptimized) generation process in this case may not be very good, depending on the dataset choice.
Awesome, thanks!
Question: how to restrict the prepositions used? I tried:
dataset = Dataset.create(
    dtype='agreement', name='relational', worlds_per_instance=6, negation=False,
    relations=('to the right of', 'below', 'above', 'to the left of')
)
... but that didn't seem to be the right approach?
Definitely requires a/more documentation... :-)
You can find valid values in, e.g., the language file, which should give you an idea of how to specify the restriction argument.
Awesome, thanks! :) The method at https://github.com/AlexKuhnle/ShapeWorld/blob/master/configs/agreement/relational/spatial_twoshapes.json#L4 does exactly what I need :)
Did you delete some comments here? :-) Regarding your one question: You were about right regarding the reason for slow performance, the implementation so far works in a way that favors multiple captions per image, but is less efficient for multiple images per caption, and I haven't implemented an efficient option for the latter alternative (quite a bit of work). 'existential' should be fast either way. Regarding the other question: It's currently probabilistic, so no way to guarantee specific correct/incorrect numbers, but that would not be too difficult to integrate.
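For readers who need fixed numbers of correct and incorrect worlds in the meantime, one generic workaround is to keep drawing from a probabilistic generator until both quotas are filled. The sketch below is not ShapeWorld code: `toy_instance` is a hypothetical stand-in for any generator that returns a world together with its agreement value, and `sample_with_quota` is an illustrative helper name.

```python
import random


def toy_instance(rng, p_correct=0.5):
    """Hypothetical stand-in for a probabilistic generator.

    Returns (world_id, agrees), where agrees is True with probability p_correct.
    """
    return rng.randrange(10**6), rng.random() < p_correct


def sample_with_quota(rng, num_correct, num_incorrect, max_tries=10_000):
    """Draw instances until both quotas are filled, discarding surplus draws."""
    correct, incorrect = [], []
    for _ in range(max_tries):
        world, agrees = toy_instance(rng)
        bucket, quota = (correct, num_correct) if agrees else (incorrect, num_incorrect)
        if len(bucket) < quota:
            bucket.append(world)
        if len(correct) == num_correct and len(incorrect) == num_incorrect:
            return correct, incorrect
    raise RuntimeError("quota not reached within max_tries")


correct, incorrect = sample_with_quota(random.Random(0), num_correct=3, num_incorrect=3)
print(len(correct), len(incorrect))
```

The cost is wasted draws: the more skewed the generator's correct/incorrect ratio, the more instances are thrown away before both quotas are met.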
Did you delete some comments here? :-)
Haha, might have :)
Regarding your one question: You were about right regarding the reason for slow performance, the implementation so far works in a way that favors multiple captions per image, but is less efficient for multiple images per caption, and I haven't implemented an efficient option for the latter alternative (quite a bit of work). 'existential' should be fast either way. Regarding the other question: It's currently probabilistic, so no way to guarantee specific correct/incorrect numbers, but that would not be too difficult to integrate.
Ok, sounds good. I've found a way to work around these issues. So, I have everything I need, and just need to try to come up with some novel approach to learning the dataset now :)
Good to hear it all works now. If you're happy to share what you're working on, results or so, at some point, feel free to do so (via mail) -- I'm curious. :-)
Hi, awesome project :)
Looking around the examples, e.g. https://rawgit.com/AlexKuhnle/ShapeWorld/master/examples/agreement/relational-full/data.html , it looks like the way it works is that one image is generated, and then multiple descriptions are created for this image?
Is there any way to do the opposite, i.e. sample one description, and then draw multiple examples that match that description? (and also, ideally, some examples that are guaranteed not to match the description)
Edit: I tried using worlds_per_instance, but that didn't seem to be it?
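To make the request concrete, here is a toy sketch of the "one description, several worlds" setup. It does not use ShapeWorld at all: the world representation, `sample_world`, `left_of`, and `worlds_for_caption` are all hypothetical names invented for illustration, and `num_worlds=6` merely mirrors the worlds_per_instance=6 value tried in this thread.

```python
import random

SHAPES = ("square", "circle", "triangle")


def sample_world(rng, num_shapes=2):
    # A toy world: a list of (shape_name, x_position) pairs.
    return [(rng.choice(SHAPES), rng.random()) for _ in range(num_shapes)]


def left_of(world, first, second):
    # True if some `first` shape lies to the left of some `second` shape.
    return any(
        a == first and b == second and xa < xb
        for a, xa in world
        for b, xb in world
    )


def worlds_for_caption(rng, first, second, num_worlds=6):
    # Fix one caption ("a <first> is to the left of a <second>"), then draw
    # several worlds and record whether each one agrees with it.
    worlds = [sample_world(rng) for _ in range(num_worlds)]
    return [(world, left_of(world, first, second)) for world in worlds]


instances = worlds_for_caption(random.Random(0), "square", "circle")
for world, agreement in instances:
    print(agreement, world)
```

Each instance pairs the same caption with a different world and an agreement value, which is the reverse of the one-image, many-captions layout in the linked examples.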