facebookresearch / EGG

EGG: Emergence of lanGuage in Games
MIT License

Make a standard <n_attributes x n_values> dataset #63

Closed eugene-kharitonov closed 4 years ago

eugene-kharitonov commented 4 years ago

Datasets representing the Cartesian product of n_attributes attributes, each taking one of n_values values, are used quite often. We could provide a general implementation.

eugene-kharitonov commented 4 years ago

For starters, it can be as simple as itertools.product(*(range(n_values) for _ in range(n_attributes))), or equivalently itertools.product(range(n_values), repeat=n_attributes).
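To make the suggestion above concrete, here is a minimal sketch of such an enumeration. The function name enumerate_attribute_value_dataset is hypothetical and is not part of EGG; only itertools.product comes from the comment itself.

```python
import itertools


def enumerate_attribute_value_dataset(n_attributes, n_values):
    """Return every combination of attribute values as a list of tuples.

    Hypothetical helper for illustration; not an EGG API.
    Each tuple has n_attributes entries, each in range(n_values).
    """
    return list(itertools.product(range(n_values), repeat=n_attributes))


# 2 attributes with 3 values each give 3 ** 2 = 9 samples.
dataset = enumerate_attribute_value_dataset(n_attributes=2, n_values=3)
print(len(dataset))   # 9
print(dataset[0])     # (0, 0)
print(dataset[-1])    # (2, 2)
```

The full product grows as n_values ** n_attributes, so for larger settings it may be preferable to iterate lazily over itertools.product instead of materialising the list.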

tomekkorbak commented 4 years ago

@eugene-kharitonov out of curiosity, what is wrong with egg.zoo.objects_game.features.VectorsLoader? Do you see any improvements that can be made? Or is it too complex? https://github.com/facebookresearch/EGG/blob/f6660df861c5f5999dc93a5cb54bed87d3d4810e/egg/zoo/objects_game/features.py#L15

robertodessi commented 4 years ago

It could be improved. For instance, VectorsLoader doesn't use binary/one-hot vectors, which could be better: with a one-hot encoding, no pair of values of the same attribute is farther apart than any other pair. I was planning to refactor the code and add some features, like probabilities of values for the attributes (https://github.com/robertodessi/EGG/blob/fix/egg/zoo/objects_game/features.py#L80), but it wasn't at the top of my to-do list.
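The point about distances can be illustrated with a small sketch. It assumes each sample is a tuple of integer attribute values; the helpers one_hot_encode and hamming_distance are hypothetical names for illustration and do not reflect how VectorsLoader is implemented.

```python
def one_hot_encode(sample, n_values):
    """Concatenate one one-hot block of length n_values per attribute.

    Hypothetical helper for illustration; not an EGG API.
    """
    vector = [0] * (len(sample) * n_values)
    for attribute_index, value in enumerate(sample):
        vector[attribute_index * n_values + value] = 1
    return vector


def hamming_distance(a, b):
    """Count the positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


encoded = one_hot_encode((0, 2), n_values=3)
print(encoded)  # [1, 0, 0, 0, 0, 1]

# With integer codes, value 0 is "closer" to 1 than to 2.
# With one-hot codes, any two distinct values are equally far apart.
near = hamming_distance(one_hot_encode((0,), 3), one_hot_encode((1,), 3))
far = hamming_distance(one_hot_encode((0,), 3), one_hot_encode((2,), 3))
print(near, far)  # 2 2
```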

eugene-kharitonov commented 4 years ago

Hello @tomekkorbak @robertodessi ,

My intention was to have something standardised in a dedicated place and to use it in all games that require this kind of input. I didn't have any issues with this piece of code in mind.

Now that I look at it, though, I think we could use something more general that doesn't assume a discrimination game with distractors. (If needed, distractors can be added on top.)
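One way to read "distractors can be added on top" is to keep the base dataset game-agnostic and wrap it in a separate step. The sketch below is only an assumption about how that layering might look; full_dataset and add_distractors are hypothetical names, and uniform sampling of distractors is an arbitrary choice made for the example.

```python
import itertools
import random


def full_dataset(n_attributes, n_values):
    """Game-agnostic base dataset: all attribute-value combinations."""
    return list(itertools.product(range(n_values), repeat=n_attributes))


def add_distractors(dataset, n_distractors, seed=0):
    """Pair each target with distinct distractors drawn from the same dataset.

    Hypothetical helper for illustration; not an EGG API.
    Distractors are sampled uniformly without replacement and never
    include the target itself.
    """
    if n_distractors > len(dataset) - 1:
        raise ValueError("not enough samples to draw that many distractors")
    rng = random.Random(seed)
    examples = []
    for target in dataset:
        candidates = [item for item in dataset if item != target]
        examples.append((target, rng.sample(candidates, n_distractors)))
    return examples


base = full_dataset(n_attributes=2, n_values=3)
examples = add_distractors(base, n_distractors=2)
print(len(examples))  # 9
```

With this split, games that do not need distractors can consume the base dataset directly, while discrimination games apply the wrapper.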