facebookresearch / nle

The NetHack Learning Environment

Glyph ID handling #21

Open heiner opened 4 years ago

heiner commented 4 years ago

Currently, each dungeon tile (ignoring the char/color/specials observation that's also available) is an int16 between 0 and nethack.MAX_GLYPH == 5976. We use an embedding lookup table of that size with embedding_dim == 32. That's 5976 * 32 == 191232 floating-point parameters, or 191232 * 16 == 3059712 bits, or ~0.38MB. That doesn't seem like too much, but there are some issues with the embedding itself. Also, it does not give the agent a cue that certain ids (e.g., dog and large dog) are more related than others (large dog vs. wall).
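As a quick check of the arithmetic above, here is a minimal sketch that recomputes the table size from the quoted numbers. The helper name `embedding_table_bytes` is made up for illustration, and the 32-bit row is added only for comparison, on the assumption that the table might be stored in single precision instead.

```python
# Numbers quoted in the comment above.
MAX_GLYPH = 5976      # size of the glyph id space
EMBEDDING_DIM = 32    # width of each embedding vector


def embedding_table_bytes(num_rows: int, dim: int, bits_per_param: int) -> int:
    """Storage for a dense (num_rows x dim) table, in bytes (hypothetical helper)."""
    return num_rows * dim * bits_per_param // 8


num_params = MAX_GLYPH * EMBEDDING_DIM

# 16 bits per parameter is what the comment assumes; 32 bits is shown for comparison.
for bits in (16, 32):
    size_mb = embedding_table_bytes(MAX_GLYPH, EMBEDDING_DIM, bits) / 1e6
    print(f"{num_params} parameters at {bits} bits each: {size_mb:.2f} MB")
```

Either way the table is well under 1 MB, so the concern is less about memory than about how poorly the id space is structured.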

The way these glyphs are organized is that first come all the monsters (NUMMONS many, which is 381), then pets (again NUMMONS many, because in theory every monster can be tame), then a single glyph for an invisible monster (GLYPH_INVIS_OFF, which is 762), then a glyph for each "detected" monster (again NUMMONS many). For some obscure reason, then there are corpses, which are not monsters (but there are NUMMONS many), and then there are ridden monsters, which are monsters (NUMMONS many). The check glyph_is_monster(glyph) does this:

#define glyph_is_monster(glyph)                            \
    (glyph_is_normal_monster(glyph) || glyph_is_pet(glyph) \
     || glyph_is_ridden_monster(glyph) || glyph_is_detected_monster(glyph))

This makes a list like [i for i in range(nethack.MAX_GLYPH) if nethack.glyph_is_monster(i)] have length nethack.NUMMONS*4 == 1524, but it's not contiguous.
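The count and the non-contiguity can be reproduced without NLE installed. The sketch below is a pure-Python stand-in for the C macro, not the NLE binding itself: `glyph_is_monster_py` is a made-up name, and the offsets are derived from the ordering described above (only NUMMONS == 381, GLYPH_INVIS_OFF == 762 and MAX_GLYPH == 5976 are quoted directly from this issue).

```python
NUMMONS = 381
MAX_GLYPH = 5976

# Offsets implied by the ordering described above: monsters, pets, one
# invisible-monster glyph, detected monsters, corpses, ridden monsters.
GLYPH_MON_OFF = 0
GLYPH_PET_OFF = GLYPH_MON_OFF + NUMMONS
GLYPH_INVIS_OFF = GLYPH_PET_OFF + NUMMONS     # 762, as quoted above
GLYPH_DETECT_OFF = GLYPH_INVIS_OFF + 1
GLYPH_BODY_OFF = GLYPH_DETECT_OFF + NUMMONS   # corpses
GLYPH_RIDDEN_OFF = GLYPH_BODY_OFF + NUMMONS
GLYPH_OBJ_OFF = GLYPH_RIDDEN_OFF + NUMMONS    # first non-monster group after ridden


def glyph_is_monster_py(glyph: int) -> bool:
    """Pure-Python stand-in for the glyph_is_monster macro (illustrative only)."""
    return (
        GLYPH_MON_OFF <= glyph < GLYPH_INVIS_OFF        # normal monsters and pets
        or GLYPH_DETECT_OFF <= glyph < GLYPH_BODY_OFF   # detected monsters
        or GLYPH_RIDDEN_OFF <= glyph < GLYPH_OBJ_OFF    # ridden monsters
    )


monster_glyphs = [g for g in range(MAX_GLYPH) if glyph_is_monster_py(g)]

# Pairs of consecutive monster glyph ids that are not adjacent integers.
gaps = [(a, b) for a, b in zip(monster_glyphs, monster_glyphs[1:]) if b != a + 1]

print(len(monster_glyphs))  # NUMMONS * 4
print(gaps)                 # one gap at the invisible glyph, one across the corpses
```

Under these assumptions the monster ids fall into three separate runs, split by the invisible-monster glyph and by the corpse block.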

Cf. https://github.com/fairinternal/NetHack/blob/rl/win/rl/helper.cc#L37 for a list of the offsets and take a look at the comment in https://github.com/fairinternal/NetHack/blob/rl/include/display.h#L235 explaining this.

After the monster-related groups and the objects come the cmap entries for dungeon features (MAXPCHARS == 96), then explosions, then zap beams (NUM_ZAP << 2 == 8 << 2 == 32 many). Then there are NUMMONS << 3 == 3048 (!) "swallow" glyphs. That's a lot for stuff that basically never happens to our agents. Then there are WARNCOUNT == 6 warning glyphs and finally NUMMONS statue glyphs.

As a graphic representation, the glyph ids are:

MMMMMMPPPPPPDDDDDD%%%%%RRRRRROOOOOOOCXZSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSTTTTTT
MonstsPets--DetectBody-RiddenObjectsCXZSwaaaaaaalllllllllllooooooooooowwww-----------Statue

Where

glyph_labels = {
    GLYPH_MON_OFF: "M",  # 6.38%
    GLYPH_PET_OFF: "P",  # 6.38%
    GLYPH_INVIS_OFF: " ",  # 0.02%
    GLYPH_DETECT_OFF: "D",  # 6.38%
    GLYPH_BODY_OFF: "%",  # 6.38%
    GLYPH_RIDDEN_OFF: "R",  # 6.38%
    GLYPH_OBJ_OFF: "O",  # 7.58%
    GLYPH_CMAP_OFF: "C",  # 1.46%
    GLYPH_EXPLODE_OFF: "X",  # 1.05%
    GLYPH_ZAP_OFF: "Z",  # 0.54%
    GLYPH_SWALLOW_OFF: "S",  # 51.00%
    GLYPH_WARNING_OFF: "W",  # 0.10%
    GLYPH_STATUE_OFF: "T",  # 6.38%
    MAX_GLYPH: "-",
}

More than half of all glyph ids are swallow!
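The percentages in the table can be recomputed from the group sizes. In this sketch the sizes for objects (453), cmap (87) and explosions (63) are assumptions: they are not stated in this issue and were chosen because they match the listed percentages and make the total come out at MAX_GLYPH == 5976. The other sizes follow from the numbers quoted above.

```python
NUMMONS = 381

# Number of glyph ids per group, keyed by the labels used in the table above.
# ASSUMPTION: 453 (objects), 87 (cmap) and 63 (explosions) are inferred from
# the listed percentages, not quoted from the issue.
group_sizes = {
    "M": NUMMONS,       # monsters
    "P": NUMMONS,       # pets
    " ": 1,             # invisible monster
    "D": NUMMONS,       # detected monsters
    "%": NUMMONS,       # corpses
    "R": NUMMONS,       # ridden monsters
    "O": 453,           # objects (assumed)
    "C": 87,            # cmap / dungeon features (assumed)
    "X": 63,            # explosions (assumed)
    "Z": 8 << 2,        # zap beams, NUM_ZAP << 2
    "S": NUMMONS << 3,  # swallow
    "W": 6,             # warnings, WARNCOUNT
    "T": NUMMONS,       # statues
}

total = sum(group_sizes.values())
shares = {label: 100 * size / total for label, size in group_sizes.items()}

for label, share in shares.items():
    print(f"{label!r}: {group_sizes[label]:5d} ids, {share:5.2f}%")
print("total:", total)
```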

We should rethink the featurization of the glyph ids.
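One possible direction, offered only as a sketch and not as what the NLE authors ended up doing, is to split each glyph id into a (group, index within group) pair and embed the two parts separately. A monster, its tame version and its statue then share the same index and differ only in the group. The function name `split_glyph` is made up, the sizes 453/87/63 are assumed as in the percentages above, and the swallow layout (eight consecutive glyphs per monster) is an assumption based on the `NUMMONS << 3` count.

```python
from bisect import bisect_right

NUMMONS = 381

# (group name, number of glyph ids), in glyph-id order.
# ASSUMPTION: the sizes 453, 87 and 63 are inferred, not quoted from the issue.
GROUP_SIZES = [
    ("monster", NUMMONS), ("pet", NUMMONS), ("invisible", 1),
    ("detected", NUMMONS), ("corpse", NUMMONS), ("ridden", NUMMONS),
    ("object", 453), ("cmap", 87), ("explosion", 63), ("zap", 32),
    ("swallow", NUMMONS << 3), ("warning", 6), ("statue", NUMMONS),
]

GROUP_NAMES = [name for name, _ in GROUP_SIZES]
GROUP_STARTS = []
_next_start = 0
for _, size in GROUP_SIZES:
    GROUP_STARTS.append(_next_start)
    _next_start += size
MAX_GLYPH = _next_start  # 5976 under the sizes above


def split_glyph(glyph: int) -> tuple[int, int]:
    """Map a glyph id to (group index, index within the group)."""
    if not 0 <= glyph < MAX_GLYPH:
        raise ValueError(f"glyph id out of range: {glyph}")
    group = bisect_right(GROUP_STARTS, glyph) - 1
    index = glyph - GROUP_STARTS[group]
    if GROUP_NAMES[group] == "swallow":
        # ASSUMPTION: 8 consecutive swallow glyphs per monster, so dropping the
        # low 3 bits recovers the monster index.
        index >>= 3
    return group, index


# The same monster index shows up in several groups:
for glyph in (5, NUMMONS + 5, MAX_GLYPH - NUMMONS + 5):
    group, index = split_glyph(glyph)
    print(glyph, GROUP_NAMES[group], index)
```

With this split, the monster-related groups could share one table of NUMMONS rows plus a 13-row group table, instead of 5976 independent rows. It does not by itself tell the agent that a dog and a large dog are related; that would need extra per-monster features such as the monster class.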

aleSuglia commented 3 years ago

Hey @heiner, I'm changing the original agent implementation and I was thinking of using a different embedding representation. Is this issue still valid? I saw that the PyTorch people closed the issue on their side.

heiner commented 3 years ago

The PyTorch issue has to do with the speed of embeddings and is more of an aside.

This issue describes the fact that glyph ids are not necessarily great for ML (e.g., over half of all glyph ids are of type "swallow", 99% of which will never show up in the actual game).

We are experimenting with ways to preprocess these glyphs in our agent code.

aleSuglia commented 3 years ago

Thanks for clarifying this. So I assume this doesn't have an effect on the NeurIPS code release, right?

dmadeka commented 3 years ago

Is there a mapping between the glyph id and the monster name? It would help to featurize attributes of the monster.

heiner commented 3 years ago

@dmadeka: Take a look at https://github.com/facebookresearch/nle/blob/master/nle/tests/test_nethack.py#L185

dmadeka commented 3 years ago

Amazing - thank you!

Is there a doc for what's exposed through the FFI? (I'm guessing you all are using pybind11.)

heiner commented 3 years ago

Not documentation per se but the actual source code shouldn't be too hard to read: https://github.com/facebookresearch/nle/blob/master/win/rl/pynethack.cc#L502

dmadeka commented 3 years ago

No, it isn't too hard to read! Thanks!