facebookresearch / XLM

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Add memory to transformer #340

Open Arij-Aladel opened 3 years ago

Arij-Aladel commented 3 years ago

@louismartin Hello! Thanks for sharing this work. I am working on a new transformer architecture and I want to try PKM, so after reading the instructions and the repo I have a simple question. Regarding "mem_enc_positions" and "mem_dec_positions", the README explains:

"To add a memory in (for instance) the layers 4 and 7 of an encoder, you can simply provide --use_memory true --mem_enc_positions 4,7"

but in this line of the code we see:

for layer_id, pos in mem_positions:

According to that, I understand that each entry of mem_positions should be a tuple (int, str) = (layer_id, pos), and not a single value. According to the explanation above, 4 and 7 are layer indices only, so I was wondering what the correct syntax for the "mem_enc_positions" and "mem_dec_positions" parameters is.
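
To make my confusion concrete, here is a minimal sketch of the two readings I have in mind (the "pos" string values in the second part are my own placeholders, not taken from the repo):

```python
# Sketch of my understanding only, not the actual XLM code.

# Reading 1: the README suggests a plain comma-separated list of layer indices.
readme_style = "4,7"
layer_ids = [int(x) for x in readme_style.split(",")]  # -> [4, 7]

# Reading 2: the loop `for layer_id, pos in mem_positions:` expects pairs,
# i.e. something like (layer_id, pos) where `pos` is a second value (a string?).
# The values "in" / "after" below are purely hypothetical placeholders:
mem_positions = [(4, "in"), (7, "after")]
for layer_id, pos in mem_positions:
    print(layer_id, pos)
```

So should the flag be `--mem_enc_positions 4,7` as in the README, or does each element also need a position suffix so that the unpacking into (layer_id, pos) works?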