adumit / memory-augmented-transformers

Repo for testing an integration of Memorizing Transformers into pretrained models

Have you tested the code? #1

Open yunx-z opened 2 years ago

yunx-z commented 2 years ago

Hello,

Thanks for the nice implementation. Have you run the code on any datasets and gotten interesting results? And have you discovered any bugs in the current version? Thanks!

yunx-z commented 2 years ago

Also, if we want to compare the memory-augmented GPT with a non-memory one, can we simply set finetune_mems=False? https://github.com/adumit/memory-augmented-transformers/blob/master/memory_augmented_transformers/run_experiment.py#L27

adumit commented 2 years ago

Hi, thanks for your interest! The code in the repo is a bit behind what I've been experimenting with, but I'm currently writing up my results, and there have been some interesting findings!

To your latter question, I would suggest instead passing an empty tuple for the layers_to_insert_memories argument. That will give you a better comparison. Setting finetune_mems to False will simply stop the weights associated with the memories from being updated, but the model will still use the memories.
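Roughly, and assuming run_experiment can be called with these keyword arguments directly (the exact signature and layer indices below are illustrative; check run_experiment.py for the real interface), the comparison would look something like this:

```python
# Sketch of the suggested comparison (hypothetical call signature).

# Baseline: no memory layers at all -> behaves like the plain pretrained model.
run_experiment(
    layers_to_insert_memories=(),  # empty tuple: no kNN memories inserted
    finetune_mems=False,
)

# Memory-augmented: insert kNN memories at chosen layers and fine-tune their weights.
run_experiment(
    layers_to_insert_memories=(4, 5),  # hypothetical layer indices
    finetune_mems=True,
)

# Note: finetune_mems=False combined with non-empty layers_to_insert_memories
# still *uses* the memories at run time; it only freezes the memory-related weights,
# so it is not a true no-memory baseline.
```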

yunx-z commented 2 years ago

Another quick question: does this implementation also include Transformer-XL memories (in addition to the kNN memories)?