LAMA is a probe for analyzing the factual and commonsense knowledge contained in pretrained language models.
LAMA contains a set of connectors to pretrained language models.
LAMA exposes a transparent and unique interface to use: Transformer-XL (Dai et al., 2019), BERT (Devlin et al., 2018), ELMo (Peters et al., 2018), GPT (Radford et al., 2018), and RoBERTa (Liu et al., 2019).
Actually, LAMA is also a beautiful animal.
The LAMA probe is described in the following papers:
@inproceedings{petroni2019language,
title={Language Models as Knowledge Bases?},
author={F. Petroni and T. Rockt{\"{a}}schel and A. H. Miller and P. Lewis and A. Bakhtin and Y. Wu and S. Riedel},
booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year={2019}
}
@inproceedings{petroni2020how,
title={How Context Affects Language Models' Factual Predictions},
author={Fabio Petroni and Patrick Lewis and Aleksandra Piktus and Tim Rockt{\"a}schel and Yuxiang Wu and Alexander H. Miller and Sebastian Riedel},
booktitle={Automated Knowledge Base Construction},
year={2020},
url={https://openreview.net/forum?id=025X0zPfn}
}
To reproduce our results:
(optional) It might be a good idea to use a separate conda environment. It can be created by running:
conda create -n lama37 -y python=3.7 && conda activate lama37
pip install -r requirements.txt
wget https://dl.fbaipublicfiles.com/LAMA/data.zip
unzip data.zip
rm data.zip
Install the spaCy model:
python3 -m spacy download en
Download the models
chmod +x download_models.sh
./download_models.sh
The script will create and populate a pre-trained_language_models folder. If you are interested in a particular model, please edit the script.
python scripts/run_experiments.py
Results will be logged in output/ and last_results.csv.
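If you want a quick look at the logged numbers from Python, here is a minimal sketch that just prints last_results.csv (no assumptions are made about its column layout):
import csv

# Print whatever was logged in last_results.csv after a run.
with open("last_results.csv", newline="") as f:
    for row in csv.reader(f):
        print(row)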
This repository also provides a script (scripts/create_lama_uhn.py) to create LAMA-UHN, the data used in (Poerner et al., 2019).
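For intuition only, here is a rough sketch of the kind of heuristic behind LAMA-UHN (this is not the logic of scripts/create_lama_uhn.py): drop facts whose answer is a case-insensitive substring of the subject name, the "string match" filter described by Poerner et al. (2019).
# Illustrative sketch only -- not the actual scripts/create_lama_uhn.py code.
# Drop facts whose object label is contained (case-insensitively) in the
# subject label, e.g. ("Apple Watch", "Apple") would be filtered out.
def string_match_filter(facts):
    return [(subj, obj) for subj, obj in facts if obj.lower() not in subj.lower()]

print(string_match_filter([("Apple Watch", "Apple"), ("Albert Einstein", "Ulm")]))
# -> [('Albert Einstein', 'Ulm')]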
This repository also gives the option to evaluate how pretrained language models handle negated probes (Kassner et al., 2019): set the flag use_negated_probes in scripts/run_experiments.py. Also, you should use this version of the LAMA probe: https://dl.fbaipublicfiles.com/LAMA/negated_data.tar.gz
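If you prefer fetching the negated data from Python instead of with wget, a minimal sketch (only the URL above comes from this repository; the target paths are illustrative):
import tarfile
import urllib.request

# Download and extract the negated LAMA probe data.
URL = "https://dl.fbaipublicfiles.com/LAMA/negated_data.tar.gz"
urllib.request.urlretrieve(URL, "negated_data.tar.gz")
with tarfile.open("negated_data.tar.gz", "r:gz") as tar:
    tar.extractall("negated_data")  # illustrative target directory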
Encode a list of sentences and use the vectors in your downstream task. First, install LAMA as a package:
pip install -e git+https://github.com/facebookresearch/LAMA#egg=LAMA
import argparse
from lama.build_encoded_dataset import encode, load_encoded_dataset
PARAMETERS = {
    "lm": "bert",
    "bert_model_name": "bert-large-cased",
    "bert_model_dir": "pre-trained_language_models/bert/cased_L-24_H-1024_A-16",
    "bert_vocab_name": "vocab.txt",
    "batch_size": 32,
}
args = argparse.Namespace(**PARAMETERS)
sentences = [
    ["The cat is on the table ."],  # single-sentence instance
    ["The dog is sleeping on the sofa .", "He makes happy noises ."],  # two-sentence
]
encoded_dataset = encode(args, sentences)
print("Embedding shape: %s" % str(encoded_dataset[0].embedding.shape))
print("Tokens: %r" % encoded_dataset[0].tokens)
# save on disk the encoded dataset
encoded_dataset.save("test.pkl")
# load from disk the encoded dataset
new_encoded_dataset = load_encoded_dataset("test.pkl")
print("Embedding shape: %s" % str(new_encoded_dataset[0].embedding.shape))
print("Tokens: %r" % new_encoded_dataset[0].tokens)
To fill a gap in a sentence, use the symbol [MASK] to specify it. Only single-token gaps are supported, i.e., a single [MASK].
python lama/eval_generation.py \
--lm "bert" \
--t "The cat is on the [MASK]."
(Image source: https://commons.wikimedia.org/wiki/File:Bluebell_on_the_phone.jpg)
Note that you could use this functionality to answer cloze-style questions, such as:
python lama/eval_generation.py \
--lm "bert" \
--t "The theory of relativity was developed by [MASK] ."
Clone the repo
git clone git@github.com:facebookresearch/LAMA.git && cd LAMA
Install as an editable package:
pip install --editable .
If you get an error on macOS, try running this instead:
CFLAGS="-Wno-deprecated-declarations -std=c++11 -stdlib=libc++" pip install --editable .
Use the --lm option to indicate which language model(s) to consider, as a comma-separated list (e.g., "bert,elmo"). Model-specific options such as --bmd (BERT model directory), --bmn (BERT model name), and --emd (ELMo model directory) point to the corresponding vocabularies and weights; see the examples below.
BERT pretrained models can be loaded in two ways: (i) by passing the name of the model and using the Hugging Face cached version, or (ii) by passing the folder containing the vocabulary and the PyTorch pretrained model (see convert_tf_checkpoint_to_pytorch to convert a TensorFlow checkpoint to PyTorch).
example considering both BERT and ELMo:
python lama/eval_generation.py \
--lm "bert,elmo" \
--bmd "pre-trained_language_models/bert/cased_L-24_H-1024_A-16/" \
--emd "pre-trained_language_models/elmo/original/" \
--t "The cat is on the [MASK]."
example considering only BERT with the default pre-trained model, in an interactive fashion:
python lama/eval_generation.py \
--lm "bert" \
--i
To extract contextual embeddings from the considered models:
python lama/get_contextual_embeddings.py \
--lm "bert,elmo" \
--bmn bert-base-cased \
--emd "pre-trained_language_models/elmo/original/"
A unified vocabulary, obtained as the intersection of the vocabularies of all considered models, is used when evaluating them.
If the module cannot be found, preface the python command with PYTHONPATH=. (the trailing dot adds the current directory to the module search path).
If the experiments fail on GPU memory allocation, try reducing batch size.
(Kassner et al., 2019) Nora Kassner, Hinrich Schütze. Negated LAMA: Birds cannot fly. arXiv preprint arXiv:1911.03343, 2019.
(Poerner et al., 2019) Nina Poerner, Ulli Waltinger, and Hinrich Schütze. BERT is Not a Knowledge Base (Yet): Factual Knowledge vs. Name-Based Reasoning in Unsupervised QA. arXiv preprint arXiv:1911.03681, 2019.
(Dai et al., 2019) Zihang Dai, Zhilin Yang, Yiming Yang, Jaime G. Carbonell, Quoc V. Le, and Ruslan Salakhutdinov. Transformer-XL: Attentive language models beyond a fixed-length context. CoRR, abs/1901.02860.
(Peters et al., 2018) Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. NAACL-HLT 2018
(Devlin et al., 2018) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.04805.
(Radford et al., 2018) Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.
(Liu et al., 2019) Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
LAMA is licensed under the CC-BY-NC 4.0 license. The text of the license can be found here.