Due to its large size, the data is hosted outside GitHub: https://nlp.stanford.edu/projects/phrasenode/
You can download the dataset by running the following script:
bash download_dataset.sh
The code was developed with Python 2.7; the package dependencies are listed in requirements.txt.
To install dependencies:
(Optional) Create a virtualenv / conda environment
virtualenv -p python2.7 env
source env/bin/activate
Python dependencies
sudo apt-get install python-dev
pip install -r requirements.txt
Alternatively, use the Docker image ppasupat/phrasenode. For the latest image:
docker pull ppasupat/phrasenode:1.06
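For example, to work inside a container with the repository mounted, a command along these lines should work (the mount point /phrasenode and the assumption that the image provides a bash shell are illustrative, not prescribed by the image):
docker run -it -v $(pwd):/phrasenode -w /phrasenode ppasupat/phrasenode:1.06 bash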
If you just want to see something happen:
export WEBREP_DATA=./data
./main.py configs/base.txt configs/model/encoding.txt configs/node-embedder/allan.txt -n testrun
This runs main.py with three config files. Additional configurations can be supplied with the -c option; these are applied last. The -n option specifies the experiment directory name.
Here are the configurations used in the final experiments:
base.txt: Used in all experiments
model/encoding.txt: The embedding-based method
model/alignment.txt: The alignment-based method
node-embedder/allan.txt: The node embedder as described in the paper
ablation/*.txt: Ablation
Note that the visual neighbor is off by default. To turn it on, use general/neighbors.txt, as in the example below.
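For example, to train the alignment-based method with visual neighbors turned on, the command would look roughly like this (the experiment name and the way the extra config is passed, here as a fourth positional config file, are illustrative assumptions):
export WEBREP_DATA=./data
./main.py configs/base.txt configs/model/alignment.txt configs/node-embedder/allan.txt configs/general/neighbors.txt -n alignment-neighbors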
All training runs are managed by the PhraseNodeTrainingRuns
object. For example,
to get training run #141, do this:
runs = PhraseNodeTrainingRuns() # Note the final "s"
run = runs[141] # a PhraseNodeTrainingRun object
A TrainingRun
is responsible for constructing a model, training it, saving it
and reloading it (see superclasses gtd.ml.TrainingRun
and
gtd.ml.TorchTrainingRun
for details).
The most important methods on PhraseNodeTrainingRun are:
__init__: the model, data storage, etc., are initialized
train: actual training of the model happens here
Statistics are logged to TensorBoard. To view:
tensorboard --logdir=data/experiments
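Putting these pieces together, a rough interactive sketch might look like the following (the import path and the no-argument train() call are assumptions; see gtd.ml.TrainingRun and the code for the actual interface):
# Hypothetical import path; adjust to wherever PhraseNodeTrainingRuns is defined.
from phrasenode.training_run import PhraseNodeTrainingRuns

runs = PhraseNodeTrainingRuns()   # manages the runs under data/experiments
run = runs[141]                   # reload run #141 as a PhraseNodeTrainingRun
run.train()                       # run training; exact behavior and arguments depend on the code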
Start the server with
export WEBREP_DATA=./data
./server.py data/experiments/0_testrun/config.txt -m data/experiments/0_testrun/checkpoints/20000.checkpoint/model
where 0_testrun
should be changed to the model's directory, and 20000
should be changed to the checkpoint number you want.
Install the unpacked Chrome extension in demo/phrasenode-demo.

Panupong Pasupat, Tian-Shun Jiang, Evan Liu, Kelvin Guu, Percy Liang.
Mapping natural language commands to web elements.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
CodaLab: https://worksheets.codalab.org/worksheets/0x0097f249cd944284a81af331093c3579/