
Screen2Vec

Screen2Vec is a new self-supervised technique for generating more comprehensive semantic embeddings of GUI screens and components using their textual content, visual design and layout patterns, and app meta-data. Learn more about Screen2Vec in our CHI 2021 paper.

This repository houses the files necessary to implement the Screen2Vec vector embedding process on screens from the RICO dataset.

Steps to run full pipeline

  1. Extract vocab from the full RICO traces dataset and pre-write its Sentence-BERT embeddings (UI_embedding/extract_vocab.py); a sketch of this embedding step follows this list.
  2. Train UI embedding model (UI_embedding/main.py) on RICO traces dataset.
  3. Use UI_embedding/modeltester.py to test the performance.
  4. Use UI_embedding/precompute_embeddings.py to write files of the UI components and app descriptions and their respective embeddings; do this for both the test and train portions of the dataset.
  5. Train the layout autoencoder model on the RICO screens dataset using layout.py, then precompute the layout embeddings for your dataset using write_layout.py.
  6. Train the Screen2Vec embedding model using main_preloaded.py.
  7. Use modeltester.py to test it out.
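
Step 1 embeds every text string in the extracted vocab with Sentence-BERT. A minimal sketch of that step, assuming the sentence-transformers package and a toy vocab (the model name and exact file handling are illustrative, not necessarily what extract_vocab.py does):

import json
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed Sentence-BERT model; extract_vocab.py may use a different one.
bert = SentenceTransformer("bert-base-nli-mean-tokens")
vocab = ["OK", "Cancel", "Sign in", "Search"]  # stand-in for the extracted vocab
embeddings = bert.encode(vocab)                # one 768-dimensional vector per string

with open("vocab.json", "w") as f:
    json.dump(vocab, f)
np.save("vocab_emb.npy", np.asarray(embeddings))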

In-depth


Setup

The code was developed in the following environment:

To install dependencies:

Data

Due to its large size, the data is hosted outside GitHub, at http://interactionmining.org/rico, in the "Interaction Traces" and "UI Screenshots and View Hierarchies" datasets. To download it, run

./download_data.sh

The pretrained models are stored at http://basalt.amulet.cs.cmu.edu/screen2vec/. Download them by running

./download_models.sh

The model labelled "UI2Vec" is the GUI element embedding model, "Screen2Vec" is the screen embedding model, "layout_encoder" is the screen layout autoencoder, and "visual_encoder" is our visual autoencoder baseline.

Quick start

If you just want to embed a screen using our pretrained GUI element, layout, and screen embedding models, run:

python get_embedding.py -s <path-to-screen> -u "UI2Vec_model.ep120" -m "Screen2Vec_model_v4.ep120" -l "layout_encoder.ep800"

This generates the vector embedding using our pretrained models. The parameters are: -s, the path to the screen to embed; -u, the pretrained GUI element (UI2Vec) model; -m, the pretrained Screen2Vec screen embedding model; and -l, the pretrained layout autoencoder.
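
Once screen embeddings are available as NumPy vectors (for example, after saving the output of get_embedding.py yourself), comparing two screens reduces to cosine similarity. A minimal sketch, with the saved file names as hypothetical stand-ins:

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

screen_a = np.load("screen_a.npy")  # hypothetical saved Screen2Vec embeddings
screen_b = np.load("screen_b.npy")
print(cosine_similarity(screen_a, screen_b))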

Training

There are two to three steps for training your own screen embedding: first, train the GUI element model; optionally, train the screen layout autoencoder; and then train the screen-level model.

GUI model

Files for the GUI model are contained within the UI_embedding directory. To train a model, run from within that directory:

python extract_vocab.py -d <location-of-dataset>

python main.py -d <location-of-dataset> -o <desired-output-path> -b 256 -e 100 -v "vocab.json" -m "vocab_emb.npy" -n 16 -r 0.001 -l "cel"

The parameters here include -d (the dataset location), -o (the output path for the trained model), -b (the batch size), -e (the number of epochs), -v (the vocab file written by extract_vocab.py), -m (its precomputed Sentence-BERT embeddings), and -r (the learning rate).
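
For intuition, each GUI element is embedded from its text content and class type (roughly as described in the paper): the element's Sentence-BERT text embedding is combined with a learned class-type embedding through a linear layer. A schematic PyTorch sketch, where the dimensions and class vocabulary size are assumptions rather than the exact architecture in UI_embedding/:

import torch
import torch.nn as nn

class GUIElementEmbedder(nn.Module):
    # Illustrative only: combines a text embedding with a class-type embedding.
    def __init__(self, text_dim: int = 768, num_classes: int = 26, class_dim: int = 6):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, class_dim)
        self.project = nn.Linear(text_dim + class_dim, text_dim)

    def forward(self, text_emb: torch.Tensor, class_idx: torch.Tensor) -> torch.Tensor:
        return self.project(torch.cat([text_emb, self.class_emb(class_idx)], dim=-1))

embedder = GUIElementEmbedder()
text_emb = torch.rand(1, 768)               # Sentence-BERT embedding of the element's text
class_idx = torch.tensor([3])               # index of the element's UI class
print(embedder(text_emb, class_idx).shape)  # torch.Size([1, 768])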

Then, to pre-generate these embeddings for your dataset for later use in screen training, run:

python precompute_embeddings.py -d <location-of-dataset> -m <desired-ui-model> -p <desired-prefix>

Then, move the files generated here from the UI_embedding directory into the Screen2Vec directory.

Layout autoencoder

The autoencoder is trained from within the Screen2Vec directory, by running:

python layout.py -d <location-of-screen-dataset> -b 256 -e 400 -r 0.001

where -b flags the batch size, -e the number of epochs, and -r the learning rate. Here, use the screen dataset ("combined") rather than the trace dataset ("filtered_traces").
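
The layout autoencoder compresses a screen's flattened layout representation into a small bottleneck vector that a decoder can reconstruct; the bottleneck then serves as the layout embedding. A minimal PyTorch sketch, with the input and layer sizes chosen arbitrarily rather than taken from layout.py:

import torch
import torch.nn as nn

class LayoutAutoEncoder(nn.Module):
    # Illustrative only: sizes are assumptions, not the architecture in layout.py.
    def __init__(self, input_dim: int = 11200, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU(),
                                     nn.Linear(256, hidden_dim))
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, 256), nn.ReLU(),
                                     nn.Linear(256, input_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = LayoutAutoEncoder()
layout = torch.rand(1, 11200)               # stand-in for one flattened screen layout
loss = nn.MSELoss()(model(layout), layout)  # reconstruction objective
loss.backward()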

Then, to pre-generate these embeddings for your dataset for later use in screen training, run

python write_layout.py -d <location-of-dataset> -m <layout-model> -p <same-desired-prefix>

Here, make sure you use the same desired prefix as from precomputing the UI embeddings.

Screen model

To train the Screen level model, run

python main_preloaded.py -d <previously-chosen-prefix> -s 128 -b 256 -t <ui-output-prefix>

Here, -d is the embedding-file prefix chosen when precomputing the GUI and layout embeddings, and -b flags the batch size.
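
At a high level, the screen model combines a screen's aggregated GUI element embeddings with its layout embedding, and the app-description embedding is appended to form the final screen vector. A schematic sketch, where the mean aggregation and dimensions are assumptions for illustration, not the exact architecture of main_preloaded.py:

import torch
import torch.nn as nn

gui_element_embs = torch.rand(12, 768)   # one vector per GUI element on the screen
layout_emb = torch.rand(64)              # from the layout autoencoder
app_description_emb = torch.rand(768)    # Sentence-BERT embedding of the app description

combine = nn.Linear(768 + 64, 768)       # learned projection of content + layout
content = torch.cat([gui_element_embs.mean(dim=0), layout_emb])
screen_emb = torch.cat([combine(content), app_description_emb])  # final screen vector
print(screen_emb.shape)                  # torch.Size([1536])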

Evaluation

There are scripts to evaluate the prediction performance of both the GUI and screen embeddings.

To test the prediction accuracy of the GUI embedding, run (from within the UI_embedding directory)

python modeltester.py -m "UI2Vec_model.ep120" -v "vocab.json" -ve "vocab_emb.npy" -d <path-to-dataset>

To test the prediction accuracy of the Screen embedding, run (from within the main directory)

python modeltester_screen.py -m "Screen2Vec_model_v4.ep120" -v 4 -n 4

where -m flags the model to test, -v the model version (4 is standard), and -n the number of predictors used in prediction.
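
Conceptually, both testers measure prediction accuracy by comparing the model's predicted embedding against a set of candidates and counting a hit when the true target ranks among the top k most similar. A schematic sketch of that metric, not the exact data handling in the modeltester scripts:

import numpy as np

def topk_accuracy(predicted: np.ndarray, candidates: np.ndarray,
                  correct_idx: np.ndarray, k: int = 5) -> float:
    # predicted: (n, d) predicted embeddings; candidates: (m, d) candidate embeddings;
    # correct_idx: (n,) index of the true candidate for each prediction.
    sims = predicted @ candidates.T          # similarity of each prediction to each candidate
    topk = np.argsort(-sims, axis=1)[:, :k]  # indices of the k most similar candidates
    hits = (topk == correct_idx[:, None]).any(axis=1)
    return float(hits.mean())

print(topk_accuracy(np.random.rand(10, 768), np.random.rand(50, 768),
                    np.random.randint(0, 50, size=10)))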

Other files

The following files were used in the testing infrastructure of our baseline models. They are not needed for general use of the Screen2Vec pipeline and have therefore been placed in the baseline sub-directory to avoid clutter and confusion. If you want to run these scripts, move them back into the main directory:

Reference

Toby Jia-Jun Li, Lindsay Popowski, Tom M. Mitchell, and Brad A. Myers. Screen2Vec: Semantic Embedding of GUI Screens and GUI Components. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 2021).