Recipe summarization
This repo implements a sequence-to-sequence encoder-decoder using Keras to summarize recipe instructions by predicting a recipe title. This code is based on Siraj Raval's How to Make a Text Summarizer; it won the coding challenge for that week's video, and was featured in the following week's video.
This repo has been updated since then, so please check out tag v1.0.0 to view the version associated with the coding challenge. Lastly, note that this repo is not being actively maintained; I will do my best to respond to any issues opened but make no guarantees.
New: If you're looking to serve your trained model and make its predictions accessible to others, I'd recommend looking into ServeIt, an open source library that lets you serve model predictions from a RESTful API using your favorite Python ML library in as little as one line of code. This repository includes an example recipe summarizer server in src/server.py for your reference.
Data
I scraped 125,000 recipes from various websites for training (additional details can be found here). Each recipe consists of:
- A recipe title
- A list of ingredients
- Preparation instructions
- An image of the prepared recipe (missing for ~40% of recipes collected)
The model was fitted on the recipe ingredients, instructions and title. Ingredients were concatenated in their original order and prepended to the instructions, and the title served as the prediction target. Recipe images were not used for this model.
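As a rough illustration of the concatenation described above, here is a minimal sketch. The function name `build_model_input` is hypothetical, and the ` ; ` separators are inferred from the sampled outputs further down rather than taken from src/tokenize_recipes.py, so treat the exact format as an assumption.

```python
def build_model_input(ingredients, instructions):
    """Join ingredients in their original order, then append the instructions.

    The ' ; ' and ' ; ;' separators are an assumption inferred from the
    sampled outputs in this README, not from the repo's tokenizer.
    """
    ingredient_text = " ; ".join(item.strip() for item in ingredients)
    return f"{ingredient_text} ; ;{instructions.strip()}"


# Toy recipe, shortened from Example 2 in the sampled outputs.
recipe = {
    "title": "Red Apple Milkshake",
    "ingredients": ["red apple peeled", "cold skim milk", "white sugar"],
    "instructions": "In a blender , blend the apple , skim milk , and sugar until smooth .",
}

source_text = build_model_input(recipe["ingredients"], recipe["instructions"])
target_text = recipe["title"]  # the title is what the model learns to predict
print(source_text)
print(target_text)
```

The image field is simply ignored here, mirroring the fact that images were not used.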
Training
This model was trained for ~6 hours on an NVIDIA Tesla K80. Training consisted of several iterations; in each one I lowered the learning rate and raised the ratio of flip augmentations.
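The staged schedule described above might look something like the sketch below. Every number in it (four iterations, a starting learning rate of 1e-3, halving per iteration, a flip ratio growing by 0.1) is a hypothetical placeholder, as is the function name `training_schedule`; the values actually used for this model are not recorded here.

```python
def training_schedule(n_iterations, initial_lr, lr_decay, initial_flip, flip_step, max_flip=1.0):
    """Build a list of per-iteration settings.

    The learning rate shrinks by a factor of `lr_decay` each iteration,
    while the flip ratio grows by `flip_step` up to `max_flip`.
    All defaults and arguments are illustrative assumptions.
    """
    schedule = []
    for i in range(n_iterations):
        schedule.append({
            "iteration": i + 1,
            "learning_rate": initial_lr * (lr_decay ** i),
            "flip_ratio": min(initial_flip + flip_step * i, max_flip),
        })
    return schedule


# Hypothetical values, chosen only to show the shape of the schedule.
schedule = training_schedule(
    n_iterations=4, initial_lr=1e-3, lr_decay=0.5, initial_flip=0.0, flip_step=0.1
)
for stage in schedule:
    print(stage)
```

Each entry would then be passed to one training run, with the weights from the previous run used as the starting point.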
Sampled outputs
Below are a few cherry-picked in-sample predictions from the model:
Example 1:
- Generated: Chicken Cake
- Original: Chicken French - Rochester , NY Style
- Recipe: all purpose flour ; salt ; eggs ; white sugar ; grated parmesan cheese ; olive oil ; skinless ; butter ; minced garlic ; dry sherry ; lemon juice ; low sodium chicken base ; ;Mix together the flour , salt , and pepper in a shallow bowl . In another bowl , whisk beaten eggs , sugar , and Parmesan cheese until the mixture is thoroughly blended and the sugar has dissolved . Heat olive oil in a large skillet over medium heat until the oil shimmers . Dip the chicken breasts into the flour mixture , then into the egg mixture , and gently lay them into the skillet . Pan-fry the chicken breasts until golden brown and no longer pink in the middle , about 6 minutes on each side . Remove from the skillet and set aside . In the same skillet over medium-low heat , melt the butter , and stir in garlic , sherry , lemon juice , and chicken base ...
Example 2:
- Generated: Fruit Soup
- Original: Red Apple Milkshake
- Recipe: red apple peeled ; cold skim milk ; white sugar ; fresh mint leaves for garnish ; ;In a blender , blend the apple , skim milk , and sugar until smooth . Garnish with mint to serve .
Example 3:
- Generated: Asparagus with Chicken
- Original: Asparagus and Dill Avgolemono Soup
- Recipe: asparagus ; chicken stock ; unsalted butter ; leek ; onion ; ribs celery ; salt ; water ; eggs ; juice of 2 lemons ; minced fresh dill ; dill sprigs for garnish ;Trim off ends of asparagus and using a vegetable peeler remove about 3 to 4-inches of the skin of each stalk , reserving both the ends and peels . Cut asparagus into 1-inch pieces , reserving tips for garnish . In a saucepan combine the asparagus peels and trimmings with the chicken stock , bring to a boil , remove from heat and allow stock to infuse for 15 minutes . Strain stock and reserve . In a pot of salted boiling water blanch the asparagus tips for 2 to 3 minutes , or until brilliant green and barely tender , and then refresh in a bowl of ice water . When tips are chilled , drain and reserve . In a large heavy pot melt the butter over moderate heat and cook the leeks , onion and celery , seasoned with salt and pepper , until softened , about 5 to 8 minutes . Add the 1-inch asparagus pieces and stir to combine ...
Usage (Python 3.6)
- Clone repo:
git clone https://github.com/rtlee9/recipe-summarization.git && cd recipe-summarization
- Initialize submodules:
git submodule update --init --recursive
- Install dependencies [optional: in virtualenv]:
pip install -r requirements.txt
- Set up directories:
python src/config.py
- Download recipes from my Google Cloud Bucket:
wget -P recipe-box/data https://storage.googleapis.com/recipe-box/recipes_raw.zip; unzip recipe-box/data/recipes_raw.zip -d recipe-box/data
(alternatively, see the recipe-box submodule to scrape fresh recipe data)
- Tokenize data:
python src/tokenize_recipes.py
- Initialize word embeddings with GloVe vectors:
  - Get GloVe vectors:
  wget -P data http://nlp.stanford.edu/data/glove.6B.zip; unzip data/glove.6B.zip -d data
  - Initialize embeddings:
  python src/vocabulary-embedding.py
- Train model:
python src/train_seq2seq.py
- Make predictions:
python src/predict.py
- Serve predictions from RESTful API:
python src/server.py
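If you would rather run the Python steps above in one go, a small driver along these lines could work. It is only a sketch: `PIPELINE_STEPS` and `run_pipeline` are hypothetical names, and it assumes the repo is cloned, dependencies are installed, and the recipe data and GloVe vectors have already been downloaded. The server step is left out because it runs until stopped.

```python
import subprocess

# Script paths are the ones listed in the usage steps above.
PIPELINE_STEPS = [
    ("Set up directories", ["python", "src/config.py"]),
    ("Tokenize data", ["python", "src/tokenize_recipes.py"]),
    ("Initialize embeddings", ["python", "src/vocabulary-embedding.py"]),
    ("Train model", ["python", "src/train_seq2seq.py"]),
    ("Make predictions", ["python", "src/predict.py"]),
]


def run_pipeline(steps=PIPELINE_STEPS, dry_run=True):
    """Run each step in order, stopping at the first failure.

    With dry_run=True the commands are only printed, not executed.
    """
    executed = []
    for description, command in steps:
        print(f"{description}: {' '.join(command)}")
        if not dry_run:
            # check=True raises CalledProcessError if a step exits non-zero.
            subprocess.run(command, check=True)
        executed.append(command)
    return executed


commands = run_pipeline(dry_run=True)
```

Call `run_pipeline(dry_run=False)` from the repository root to actually execute the steps.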
Next steps
Aside from tuning hyperparameters, there are a number of ways to potentially improve this model:
- Incorporate ingredients list non-sequentially, and add recipe images (see recipe-box)
- Try different RNN sequence lengths, or variable sequence lengths
- Try different vocabulary sizes
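When experimenting with vocabulary sizes, one quick check is how much of the corpus a given size actually covers. The sketch below is an assumption-laden illustration: `vocabulary_coverage` is a hypothetical helper that is not part of this repo, and the toy corpus stands in for the tokenized recipes.

```python
from collections import Counter


def vocabulary_coverage(token_lists, vocab_size):
    """Return the fraction of token occurrences covered by the
    `vocab_size` most frequent tokens."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    kept = sum(count for _, count in counts.most_common(vocab_size))
    return kept / total


# Toy corpus; in practice this would be the tokenized recipe text.
toy_corpus = [
    "mix the flour and the salt".split(),
    "melt the butter and stir".split(),
]
for size in (1, 3, 100):
    print(size, round(vocabulary_coverage(toy_corpus, size), 2))
```

A size where coverage stops improving noticeably is a reasonable starting point for comparison runs.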