This is the official implementation of CLIPasso, a method for converting an image of an object to a sketch, allowing for varying levels of abstraction.
At a high level, we define a sketch as a set of Bézier curves and use a differentiable rasterizer (diffvg) to optimize the parameters of the curves directly with respect to a CLIP-based perceptual loss.
We combine the final and intermediate activations of a pre-trained CLIP model to achieve both geometric and semantic simplifications.
The abstraction degree is controlled by varying the number of strokes.
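To make the representation concrete, here is a minimal sketch of how a drawing can be stored as a set of Bézier curves. It assumes cubic curves with four 2D control points each, with coordinates normalized to [0, 1]; the function names are hypothetical and this is not the diffvg API. It only shows which parameters an optimizer would adjust and how the stroke count sets their number.

```python
import random

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights for the four control points.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)

def random_sketch(num_strokes, seed=0):
    """Hypothetical initialization: each stroke is four random control points."""
    rng = random.Random(seed)
    return [[(rng.random(), rng.random()) for _ in range(4)]
            for _ in range(num_strokes)]

def sample_stroke(stroke, num_points=10):
    """Sample evenly spaced points along one stroke, e.g. for plotting."""
    return [cubic_bezier(*stroke, t=i / (num_points - 1))
            for i in range(num_points)]

sketch = random_sketch(num_strokes=16)
# 4 control points per stroke, 2 coordinates per point.
print(len(sketch), "strokes,", len(sketch) * 4 * 2, "optimizable parameters")
# prints "16 strokes, 128 optimizable parameters"
```

Under these assumptions, halving the number of strokes halves the number of free parameters, which is why fewer strokes force a more abstract result.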
You can pull the Docker image from Docker Hub; it contains all the required libraries and packages:
docker pull yaelvinker/clipasso_docker
docker run --name clipsketch -it yaelvinker/clipasso_docker /bin/bash
Now you should have a running container. Inside the container, clone the repository:
cd /home
git clone https://github.com/yael-vinker/CLIPasso.git
cd CLIPasso/
Now you are all set and ready to move to the next stage (Run Demo).
We recommend using the provided Docker image, because we rely on diffvg, which has specific requirements and does not compile smoothly in every environment. To install without Docker instead:
git clone https://github.com/yael-vinker/CLIPasso.git
cd CLIPasso
python3.7 -m venv clipsketch
source clipsketch/bin/activate
pip install -r requirements.txt
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install git+https://github.com/openai/CLIP.git
git clone https://github.com/BachiLi/diffvg
cd diffvg
git submodule update --init --recursive
python setup.py install
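After a manual installation, it can help to confirm that the main packages are importable before running the demo. The following is a small check, not part of the repository. It assumes the import names `torch`, `torchvision`, `clip` (from the OpenAI CLIP package), and `pydiffvg` (the Python module built by diffvg); the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names assumed for the packages installed above.
REQUIRED_MODULES = ["torch", "torchvision", "clip", "pydiffvg"]

def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing

missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `clipsketch` environment. A module reported as missing usually means the corresponding install step above did not complete.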
The input images to be drawn should be located under "target_images". To sketch your own image, run the following from the CLIPasso directory:
python run_object_sketching.py --target_file <file_name>
The resulting sketches will be saved to the "output_sketches" folder, in SVG format.
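Because the results are plain SVG files, they can be inspected with the Python standard library. The sketch below lists the SVG files under "output_sketches" and counts the `<path>` elements in each. That each stroke is stored as one `<path>` element is an assumption about the output files, and the function names are hypothetical.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

def count_paths(svg_file):
    """Count <path> elements in an SVG file, ignoring any XML namespace."""
    root = ET.parse(str(svg_file)).getroot()
    # Namespaced tags look like "{http://www.w3.org/2000/svg}path".
    return sum(1 for el in root.iter() if el.tag.split("}")[-1] == "path")

def summarize_sketches(output_dir="output_sketches"):
    """Map each SVG file under output_dir (recursively) to its path count."""
    base = Path(output_dir)
    if not base.is_dir():
        return {}
    return {str(p.relative_to(base)): count_paths(p)
            for p in sorted(base.rglob("*.svg"))}

for name, n_paths in summarize_sketches().items():
    print(f"{name}: {n_paths} path element(s)")
```

If the assumption holds, the count should be comparable to the number of strokes requested for that sketch.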
Optional arguments:
--num_strokes
Defines the number of strokes used to create the sketch, which determines the level of abstraction. The default value is 16, but different numbers might produce better results for different images.
--mask_object
Images without a background are recommended. If your image contains a background, you can mask it out by passing this flag with "1" as an argument.
--fix_scale
If your image is not square, it might be cut off. Pass this flag with 1 as input to fix the scale automatically without cutting the image.
--num_sketches
As stated in the paper, by default three scripts run in parallel to synthesize three sketches, and the best one is chosen automatically. In some environments (for example, when running on CPU) this might be slow, so you can specify --num_sketches 1 instead.
-cpu
Use this flag to run the code on the CPU (not recommended, as it might be very slow).
For example, below are optional running configurations:
Sketching the camel with default parameters:
python run_object_sketching.py --target_file "camel.png"
Producing a single sketch of the camel at a lower level of abstraction, using 32 strokes:
python run_object_sketching.py --target_file "camel.png" --num_strokes 32 --num_sketches 1
Sketching the flamingo at a higher level of abstraction, using 8 strokes:
python run_object_sketching.py --target_file "flamingo.png" --num_strokes 8
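The configurations above can also be launched from Python, for example to sweep over several stroke counts. The following is a minimal wrapper sketch, not part of the repository. The helper names `build_command` and `run_sketch` are hypothetical, and only the flags documented above are used. A flag is passed only when a value is given, so the script's own defaults apply otherwise.

```python
import subprocess

def build_command(target_file, num_strokes=None, num_sketches=None,
                  mask_object=None, fix_scale=None, cpu=False):
    """Assemble the argument list for run_object_sketching.py."""
    cmd = ["python", "run_object_sketching.py", "--target_file", target_file]
    options = [("--num_strokes", num_strokes), ("--num_sketches", num_sketches),
               ("--mask_object", mask_object), ("--fix_scale", fix_scale)]
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    if cpu:
        cmd.append("-cpu")  # single dash, as written in the options list
    return cmd

def run_sketch(target_file, **options):
    """Run one sketching job from the repository root; raises on failure."""
    subprocess.run(build_command(target_file, **options), check=True)

print(" ".join(build_command("camel.png", num_strokes=32, num_sketches=1)))
# prints "python run_object_sketching.py --target_file camel.png --num_strokes 32 --num_sketches 1"
```

For instance, `for n in (8, 16, 32): run_sketch("camel.png", num_strokes=n)` would produce the same image at three levels of abstraction, assuming the working directory is the repository root.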
CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders, 2021 (Kevin Frans, L.B. Soros, Olaf Witkowski)
Diffvg: Differentiable vector graphics rasterization for editing and learning, ACM Transactions on Graphics 2020 (Tzu-Mao Li, Michal Lukáč, Michaël Gharbi, Jonathan Ragan-Kelley)
If you make use of our work, please cite our paper:
@misc{vinker2022clipasso,
title={CLIPasso: Semantically-Aware Object Sketching},
author={Yael Vinker and Ehsan Pajouheshgar and Jessica Y. Bo and Roman Christian Bachmann and Amit Haim Bermano and Daniel Cohen-Or and Amir Zamir and Ariel Shamir},
year={2022},
eprint={2202.05822},
archivePrefix={arXiv},
primaryClass={cs.GR}
}
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.