I think Crop-CLIP should be listed under Object Detection.
For more image manipulation / generation applications, I wrote a summary on my Medium.
You can add them if you think they are valuable.
Text-Driven Image Manipulation/Generation with CLIP
The list shown in the image above is available in this Google Sheet.
Text2Mesh: This approach modifies a given mesh according to text or image input, using the CLIP text/image encoders.
Detecting Twenty-thousand Classes using Image-level Supervision: An object detection paper from Facebook that uses CLIP text embeddings as the classifier weights.
These are the papers I would like to add.
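To illustrate what "CLIP text embeddings as classifier weights" means in the detection paper above, here is a minimal sketch, not the paper's actual code. It assumes the text embeddings and region features are already computed; random vectors stand in for both, and the function names, the class names, and the temperature value are all hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale each vector to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def classify_with_text_embeddings(features, text_embeddings, temperature=0.01):
    # features: (num_regions, dim); text_embeddings: (num_classes, dim).
    # The text embeddings act as a fixed classifier weight matrix.
    f = l2_normalize(features)
    w = l2_normalize(text_embeddings)
    logits = f @ w.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Stand-in data (assumption): in practice the text embeddings would come from
# the CLIP text encoder and the features from the detector's region head.
rng = np.random.default_rng(0)
class_names = ["cat", "dog", "car"]
text_embeddings = rng.normal(size=(3, 512))
# Two fake region features: noisy copies of the "car" and "cat" embeddings.
features = text_embeddings[[2, 0]] + 0.1 * rng.normal(size=(2, 512))

probs = classify_with_text_embeddings(features, text_embeddings)
predicted = [class_names[i] for i in probs.argmax(axis=1)]
print(predicted)
```

Because the weights are just text embeddings, the class list can be extended by embedding new class names, without retraining a classification layer.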