zhiqwang / sightseq

Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection
MIT License
125 stars 34 forks source link
attention crnn ctc densenet faster-rcnn image-captioning mobilenet object-detection ocr pytorch scene-texts text-recognition transformer

🔭sightseq

Now, Let's go sightseeing by vision and sequence language multimodal around the deep learning world.

What's New:

Features:

sightseq provides reference implementations of various deep learning tasks, including:

Additionally:

General Requirements and Installation

Pre-trained models and examples

License

sightseq is MIT-licensed. The license applies to the pre-trained models as well.