AtsukiOsanai/cv_survey
Personal repository for computer vision survey
2 stars, 0 forks
issues
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99%
#130
AtsukiOsanai
closed
5 months ago
0
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
#129
AtsukiOsanai
closed
10 months ago
0
Contrastive Feature Masking Open-Vocabulary Vision Transformer
#128
AtsukiOsanai
opened
1 year ago
0
Segment Everything Everywhere All at Once
#127
AtsukiOsanai
opened
1 year ago
0
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
#126
AtsukiOsanai
opened
1 year ago
0
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
#125
AtsukiOsanai
opened
1 year ago
0
Less is More: Removing Text-regions Improves CLIP Training Efficiency and Robustness
#124
AtsukiOsanai
opened
1 year ago
0
Conditional DETR for Fast Training Convergence
#123
AtsukiOsanai
opened
1 year ago
0
PIX2STRUCT: SCREENSHOT PARSING AS PRETRAINING FOR VISUAL LANGUAGE UNDERSTANDING
#122
AtsukiOsanai
opened
1 year ago
0
STRUCTEXTV2: MASKED VISUAL-TEXTUAL PREDICTION FOR DOCUMENT IMAGE PRE-TRAINING
#121
AtsukiOsanai
opened
1 year ago
0
Localization Distillation for Dense Object Detection
#120
AtsukiOsanai
opened
1 year ago
0
Exploring Font-independent Features for Scene Text Recognition
#119
AtsukiOsanai
opened
1 year ago
0
USB: Universal-Scale Object Detection Benchmark
#118
AtsukiOsanai
opened
1 year ago
0
Training data-efficient image transformers & distillation through attention
#117
AtsukiOsanai
opened
1 year ago
0
Towards Universal Object Detection by Domain Attention
#116
AtsukiOsanai
opened
1 year ago
0
TaCo: Textual Attribute Recognition via Contrastive Learning
#115
AtsukiOsanai
opened
1 year ago
0
Let Me Choose: From Verbal Context to Font Selection
#114
AtsukiOsanai
opened
1 year ago
0
De-rendering Stylized Texts
#113
AtsukiOsanai
closed
1 year ago
0
Large-scale Tag-based Font Retrieval with Generative Feature Learning
#112
AtsukiOsanai
opened
1 year ago
0
Learning Visual Importance for Graphic Designs and Data Visualizations
#111
AtsukiOsanai
opened
1 year ago
0
Localization Distillation for Dense Object Detection
#110
AtsukiOsanai
opened
1 year ago
0
Predicting Visual Importance Across Graphic Design Types
#109
AtsukiOsanai
opened
1 year ago
0
Image Aesthetic Assessment Based on Pairwise Comparison – A Unified Approach to Score Regression, Binary Classification, and Personalization
#108
AtsukiOsanai
closed
1 year ago
1
Revisiting Image Aesthetic Assessment via Self-Supervised Feature Learning
#107
AtsukiOsanai
closed
1 year ago
0
Pushing the Performance Limit of Scene Text Recognizer without Human Annotation
#106
AtsukiOsanai
opened
1 year ago
0
Layout-aware Webpage Quality Assessment
#105
AtsukiOsanai
opened
1 year ago
1
On Evaluation Metrics for Generative Models
#104
AtsukiOsanai
opened
1 year ago
0
e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce
#103
AtsukiOsanai
closed
1 year ago
0
End-to-End Object Detection with Transformers
#102
AtsukiOsanai
opened
1 year ago
0
AttEntropy: Segmenting Unknown Objects in Complex Scenes using the Spatial Attention Entropy of Semantic Segmentation Transformers
#101
AtsukiOsanai
closed
1 year ago
0
PIX2STRUCT: SCREENSHOT PARSING AS PRETRAINING FOR VISUAL LANGUAGE UNDERSTANDING
#100
AtsukiOsanai
closed
1 year ago
0
SPTS: Single-Point Text Spotting
#99
AtsukiOsanai
closed
1 year ago
0
MANGO: A Mask Attention Guided One-Stage Scene Text Spotter
#98
AtsukiOsanai
closed
1 year ago
0
Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
#97
AtsukiOsanai
closed
1 year ago
1
Evaluating Weakly Supervised Object Localization Methods Right
#96
AtsukiOsanai
closed
1 year ago
0
Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
#95
AtsukiOsanai
opened
1 year ago
0
Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition
#94
AtsukiOsanai
opened
1 year ago
0
Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
#93
AtsukiOsanai
opened
1 year ago
0
Contextual Text Block Detection towards Scene Text Understanding
#92
AtsukiOsanai
opened
1 year ago
0
From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network
#91
AtsukiOsanai
opened
1 year ago
0
Scene Text Recognition with Permuted Autoregressive Sequence Models
#90
AtsukiOsanai
opened
1 year ago
0
Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation
#89
AtsukiOsanai
opened
1 year ago
0
Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition
#88
AtsukiOsanai
opened
1 year ago
0
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
#87
AtsukiOsanai
closed
1 year ago
1
Towards the Unseen: Iterative Text Recognition by Distilling from Errors
#86
AtsukiOsanai
closed
1 year ago
0
PIMNet: A Parallel, Iterative and Mimicking Network for Scene Text Recognition
#85
AtsukiOsanai
closed
2 years ago
0
NON-AUTOREGRESSIVE ASR WITH SELF-CONDITIONED FOLDED ENCODERS
#84
AtsukiOsanai
closed
2 years ago
0
Context-based Contrastive Learning for Scene Text Recognition
#83
AtsukiOsanai
closed
2 years ago
0
SVTR: Scene Text Recognition with a Single Visual Model
#82
AtsukiOsanai
opened
2 years ago
0
Focusing Attention: Towards Accurate Text Recognition in Natural Images
#81
AtsukiOsanai
opened
2 years ago
0