amusi / CVPR2024-Papers-with-Code

A collection of CVPR 2024 papers and open-source projects
17.05k stars, 2.54k forks

Welcome to share CVPR 2023 papers and code #166

Closed amusi closed 4 months ago

amusi commented 1 year ago

[The format of the issue] Paper name/title: Paper link: Code link:
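
A submission in this format can be split into its fields mechanically. The following is a minimal sketch, not part of the repository: the helper name `parse_submission` and the lower-cased keys are assumptions made for illustration, and the label list is taken only from the labels that appear in this thread.

```python
import re

# Labels seen in this thread: "Paper title", "Paper name/title",
# "Paper link", "Code link", and "Project page".
FIELD_PATTERN = re.compile(
    r"(Paper (?:name/)?title|Paper link|Code link|Project page)\s*:\s*",
    re.IGNORECASE,
)


def parse_submission(comment: str) -> dict[str, str]:
    """Split a comment into {label: value} pairs (hypothetical helper)."""
    # re.split with one capturing group returns:
    # [text before first label, label1, value1, label2, value2, ...]
    parts = FIELD_PATTERN.split(comment)
    fields = {}
    for label, value in zip(parts[1::2], parts[2::2]):
        # Normalise "Paper name/title" and "Paper title" to the same key.
        key = label.lower().replace("name/", "")
        fields[key] = value.strip()
    return fields


example = (
    "Paper title: DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization "
    "Paper link: https://arxiv.org/abs/2212.06331 "
    "Code link: https://github.com/ai4ce/DeepMapping2"
)
fields = parse_submission(example)
```

Here `fields["paper link"]` holds the arXiv URL and `fields["code link"]` the repository URL. Comments that omit a label simply produce no entry for it.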

Gaaaavin commented 1 year ago

Paper title: DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization Paper link: https://arxiv.org/abs/2212.06331 Code link: https://github.com/ai4ce/DeepMapping2

Gaaaavin commented 1 year ago

Paper title: VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion Paper link: https://arxiv.org/abs/2302.12251 Code link: https://github.com/NVlabs/VoxFormer

joellliu commented 1 year ago

Paper title: PolyFormer: Referring Image Segmentation as Sequential Polygon Generation Paper link: https://arxiv.org/abs/2302.07387

[update May 12th 2023] @amusi Could you please add the code link as well? https://github.com/amazon-science/polygon-transformer

Thank you!

FingerRec commented 1 year ago

Paper title: All in One: Exploring Unified Video-Language Pre-training Paper link: https://arxiv.org/abs/2203.07303 Code link: https://github.com/showlab/all-in-one

Paper title: Position-guided Text Prompt for Vision Language Pre-training Paper link: https://arxiv.org/abs/2212.09737 Code link: https://github.com/sail-sg/ptp

tobran commented 1 year ago

Paper title: GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis Paper link: https://arxiv.org/abs/2301.12959 Code link: https://github.com/tobran/GALIP

GALIP is a simple, fast, high-quality text-to-image generative model. Its results are comparable to, or even better than, those of large pretrained autoregressive and diffusion models, and its synthesis is 120 times faster. Those models require hundreds of GPUs, 400M image-text pairs, and several weeks of pre-training; GALIP needs only eight 3090 GPUs, 12M image-text pairs, and 3 days. GALIP can also generate images on a CPU alone. The code and pre-trained models have been released.

aminshabani commented 1 year ago

Paper title: HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising Project page: https://aminshabani.github.io/housediffusion/ Paper link: https://arxiv.org/abs/2211.13287 Code link: https://github.com/aminshabani/house_diffusion

GenjiB commented 1 year ago

Paper title: Vision Transformers are Parameter-Efficient Audio-Visual Learners Project page: https://yanbo.ml/project_page/LAVISH/ Code link: https://github.com/GenjiB/LAVISH

wuxiaolang commented 1 year ago

3D Visual and Language Paper title: EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding Paper link: https://arxiv.org/abs/2209.14941 Code link: https://github.com/yanmin-wu/EDA

Gaaaavin commented 1 year ago

Paper title: DeepMapping2: Self-Supervised Large-Scale LiDAR Map Optimization Paper link: https://arxiv.org/abs/2212.06331 Code link: https://github.com/ai4ce/DeepMapping2

This paper can be categorized into "3D point cloud"

pengzhiliang commented 1 year ago

Paper title: Generic-to-Specific Distillation of Masked Autoencoders Paper link: https://arxiv.org/abs/2302.14771 Code link: https://github.com/pengzhiliang/G2SD

This paper can be categorized into "Knowledge Distillation" or "Masked Autoencoders". Thank you!

jiamingzhang94 commented 1 year ago

Paper title: Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples Paper link: https://arxiv.org/abs/2301.01217 Code link: https://github.com/jiamingzhang94/Unlearnable-Clusters

realPasu commented 1 year ago

Paper title: MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation Paper link: https://arxiv.org/abs/2212.09478 Code link: https://github.com/researchmm/MM-Diffusion

noahzn commented 1 year ago

Paper title: Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation Paper link: https://arxiv.org/abs/2211.13202 Code link: https://github.com/noahzn/Lite-Mono

WentianZhang-ML commented 1 year ago

Paper title: AdaptiveMix: Robust Feature Representation via Shrinking Feature Space Paper link: https://arxiv.org/pdf/2303.01559.pdf Code link: https://github.com/WentianZhang-ML/AdaptiveMix

ymy-k commented 1 year ago

Paper title: DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting Paper link: https://arxiv.org/pdf/2211.10772v3.pdf Code link: https://github.com/ViTAE-Transformer/DeepSolo

Thank you!

VainF commented 1 year ago

Paper title: DepGraph: Towards Any Structural Pruning Paper link: https://arxiv.org/abs/2301.12900 Code link: https://github.com/VainF/Torch-Pruning

Thank you! This paper should be categorized as Network Pruning.

Jialing-Zhang commented 1 year ago

Paper title: Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption Paper link: https://arxiv.org/abs/2207.03442 Code link: https://github.com/shiyegao/DDA

Thank you!

limacv commented 1 year ago

Paper title: 3D Video Loops from Asynchronous Input Paper link: https://arxiv.org/abs/2303.05312 Project page: https://limacv.github.io/VideoLoop3D_web/ Code link: https://github.com/limacv/VideoLoop3D

This paper should be in a new category named Novel View Synthesis, which I believe is also a hot topic with many more papers. But it can also be categorized as NeRF if no more sections can be added. Thank you!

2y7c3 commented 1 year ago

Paper title: Super-Resolution Neural Operator Paper link: https://arxiv.org/abs/2303.02584 Code link: https://github.com/2y7c3/Super-Resolution-Neural-Operator

stdKonjac commented 1 year ago

Paper name/title: Learning Transferable Spatiotemporal Representations from Natural Script Knowledge Paper link: https://arxiv.org/abs/2209.15280 Code link: https://github.com/TencentARC/TVTS

Carlyx commented 1 year ago

Paper name/title: DPE: Disentanglement of Pose and Expression for General Video Portrait Editing Paper link: https://arxiv.org/abs/2301.06281 Code link: https://carlyx.github.io/DPE/

vinthony commented 1 year ago

Paper name/title: SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation Paper link: https://arxiv.org/abs/2211.12194 Code link: https://github.com/Winfredy/SadTalker

slacklife commented 1 year ago

Paper name/title: DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network Paper link: https://arxiv.org/abs/2303.02165 Code link: https://github.com/alibaba/lightweight-neural-architecture-search

Please put it in the backbone chapter of the README.md.

Yueming6568 commented 1 year ago

Paper title: DeltaEdit: Exploring Text-free Training for Text-driven Image Manipulation Paper link: https://arxiv.org/abs/2303.06285 Code link: https://github.com/Yueming6568/DeltaEdit

Thank you :) Please put it in the GAN / CLIP / image manipulation / image generation chapters.

rayleizhu commented 1 year ago

The arXiv link for BiFormer is now available. Please update. Thanks!

Paper name/title: BiFormer: Vision Transformer with Bi-Level Routing Attention Paper link: https://arxiv.org/abs/2303.08810 Code link: https://github.com/rayleizhu/BiFormer

dingfengshi commented 1 year ago

Paper title: TriDet: Temporal Action Detection with Relative Boundary Modeling Paper link: https://arxiv.org/pdf/2303.07347.pdf Code link: https://github.com/dingfengshi/TriDet

Maybe it can be put in Video Understanding or in a new chapter, Action Detection? Thank you!

lixinustc commented 1 year ago

Thanks for it. Paper title: Causal-IR: Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective Paper link: https://arxiv.org/pdf/2303.06859.pdf Code link: https://github.com/lixinustc/Casual-IR-DIL

The code will be released soon.

l1997i commented 1 year ago

Paper title: Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation Paper link: https://arxiv.org/abs/2303.11203 Code link: https://github.com/l1997i/lim3d

The code will be released soon. Thanks in advance!

ShirleyMaxx commented 1 year ago

Paper name/title: GFPose: Learning 3D Human Pose Prior with Gradient Fields Paper link: https://arxiv.org/pdf/2212.08641.pdf Code link: https://github.com/Embracing/GFPose

Thank you!

shikiw commented 1 year ago

Paper name/title: Diversity-Aware Meta Visual Prompting Paper link: https://arxiv.org/abs/2303.08138 Code link: https://github.com/shikiw/DAM-VP

Thanks a lot!

XinyuSun commented 1 year ago

Paper name/title: Masked Motion Encoding for Self-Supervised Video Representation Learning Paper link: https://arxiv.org/abs/2210.06096 Code link: https://github.com/XinyuSun/MME

Thanks a lot!

nihaomiao commented 1 year ago

Paper name/title: Conditional Image-to-Video Generation with Latent Flow Diffusion Models Paper link: https://arxiv.org/pdf/2303.13744.pdf Code link: https://github.com/nihaomiao/CVPR23_LFDM

Thanks a lot!

gmkim-ai commented 1 year ago

Paper title: Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding Paper link: https://arxiv.org/abs/2212.02802 Project page: https://diff-video-ae.github.io Code link: https://github.com/man805/Diffusion-Video-Autoencoders

Thank you!

sjtuxcx commented 1 year ago

Paper title: EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning Paper link: https://arxiv.org/abs/2303.10876 Code link: https://github.com/MediaBrain-SJTU/EqMotion

Thanks a lot!

dailenson commented 1 year ago

OCR Paper title: Disentangling Writer and Character Styles for Handwriting Generation Paper link: https://arxiv.org/abs/2303.14736 Code link: https://github.com/dailenson/SDT Thanks a lot!

MrGiovanni commented 1 year ago

Medical Image Analysis

  1. Label-Free Liver Tumor Segmentation Paper: https://arxiv.org/pdf/2303.14869.pdf Code: https://github.com/MrGiovanni/SyntheticTumors

  2. SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection Paper: https://arxiv.org/pdf/2111.13495.pdf Code: https://github.com/tiangexiang/SQUID

Thank you!

lizhou-cs commented 1 year ago

Paper title: Joint Visual Grounding and Tracking with Natural Language Specification Paper link: https://arxiv.org/abs/2303.12027 Code link: https://github.com/lizhou-cs/JointNLT

JunweiZheng93 commented 1 year ago

CVPR 2023 Highlight! Title: Attention-based Point Cloud Edge Sampling Link: https://arxiv.org/abs/2302.14673 Code: https://github.com/JunweiZheng93/APES Project page: https://junweizheng93.github.io/publications/APES/APES.html

Thx!

bmlklwx commented 1 year ago

[CVPR2023 Highlight] Paper name/title: Marching-Primitives: Shape Abstraction from Signed Distance Function Paper link: https://arxiv.org/abs/2303.13190 Code link: https://github.com/ChirikjianLab/Marching-Primitives

Thanks!

ShirleyMaxx commented 1 year ago

Paper name/title: 3D Human Mesh Estimation from Virtual Markers Paper link: https://arxiv.org/pdf/2303.11726.pdf Code link: https://github.com/ShirleyMaxx/VirtualMarker

Thank you!

monsoon235 commented 1 year ago

Paper name/title: Adaptive Spot-Guided Transformer for Consistent Local Feature Matching Paper link: https://arxiv.org/abs/2303.16624 Code link: https://github.com/ASTR2023/ASTR Homepage link: https://astr2023.github.io

haoyuc commented 1 year ago

Paper name/title: Masked Image Training for Generalizable Deep Image Denoising Paper link: https://arxiv.org/abs/2303.13132 Code link: https://github.com/haoyuc/MaskedDenoising

lkeab commented 1 year ago

Paper name/title: Mask-Free Video Instance Segmentation Paper link: https://arxiv.org/abs/2303.15904 Code link: https://github.com/SysCV/MaskFreeVis

SiyuanYan1 commented 1 year ago

Paper name/title: Towards Trustable Skin Cancer Diagnosis via Rewriting Model's Decision Paper link: https://arxiv.org/pdf/2303.00885.pdf

youngLBW commented 1 year ago

Paper name/title: A Hierarchical Representation Network for Accurate and Detailed Face Reconstruction from In-The-Wild Images Paper link: https://arxiv.org/abs/2302.14434 Code link: https://younglbw.github.io/HRN-homepage/

cwhgn commented 1 year ago

Paper name/title: Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks Paper link: https://arxiv.org/abs/2303.17602 Code link: https://github.com/tinyvision/SOLIDER

weizeming commented 1 year ago

Paper name/title: CFA: Class-wise Calibrated Fair Adversarial Training Paper link: https://arxiv.org/abs/2303.14460 Code link: https://github.com/PKU-ML/CFA

HuiZhang0812 commented 1 year ago

Paper title: Prototypical Residual Networks for Anomaly Detection and Localization Paper link: https://arxiv.org/abs/2212.02031

kwonjunn01 commented 1 year ago

Paper name/title: Probabilistic Prompt Learning for Dense Prediction Paper link: http://arxiv.org/abs/2304.00779

azhuantou commented 1 year ago

Paper title: Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection Paper link: https://arxiv.org/abs/2304.01464 Code link: https://github.com/azhuantou/HSSDA

Thank you :) Please put it in the point cloud / semi-supervised 3D object detection chapters.