CrossmodalGroup / LAPS

Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR, 2024

Thank you for your work! How exactly are LPS, SPC, and SPA implemented in the code? #3

Open SUZILI7 opened 2 months ago

SUZILI7 commented 2 months ago

Do these correspond to the TokenSparse class, the TokenAggregation class, and the CrossSparseAggrNet_v2 class, respectively?

darkpromise98 commented 2 months ago

You're basically right.

LPS (patch selection) corresponds to class TokenSparse:
https://github.com/CrossmodalGroup/LAPS/blob/main/lib/cross_net.py#L14

SPC (patch calibration/aggregation) corresponds to class TokenAggregation:
https://github.com/CrossmodalGroup/LAPS/blob/main/lib/cross_net.py#L61

SPA (patch-word alignment) corresponds to def mask_xattn_one_text:
https://github.com/CrossmodalGroup/LAPS/blob/main/lib/cross_net.py#L203
https://github.com/CrossmodalGroup/LAPS/blob/main/lib/xttn.py#L225
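
For readers who want a feel for how the three stages chain together before opening the linked files, here is a rough, dependency-free sketch. It is not the repository's code: the function names (`select_patches`, `aggregate_patches`, `align`), the cosine scoring, the `keep_ratio` parameter, and the max-then-mean similarity are all assumptions chosen for illustration, and the actual TokenSparse, TokenAggregation, and mask_xattn_one_text implementations may differ in every detail.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def select_patches(patches, text_vec, keep_ratio=0.5):
    """Selection (hypothetical): score patches against a text vector, keep the top ones."""
    scores = [cosine(p, text_vec) for p in patches]
    k = max(1, math.ceil(len(patches) * keep_ratio))
    order = sorted(range(len(patches)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k]), sorted(order[k:]), scores

def aggregate_patches(patches, weights):
    """Aggregation (hypothetical): softmax-weighted average of several patches into one."""
    w = softmax(weights)
    dim = len(patches[0])
    return [sum(w[i] * patches[i][d] for i in range(len(patches))) for d in range(dim)]

def align(patches, words):
    """Alignment (hypothetical): best-matching patch per word, averaged over words."""
    per_word = [max(cosine(p, w) for p in patches) for w in words]
    return sum(per_word) / len(per_word)

# Toy data: four 2-d "patch" features and two 2-d "word" features.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
words = [[1.0, 0.0], [0.8, 0.2]]
text_vec = [sum(w[d] for w in words) / len(words) for d in range(2)]

keep, drop, scores = select_patches(patches, text_vec, keep_ratio=0.5)
slim = [patches[i] for i in keep]
if drop:
    # Fold the discarded patches into one extra patch instead of throwing them away.
    slim.append(aggregate_patches([patches[i] for i in drop], [scores[i] for i in drop]))
score = align(slim, words)
print(keep, len(slim), round(score, 3))
```

The point of the sketch is only the data flow: a scoring step shrinks the patch set, an aggregation step condenses patches into fewer ones, and the final similarity is computed between the slimmed patch set and the words.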

SUZILI7 commented 2 months ago

Thank you very much!