HUANGLIZI / LViT

[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"
MIT License
283 stars 26 forks

Some questions about text feature dimension #44

SCUT-CCNL closed 3 months ago

SCUT-CCNL commented 4 months ago

Why is the input dimension [2, 10, 768]? The batch size is 2, but what does the 10 mean? Is 768 the number of features from the BERT encoding?

HUANGLIZI commented 4 months ago

10 is the cut-off size. You should recheck the BERT model if you want to explore more options.
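For readers trying to picture how a [2, 10, 768] text tensor can arise, here is a minimal sketch in plain Python. It is not the repository's code. It assumes the encoder is a BERT-base model (whose hidden size is 768), that sequences longer than the cut-off are truncated, and that shorter ones are zero-padded. The helper names `fake_token_embeddings`, `fit_to_cut_off`, and `shape` are hypothetical, and the random vectors only stand in for real BERT outputs.

```python
import random

HIDDEN_SIZE = 768   # hidden size of BERT-base; assumed to be the encoder used
CUT_OFF = 10        # cut-off size named in the reply above
BATCH_SIZE = 2      # batch size from the question


def fake_token_embeddings(num_tokens, hidden_size=HIDDEN_SIZE):
    """Stand-in for a BERT encoder: one random vector per token (hypothetical)."""
    return [[random.random() for _ in range(hidden_size)] for _ in range(num_tokens)]


def fit_to_cut_off(embeddings, cut_off=CUT_OFF, hidden_size=HIDDEN_SIZE):
    """Truncate to `cut_off` tokens, or zero-pad shorter sequences (assumed policy)."""
    fitted = embeddings[:cut_off]
    padding = [[0.0] * hidden_size for _ in range(cut_off - len(fitted))]
    return fitted + padding


def shape(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Two captions of different lengths (14 and 6 tokens) form one batch.
batch = [fit_to_cut_off(fake_token_embeddings(n)) for n in (14, 6)]
print(shape(batch))  # (2, 10, 768)
```

Under these assumptions the three axes read as (batch size, tokens kept per caption, features per token), so changing the cut-off only changes the middle axis.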