guozix / TaI-DPT

TaI code implementation is inconsistent with the paper? #9

iamxiaoyubei closed this issue 10 months ago

iamxiaoyubei commented 10 months ago

Is the code implementation inconsistent with the paper? According to the paper (Text as Image), the training input should be text rather than images, but the code here appears to train on images:

output, output_local, _, _ = self.model(None, image)

https://github.com/guozix/TaI-DPT/blob/1333ecaa32bfffb4f2eb916f5532afb88ac457fe/trainers/Caption_distill_double.py#L464C65-L464C65

guozix commented 10 months ago

Ah, that's a misuse of the variable name "image": in that training call it holds the text input, not an image. See the definition of the model's forward function here. I call the forward function differently in training and in testing.
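
To illustrate the point, here is a minimal sketch of a forward function that takes either an image or text, depending on which argument is supplied. The class `ToyDualInputModel`, its parameter names `image` and `captions`, and the string outputs are all hypothetical stand-ins chosen for illustration; the repository's actual signature and logic should be checked in the linked file.

```python
from typing import Optional, Sequence


class ToyDualInputModel:
    """Hypothetical stand-in for a model with two input modes (not the repo's code)."""

    def forward(
        self,
        image: Optional[Sequence[float]] = None,
        captions: Optional[Sequence[str]] = None,
    ) -> str:
        # Image mode: used when a real image is passed as the first argument.
        if image is not None:
            return f"image branch: {len(image)} values"
        # Text mode: used when the first argument is None and text is passed second.
        if captions is not None:
            return f"text branch: {len(captions)} captions"
        raise ValueError("provide either image or captions")

    # Let instances be called like model(...), mirroring the call in the issue.
    __call__ = forward


model = ToyDualInputModel()

# Training-style call: the variable is named "image" but actually holds text.
image = ["a dog on a beach", "two cats on a sofa"]
train_out = model(None, image)

# Testing-style call: a real (toy) image goes in the first position.
test_out = model([0.1, 0.2, 0.3])
```

In this sketch, `model(None, image)` takes the text branch because the first positional argument is `None`, so the misleading variable name does not change which input path is used.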