Bartzi / see

Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"
GNU General Public License v3.0

fsns problem #45

Open wushilian opened 6 years ago

wushilian commented 6 years ago

Hi, each FSNS sample contains 4 views of the same sign, so how do you process all 4 images with one network?

Bartzi commented 6 years ago

Do you want to know how we deal with the 4 different views at the same time?

This is actually quite easy:

  1. We take the 4 views and split them into 4 separate images.
  2. Those 4 images are fed into the feature extractor (CNN) as independent images.
  3. After feature extraction, we fuse all 4 views by concatenating them along the channel dimension (in our case, dimension 1).
  4. We get a prediction using all 4 views at the same time =)
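
The four steps above can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's actual code: the function names, the toy pooling "feature extractor", and the assumed input layout (four 150x150 views placed side by side in one 150x600 image, in `(N, C, H, W)` order) are all assumptions made for the example.

```python
import numpy as np

NUM_VIEWS = 4  # FSNS provides four views of each sign


def split_views(batch):
    """Split (N, C, H, 4*W) images into a (4*N, C, H, W) batch of single views."""
    views = np.split(batch, NUM_VIEWS, axis=3)  # cut along the width axis
    return np.concatenate(views, axis=0)        # each view becomes its own batch entry


def toy_feature_extractor(images):
    """Hypothetical stand-in for the CNN: 2x2 average pooling (H and W must be even)."""
    n, c, h, w = images.shape
    return images.reshape(n, c, h // 2, 2, w // 2, 2).mean(axis=(3, 5))


def fuse_views(features):
    """Turn (4*N, C, H, W) per-view features into (N, 4*C, H, W)."""
    per_view = np.split(features, NUM_VIEWS, axis=0)  # one chunk per view
    return np.concatenate(per_view, axis=1)           # concatenate at channel dimension 1


# Assumed layout: batch of 2 RGB images, each holding 4 views side by side.
batch = np.random.rand(2, 3, 150, 600).astype(np.float32)
views = split_views(batch)               # (8, 3, 150, 150)
features = toy_feature_extractor(views)  # (8, 3, 75, 75)
fused = fuse_views(features)             # (2, 12, 75, 75)
print(views.shape, features.shape, fused.shape)
```

Because the views are stacked along the batch axis, the same extractor weights are applied to every view, and the predictor only ever sees the fused tensor with 4 times as many channels.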

wushilian commented 6 years ago

@Bartzi Thank you very much. By the way, FSNS has the blank label ‘ ’. How do you deal with it?

Bartzi commented 6 years ago

Do you have problems with the blank label showing up in your prediction? If so, you can just strip this label from your predicted word.
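
A minimal sketch of what stripping the blank label could look like, assuming the prediction is a list of class indices. The blank index `0`, the tiny `CHAR_MAP`, and the function name are hypothetical and are not taken from the repository.

```python
BLANK_LABEL = 0  # hypothetical class index of the blank label

# Hypothetical mapping from class index to character.
CHAR_MAP = {1: "r", 2: "u", 3: "e"}


def decode_prediction(label_ids, char_map=CHAR_MAP, blank_label=BLANK_LABEL):
    """Drop every blank label, then map the remaining indices to characters."""
    return "".join(char_map[i] for i in label_ids if i != blank_label)


word = decode_prediction([1, 2, 3, 0, 0])
print(word)
```

If the blank label is only used as padding at the end of a prediction, trimming just the trailing blanks would work as well.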

wushilian commented 6 years ago

@Bartzi Thank you very much!

chunhui999 commented 5 years ago

@Bartzi Hi, could you tell me where I can find the code that handles the 4 different views at the same time? Thank you.

Bartzi commented 5 years ago

Which part are you interested in? The part where we split the input image into 4 independent images, or the part where the features of all four images are fused together?

chunhui999 commented 5 years ago

If it's convenient, I'd like to see both, because I think they form one complete process.

Bartzi commented 5 years ago

We split the views here and reunite them here

chunhui999 commented 5 years ago

Thanks a lot.^_^