Hello @XifengGuo ,
Thanks for your code. I'm trying to modify your toolkit to support multiple labels per input image. The use case is OCR: if I pass in an image containing a string of digits, I want a prediction for each digit in the image. We right-pad the labels so that every prediction has the same length.
How can I modify the code so that the prediction your CapsNet architecture outputs for each input image has length > 1? That is, instead of predicting a single output from 10 choices (0-9), we would predict multiple outputs, each of which can be any of 0-9.
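To make the target format concrete, here is a minimal sketch of the label encoding I have in mind. It is not part of your toolkit: `encode_label`, `decode_prediction`, `MAX_LEN`, and `PAD` are hypothetical names, and using an extra 11th class for the padding positions is my own assumption.

```python
from typing import List

# Assumptions (not from the toolkit): digits use classes 0-9 and an extra
# class 10 marks right-padding; MAX_LEN is an arbitrary maximum length.
PAD = 10
NUM_CLASSES = 11
MAX_LEN = 5


def encode_label(digits: str, max_len: int = MAX_LEN) -> List[List[int]]:
    """Right-pad a digit string and one-hot encode every position.

    Returns a max_len x NUM_CLASSES nested list (one row per position).
    """
    if len(digits) > max_len:
        raise ValueError("label is longer than max_len")
    indices = [int(ch) for ch in digits] + [PAD] * (max_len - len(digits))
    return [[1 if k == idx else 0 for k in range(NUM_CLASSES)] for idx in indices]


def decode_prediction(scores: List[List[float]]) -> str:
    """Take the highest-scoring class per position and stop at the first pad."""
    out = []
    for row in scores:
        idx = max(range(len(row)), key=row.__getitem__)
        if idx == PAD:
            break
        out.append(str(idx))
    return "".join(out)
```

With this encoding, the label "42" becomes a 5 x 11 target whose last three rows select the padding class, and I would like the network to output one row of class scores per position in the same layout.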
Thank you.