-
## ASR
- [ ] ASR2K: Speech Recognition for Around 2000 Languages without Audio https://arxiv.org/abs/2209.02842
- [x] Whisper: Whisper is a general-purpose speech recognition model. https://github…
-
Could you add demo ipynb notebooks for table_recognition, text_detection, text_ie, text_recognition, text_spotting, and videotext that work in Colab?
-
I wanted to compare the results for Docker, with three configurations - aarch64 vs x86 (Rosetta) vs x86 (QEMU) on my MacBookAir10,1 as I've been thinking about moving my home NAS which mostly runs x86…
-
Thanks for creating this package!
As discussed in https://github.com/robertknight/ocrs/issues/14 it would be nice to add some evaluation benchmarks. And maybe optionally compare with tesseract or s…
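
As a starting point for such a benchmark, here is a minimal sketch of character error rate (CER), a metric commonly reported for OCR output. The issue does not say which metric to use, so the choice of CER and the function names `edit_distance` and `character_error_rate` are assumptions for illustration, not part of ocrs.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    # prev[j] is the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the length of the ground-truth text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Running the same ground-truth images through each engine and comparing `character_error_rate(ground_truth, engine_output)` would give one comparable number per engine.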
-
Possible new applications / examples could be:
- Speech recognition
- Text detection and recognition
- EfficientDet
- Human 2D keypoint estimation
- Hand 2D keypoint estimation
-
Use of a multi-task network is required to perform object detection and text recognition on images.
-
Traceback (most recent call last):
  File "C:/Users/jaysh/Downloads/Real_time_Object_detection_TF-master/object_recognition_detection/object_detection_webcam.py", line 69, in
    label_map = label_…
-
During inference, the box detected by the detection model is padded before being fed to the recognition model.
This request is about making this padding configurable.
Currently it is inconven…
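
To illustrate what a configurable padding could look like, here is a minimal sketch. The function name `pad_box`, the `pad_ratio` parameter, and the `(x_min, y_min, x_max, y_max)` box layout are all hypothetical and are not taken from the project's code.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def pad_box(box: Box, image_size: Tuple[int, int], pad_ratio: float = 0.1) -> Box:
    """Expand a detected box on every side, clipped to the image bounds.

    pad_ratio is the fraction of the box width (height) added on the left
    and right (top and bottom). image_size is (width, height).
    """
    if pad_ratio < 0:
        raise ValueError("pad_ratio must be non-negative")
    x_min, y_min, x_max, y_max = box
    image_width, image_height = image_size
    pad_x = round((x_max - x_min) * pad_ratio)
    pad_y = round((y_max - y_min) * pad_ratio)
    return (
        max(0, x_min - pad_x),
        max(0, y_min - pad_y),
        min(image_width, x_max + pad_x),
        min(image_height, y_max + pad_y),
    )


# Usage: the caller picks the padding instead of relying on a fixed value.
padded = pad_box((10, 20, 110, 60), image_size=(200, 100), pad_ratio=0.1)
```

Setting `pad_ratio=0.0` disables padding, which would let users pass tight crops to the recognition model when that works better for their data.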
-
http://lampsrv02.umiacs.umd.edu/pubs/Papers/qixiangye-14/qixiangye-14.pdf
This link returns a 404.