-
### Deep Learning Simplified Repository (Proposing new issue)
:red_circle: Separating text from image:
:red_circle: The aim of the project is to provide users with code that can help them take out t…
-
### Current Behavior
`#include
#include
#include
#include
#include
#include
#pragma comment(lib, "tesseract54.lib")
std::mutex io_mutex;
void performOCR(const std::string& imagePath…
-
Many thanks for the contribution.
Although utterance segmentation is not part of your work (the IEMOCAP emotion dataset is already segmented into utterances), do you have any idea about any too…
-
We are missing documentation for examples in the following tasks + file types.
(Based on the file types that we do accept but are missing examples.)
- named-entity-recognition: system output - js…
-
# speech recognition
- Soltau, Hagen, Hank Liao, and Hasim Sak. "Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition." arXiv preprint arXiv:1610.09975 (201…
-
The current [specification](https://ocr-d.github.io/glossary#OCR) is agnostic about which **level of segmentation** OCR is supposed to operate on, either `TextLine` layout input (for `TextLine`, `Word…
-
`torchaudio` is an extension library for PyTorch, designed to facilitate audio processing using the same PyTorch paradigms familiar to users of its tensor library. It provides powerful tools for audio…
-
# Current situation
Users cannot readily use the PAGE-XML results of Transkribus in an OCR-D environment, because Transkribus' flavor of PAGE-XML is based on the older 2013 variant and contains pro…
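
As a rough illustration of the version mismatch described above, the sketch below reads the PAGE schema version from the root namespace of a PAGE-XML document. It assumes the version is encoded as the date suffix of the PRImA `pagecontent` namespace URI; the helper name `page_schema_version` and the sample document are hypothetical and are not part of Transkribus or OCR-D.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Common prefix of the PRImA PAGE content namespaces; the date suffix
# (e.g. "2013-07-15" or "2019-07-15") identifies the schema version.
PAGE_NS_PREFIX = "http://schema.primaresearch.org/PAGE/gts/pagecontent/"


def page_schema_version(xml_text: str) -> Optional[str]:
    """Return the PAGE schema date of the root element, or None if not PAGE-XML.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    # ElementTree encodes namespaced tags as "{namespace}localname".
    if not root.tag.startswith("{"):
        return None
    namespace = root.tag[1:].split("}", 1)[0]
    if not namespace.startswith(PAGE_NS_PREFIX):
        return None
    return namespace[len(PAGE_NS_PREFIX):]


# Made-up minimal document using the older 2013 namespace.
SAMPLE_2013 = (
    '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">'
    '<Page imageFilename="page.png" imageWidth="100" imageHeight="200"/>'
    "</PcGts>"
)

if __name__ == "__main__":
    print(page_schema_version(SAMPLE_2013))
```

A check like this only detects which schema version a file declares. An actual conversion would also have to handle any elements or attributes that differ between the two schema versions.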
-
https://www.bairesdev.com/blog/java-nlp-libraries-tools/
-
## Reference
- [paper - 2019 - TextScanner: Reading Characters in Order for Robust Scene Text Recognition](https://arxiv.org/pdf/1912.12422.pdf)
- [Megvii Research proposes TextScanner: ensuring character reading order for a new breakthrough in text recognition](https://z…