nebgnahz / cv-rs

Rust wrapper for OpenCV (manual at this point)
https://nebgnahz.github.io/cv-rs/cv/
MIT License

feature request: cv::text #59

Closed onelson closed 6 years ago

onelson commented 6 years ago

Specifically, I'm looking for the Tesseract support for scene text extraction.

I've been toying with a cffi binding but it's not great since it requires me to write an image to disk before I can extract text. Something I could run directly on a Mat would be ideal!

Pzixel commented 6 years ago

I started working on it just now. I think I'll be done today or tomorrow. However, I have no way to check that the results are logically correct. I can check that it runs and returns some results, so if you can elaborate here I'd appreciate it :)

Pzixel commented 6 years ago

Hmm, it requires tesseract to be installed. And we should change our pipeline on CI to respect these changes... That may be more complicated than I expected.

onelson commented 6 years ago

When working with Tesseract under Debian, I had to install the packages you have added, but the language data was in a separate package (e.g. tesseract-ocr-eng). Still, on Ubuntu I didn't have to explicitly install the language, and the CI failure appears to be something to do with camera control? Odd.

I can switch to this branch in my project tonight and report back. Give me a few to sort out the details ;)

Pzixel commented 6 years ago

@onelson if you are talking about "Failed to initialize libdc1394" then never mind: it's not an error but just a warning, which is written with an error prefix for some reason.

There is no actual error printed, and I didn't manage to install Tesseract on Windows, which is why I'm trying to reproduce it locally using WSL.

As for tesseract-ocr-eng, it actually does get installed, because it's one of the dependencies of tesseract itself; see the build log: https://travis-ci.org/nebgnahz/cv-rs/jobs/340596926#L469 .

Don't know why it's failing though.

Pzixel commented 6 years ago

running 1 test
error: process didn't exit successfully: /home/travis/build/nebgnahz/cv-rs/target/debug/deps/test_ocr-2900e87495c49434 (signal: 11, SIGSEGV: invalid memory reference)

Looks like it actually works, but I messed up the memory handling a bit :)

Pzixel commented 6 years ago

Well, I managed to reinstall VMware, reinstall WSL, and install Ubuntu 17.10 on my Hyper-V VM :D Lots of related software is installed there too, of course.

I found that this code is causing SEGFAULT:

void cv_to_ffi(const std::string& source, char** dest) {
    auto result = new char[source.length() + 1];
    strcpy(result, source.c_str());
    *dest = result;
}

@nebgnahz can you elaborate, please? I don't see anything wrong with this code...

Oh I see, it's a null pointer dereference here: dest itself is null. How should an output char* parameter be written then?
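
On the question above: a `char**` output parameter can only be written if the caller passes the address of a real pointer slot, since a null `dest` makes `*dest = result` a null dereference. The following is a minimal sketch of the calling side in Rust, not the fix that landed in the branch (the thread doesn't show it). `fill_string` and `read_out_string` are hypothetical names, and `fill_string` is a Rust stand-in for the C++ `cv_to_ffi` so that the example runs on its own.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// Hypothetical stand-in for the C++ `cv_to_ffi`: copies `source` into a
/// fresh allocation and writes its address through the `dest` out-parameter.
unsafe extern "C" fn fill_string(source: *const c_char, dest: *mut *mut c_char) {
    let copy = unsafe { CStr::from_ptr(source) }.to_owned();
    // `dest` must point at a valid `*mut c_char` slot; if `dest` itself
    // were null, this write would be a null pointer dereference.
    unsafe { *dest = copy.into_raw() };
}

/// Calls `fill_string` with the address of a local pointer slot and
/// converts the result into an owned Rust `String`.
fn read_out_string(input: &str) -> String {
    let source = CString::new(input).expect("input contains a NUL byte");
    // The slot lives on our stack; only its *contents* start out as null.
    let mut out: *mut c_char = ptr::null_mut();
    unsafe {
        fill_string(source.as_ptr(), &mut out);
        // Take ownership back so the allocation is freed exactly once.
        let owned = CString::from_raw(out);
        owned.to_string_lossy().into_owned()
    }
}

fn main() {
    println!("{}", read_out_string("hello world"));
}
```

The key point is that `out` is a real variable whose address is passed; the callee fills it in, and the caller then decides who frees the allocation.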

Pzixel commented 6 years ago

I fixed it; it should be working now. @onelson you can try it by referencing the development branch:

[dependencies]
cv-rs= { git = "https://github.com/Pzixel/cv-rs.git", branch = "feature/ocr" }

I'm going to add more overloads for run and some other things, but something may already be working :)

onelson commented 6 years ago

The data path and lang passed to OcrTesseract::new() seem to be ignored (in a harmful way). If I omit the data path and lang, things seem to work OK, but otherwise the path is treated as / and the lang as NULL.

Given:

    let ocr = OcrTesseract::new(
        Some(PathBuf::from("/usr/share/tesseract-ocr").as_path()),
        Some("eng"),
        None,
        EngineMode::Default,
        PageSegmentationMode::Auto,
    );

The result is:

OCRTesseract: Could not initialize tesseract.
Error opening data file /tessdata/NULL.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'NULL'
Tesseract couldn't load any languages!
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', /checkout/src/libcore/option.rs:335:20
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
             at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at /checkout/src/libstd/sys_common/backtrace.rs:68
   2: std::panicking::default_hook::{{closure}}
             at /checkout/src/libstd/sys_common/backtrace.rs:57
             at /checkout/src/libstd/panicking.rs:381
   3: std::panicking::default_hook
             at /checkout/src/libstd/panicking.rs:397
   4: std::panicking::rust_panic_with_hook
             at /checkout/src/libstd/panicking.rs:577
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:538
   6: std::panicking::begin_panic_fmt
             at /checkout/src/libstd/panicking.rs:522
   7: rust_begin_unwind
             at /checkout/src/libstd/panicking.rs:498
   8: core::panicking::panic_fmt
             at /checkout/src/libcore/panicking.rs:71
   9: core::panicking::panic
             at /checkout/src/libcore/panicking.rs:51
  10: __rust_maybe_catch_panic
             at /checkout/src/libcore/macros.rs:20
             at /checkout/src/libpanic_unwind/gcc.rs:101
             at /checkout/src/libpanic_unwind/lib.rs:104
  11: std::rt::lang_start
             at /checkout/src/libstd/panicking.rs:459
             at /checkout/src/libstd/panic.rs:365
             at /checkout/src/libstd/rt.rs:58
  12: __libc_start_main
  13: _start
fatal runtime error: failed to initiate panic, error 5

onelson commented 6 years ago

This is remarkably unhelpful, but this also seems a bit "crashy." I've been running this program in a docker container, which makes getting core dumps a little awkward. I'm in the process of getting all the libs built on bare metal so I can try and look into what's going on.

Pzixel commented 6 years ago

Yep, it's a bit buggy. The main reason is that I can't test it in my primary environment: installing Tesseract on Windows is too painful, so I don't do it. That forces me to use an Ubuntu VM, which is not fully featured at the moment.

About the constructor: I didn't actually check it, because I started writing code at 10 AM and finished at 3 AM, and once I realised that the basic functionality was working I wanted to get some sleep 😄 Now I can add more details, configuration, etc. to make it more usable and stable. But not everything comes at once 😄

onelson commented 6 years ago

While pulling Mats from mp4 and running them through ocr, I will occasionally get row >= col:Error:Assert failed:in file ../ccstruct/matrix.h, line 339

Sometimes there are other assertions that bark as the program dies.

Some core dump:

Core was generated by `target/debug/balcony data/Dark Souls 3 pt 21.mp4'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f7761b906bb in ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const () from /usr/lib/x86_64-linux-gnu/libtesseract.so.3
[Current thread is 1 (Thread 0x7f776bc49cc0 (LWP 3694))]
#0  0x00007f7761b906bb in ERRCODE::error(char const*, TessErrorLogCode, char const*, ...) const () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#1  0x00007f7761b03149 in tesseract::Wordrec::UpdateSegSearchNodes(float, int, GenericVector<tesseract::SegSearchPending>*, WERD_RES*, tesseract::LMPainPoints*, tesseract::BestChoiceBundle*, BlamerBundle*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#2  0x00007f7761b03ecf in tesseract::Wordrec::SegSearch(WERD_RES*, tesseract::BestChoiceBundle*, BlamerBundle*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#3  0x00007f7761af3c17 in tesseract::Wordrec::chop_word_main(WERD_RES*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#4  0x00007f7761b04da1 in tesseract::Wordrec::cc_recog(WERD_RES*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#5  0x00007f7761a53ad2 in tesseract::Tesseract::recog_word_recursive(WERD_RES*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#6  0x00007f7761a53c85 in tesseract::Tesseract::recog_word(WERD_RES*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#7  0x00007f7761a45b10 in tesseract::Tesseract::tess_segment_pass_n(int, WERD_RES*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#8  0x00007f7761a0bfa8 in tesseract::Tesseract::match_word_pass_n(int, WERD_RES*, ROW*, BLOCK*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#9  0x00007f7761a0c18f in tesseract::Tesseract::classify_word_pass1(tesseract::WordData const&, WERD_RES**, tesseract::PointerVector<WERD_RES>*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#10 0x00007f7761a0d429 in tesseract::Tesseract::RetryWithLanguage(tesseract::WordData const&, void (tesseract::Tesseract::*)(tesseract::WordData const&, WERD_RES**, tesseract::PointerVector<WERD_RES>*), WERD_RES**, tesseract::PointerVector<WERD_RES>*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#11 0x00007f7761a0dc47 in tesseract::Tesseract::classify_word_and_language(int, PAGE_RES_IT*, tesseract::WordData*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#12 0x00007f7761a10fa0 in tesseract::Tesseract::RecogAllWordsPassN(int, ETEXT_DESC*, PAGE_RES_IT*, GenericVector<tesseract::WordData>*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#13 0x00007f7761a12b70 in tesseract::Tesseract::recog_all_words(PAGE_RES*, ETEXT_DESC*, TBOX const*, char const*, int) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#14 0x00007f77619fd10f in tesseract::TessBaseAPI::Recognize(ETEXT_DESC*) () at /usr/lib/x86_64-linux-gnu/libtesseract.so.3
#15 0x00007f7766655056 in cv::text::OCRTesseractImpl::run(cv::Mat&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<cv::Rect_<int>, std::allocator<cv::Rect_<int> > >*, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*, std::vector<float, std::allocator<float> >*, int) () at /usr/local/lib/libopencv_text.so.3.4
#16 0x0000560deccc8fec in cv_ocr_run(cv::Ptr<cv::text::BaseOCR>&, cv::Mat&, CDisposableString*, CVec<Rect>*, CVec<CDisposableString>*, CVec<float>*, int) (ocr=..., image=..., output_text=0x7fff0750baa8, component_rects=0x7fff0750bab0, component_texts=0x7fff0750bac0, component_confidences=0x7fff0750bad0, component_level=1) at native/opencv-wrapper.cc:673
#17 0x0000560deccc513e in cv::text::ocr::{{impl}}::run<cv::text::ocr::OcrTesseract> (self=0x7fff0750bbd8, image=0x7fff0750bc60, component_level=cv::text::ocr::ComponentLevel::TextLine) at /home/owen/.cargo/git/checkouts/cv-rs-42de8274791f3394/f4723ff/src/text/ocr.rs:136
#18 0x0000560deccc449d in balcony::run (url=...) at src/main.rs:39
#19 0x0000560deccc47eb in balcony::main () at src/main.rs:55
exe = 'target/debug/balcony data/Dark Souls 3 pt 21.mp4'

Thanks for your attention on this @Pzixel! It's a great start!

Pzixel commented 6 years ago

That's weird: it works on the playground, but it doesn't work locally. What the heck )

onelson commented 6 years ago

I bet it's my usage. I bet they are being freed since I was not holding on to them outside of the parameter list.

Pzixel commented 6 years ago

No, it isn't. I found the error, though I don't know why it happens: https://play.rust-lang.org/?gist=f4aa2cff45ab20fc060f20eaae6ce4b2&version=stable .

However, it doesn't explain your sudden panics; they shouldn't happen.
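
The gist itself isn't reproduced in this thread, so the following does not describe the actual bug. It is only a sketch of the general concern raised above (strings being freed while native code still holds pointers to them): one way to pass optional strings over FFI is to keep the `CString` values in a struct that outlives the call, and hand out raw pointers only from that owner. `CArgs` and `describe_language` are hypothetical names, and `describe_language` stands in for a native constructor.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::ptr;

/// Owns the optional C strings for as long as their raw pointers are in use.
struct CArgs {
    data_path: Option<CString>,
    language: Option<CString>,
}

impl CArgs {
    fn new(data_path: Option<&str>, language: Option<&str>) -> Self {
        let to_c = |s: &str| CString::new(s).expect("argument contains a NUL byte");
        CArgs {
            data_path: data_path.map(to_c),
            language: language.map(to_c),
        }
    }

    /// Null for `None`; otherwise a pointer valid while `self` is alive.
    fn data_path_ptr(&self) -> *const c_char {
        self.data_path.as_ref().map_or(ptr::null(), |s| s.as_ptr())
    }

    /// Null for `None`; otherwise a pointer valid while `self` is alive.
    fn language_ptr(&self) -> *const c_char {
        self.language.as_ref().map_or(ptr::null(), |s| s.as_ptr())
    }
}

/// Hypothetical stand-in for a native constructor: reads the language back.
fn describe_language(language: *const c_char) -> String {
    if language.is_null() {
        return "<default>".to_string();
    }
    unsafe { CStr::from_ptr(language) }
        .to_string_lossy()
        .into_owned()
}

fn main() {
    // `args` stays alive until the end of `main`, so both pointers are valid.
    let args = CArgs::new(Some("/usr/share/tesseract-ocr"), Some("eng"));
    assert!(!args.data_path_ptr().is_null());
    println!("{}", describe_language(args.language_ptr()));
}
```

By contrast, calling `.as_ptr()` on a temporary `CString` that is dropped at the end of the expression leaves a dangling pointer, which the compiler does not reject because raw pointers carry no lifetime.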

Pzixel commented 6 years ago

@onelson now tesseract should init correctly.

onelson commented 6 years ago

It does indeed. I was able to specify lang without specifying a data dir (which wasn't possible before). Nice!

Pzixel commented 6 years ago

Sounds good.

I also implemented HMM. It seems to be working, except it actually doesn't: it runs but doesn't return the expected results. @onelson you seem to be fairly experienced in this area; can you take a look at the test and say why it could return the wrong value? https://github.com/Pzixel/cv-rs/blob/feature/ocr/tests/test_text.rs#L41

Actually, it returns "m" for any image. No idea why it's happening.

Pzixel commented 6 years ago

The Rust code is fine; the equivalent C++ code shows exactly the same behaviour:

std::string image_path = "assets/ubuntu.png";
std::string filename = "assets/OCRHMM_transitions_table.xml";
auto image_source = cv::imread(image_path);
cv::Mat image;
cv::cvtColor(image_source, image, cv::COLOR_BGR2GRAY); // was the magic number 6
cv::Mat transition_p;
cv::FileStorage fs(filename, cv::FileStorage::READ);
fs["transition_probabilities"] >> transition_p;
fs.release();
cv::Mat emission_p = cv::Mat::eye(62,62,CV_64FC1);
std::string voc = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
auto classifier = cv::text::loadOCRHMMClassifierNM("assets/OCRHMM_knn_model_data.xml.gz");
auto ocr = cv::text::OCRHMMDecoder::create(classifier, voc, transition_p, emission_p);

std::string output;
std::vector<cv::Rect> boxes;
std::vector<std::string> words;
std::vector<float> confidences;
ocr.get()->run(image, output, &boxes, &words, &confidences);
std::cout << output;

onelson commented 6 years ago

> Actually, it returns m for any image. No idea why it's happening.

Same. I've seen similar in my tests. Sometimes if the frame has no glyphs in it, it'll report back with "m" or "v" and I'm not sure why.

Pzixel commented 6 years ago

https://github.com/opencv/opencv_contrib/issues/1557

Pzixel commented 6 years ago

@onelson well, I implemented Tesseract, HMM and HolisticWord. They all seem to be fully working, except that I didn't check that HolisticWord works in practice, because it requires a multi-gigabyte database that I can't download at the moment. HMM works strangely, as you know, but we can't fix that, because we are just a proxy to the underlying C++ code. The only thing I can definitely say works as expected is Tesseract: both the Word and TextLine recognition modes work great. The others should work too, at least with the same results as when called from C++ code.

I didn't implement beam decoder because I didn't find it in recent examples, so it's probably deprecated or something.

onelson commented 6 years ago

Sounds good. I'm still seeing segfaults after a few seconds of processing frames of video, but I haven't taken the time to see if I'd get the same result using C++ directly or not (I don't have much C++ experience, so it's scary).

Thanks for doing this!

Pzixel commented 6 years ago

Do you still have issues with the latest code in the branch? There were some issues with segfaults, but I fixed them and they shouldn't be a problem anymore. Rust's memory guarantees mean there shouldn't be anything like segfaults under load, so that scares me a bit.

And yep, you don't need to close the issue yourself, because it will be closed automatically when the PR is merged into the master branch.
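
A note on the memory guarantees mentioned above: they apply to safe Rust, while calls across the FFI boundary are `unsafe`, so a wrapper has to uphold them by hand. One common way to contain that is an owning wrapper that frees a native allocation exactly once in `Drop`. The sketch below assumes hypothetical names (`native_make_string`, `native_free_string`, `DisposableString`); the "native" functions are written in Rust so that the example runs on its own, and it is not taken from the cv-rs sources.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical native allocator; in a real binding this would be an
/// `extern "C"` function implemented on the C++ side.
fn native_make_string(text: &str) -> *mut c_char {
    CString::new(text)
        .expect("text contains a NUL byte")
        .into_raw()
}

/// Hypothetical native deallocator matching `native_make_string`.
unsafe fn native_free_string(ptr: *mut c_char) {
    // Reclaim the allocation so it is released by the same allocator.
    drop(unsafe { CString::from_raw(ptr) });
}

/// Owns a string allocated by the native side and frees it exactly once.
struct DisposableString {
    ptr: *mut c_char,
}

impl DisposableString {
    /// Copies the native string into an owned Rust `String`.
    fn as_string(&self) -> String {
        if self.ptr.is_null() {
            return String::new();
        }
        unsafe { CStr::from_ptr(self.ptr) }
            .to_string_lossy()
            .into_owned()
    }
}

impl Drop for DisposableString {
    fn drop(&mut self) {
        // A null pointer means there is nothing to free.
        if !self.ptr.is_null() {
            unsafe { native_free_string(self.ptr) };
        }
    }
}

fn main() {
    let text = DisposableString { ptr: native_make_string("Hello") };
    println!("{}", text.as_string());
    // `text` is dropped here and the native allocation is released.
}
```

With this shape, the only `unsafe` code sits inside the wrapper, and callers cannot forget to free the string or free it twice.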

onelson commented 6 years ago

Ah! I tested last night (mmm, 16-ish hours ago). I see you have committed since. I'll try it again shortly and let you know!

onelson commented 6 years ago

@Pzixel A+! :confetti_ball: Watched it process several minutes of footage, and didn't quit until I asked it to.