MaaAssistantArknights / MaaAssistantArknights

Arknights assistant, all daily tasks in one click! | A one-click tool for the daily tasks of Arknights, supporting all clients.
https://maa.plus
GNU Affero General Public License v3.0

[OpenCV] YostarEN Auto Squad fix #10048

Open Constrat opened 2 months ago

Constrat commented 2 months ago

The problems you have encountered?

English OCR needs a different set of values. We currently use 160, 2, 7, 0 instead of 140, 2, 0, 0. The current set has a downside: the top row of operators can't be recognized in Auto Squad (ref #10009, and many other issues by me).

Logbook - #5004 - #5009 - #5235 - #5549 - #5745

I was wondering if there was a way to bypass MAA's tasks and just give OpenCV some image files to see what the best combination of values would be. I tried reading a bit of the documentation, but the operations done on images are stupidly mathematical (it reminds me I still have some exams to take at uni) and there are no examples at all.

Question: is there a way to use OCRerConfig::set_bin_threshold etc. directly (or at least extract what MAA "sees")? Currently I'm literally just throwing random numbers around the "default" values, and the results seem quite random. There's definitely a correlation between the binarization threshold and the binary expansion, but I just can't see it, which makes tuning nearly impossible.

https://github.com/MaaAssistantArknights/MaaAssistantArknights/blob/58f44d8fb83a111248a80a7d639cb6d98656348c/src/MaaCore/Vision/Config/OCRerConfig.h#L22-L26
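One way to get a feel for the threshold without running MAA at all is to sweep it over a small patch offline. This is a minimal, dependency-free sketch, assuming the binarization step follows OpenCV's THRESH_BINARY rule (a pixel survives only if it is strictly above the threshold). The pixel values and helper names are made up for illustration and are not MAA's code.

```python
def threshold_binary(gray, thresh, maxval=255):
    """THRESH_BINARY rule: keep a pixel only if it is strictly above thresh."""
    return [[maxval if px > thresh else 0 for px in row] for row in gray]


def count_foreground(binary):
    """Number of non-zero pixels left after binarization."""
    return sum(1 for row in binary for px in row if px)


# Hypothetical 3x6 grayscale patch: bright stroke cores (230),
# antialiased stroke edges (150) and a dim background (90).
patch = [
    [90, 150, 230, 150, 90, 90],
    [90, 150, 230, 230, 150, 90],
    [90, 90, 150, 230, 150, 90],
]

# Sweep a few candidate thresholds and count what survives each one.
sweep = {t: count_foreground(threshold_binary(patch, t)) for t in (100, 140, 160, 200)}
for t, n in sweep.items():
    print(f"threshold={t}: {n} foreground pixels")
```

In this toy patch the antialiased edge pixels sit at 150, so moving the threshold from 140 to 160 drops every edge pixel at once. That kind of cliff could be one reason the results look random when the numbers are nudged by hand.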

Constrat commented 2 months ago

The save_img method from VisionHelper.cpp is not enough: as I understand it, it just takes a screencap of the entire screen the instant it's called, so calling it like:

    save_img(utils::path("debug") / utils::path("other"));
    // or
    name_analyzer.save_img(utils::path("debug") / utils::path("other"));

won't change anything.

Constrat commented 2 months ago

(Also, this happens when using debug x64 and the copilot tab) [image]

Constrat commented 2 months ago

I FINALLY DID IT

Constrat commented 2 months ago

@ajk992 [image: 2024-08-03_18-20-32-826_bin_before_trim] As you can see, MAA can't really see a lot lol

Constrat commented 2 months ago

Later update: I don't think we can actually reach a useful solution. Since MAA downscales images to 720p, there's a lot of antialiasing which should help but in this case it's actually a pain in the ass as the binarization completely fucks it up.
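The interaction between downscaling and a hard threshold can be reproduced on a single row of pixels. This is a rough sketch with made-up values: the factor-2 area average stands in for the real resize to 720p (whose factor depends on the source resolution), and the helper names are hypothetical.

```python
def downscale_area(row, factor):
    """Average each group of `factor` pixels (a crude 1-D area downscale)."""
    return [sum(row[i:i + factor]) // factor for i in range(0, len(row), factor)]


def binarize(row, thresh):
    """1 where the pixel is strictly above thresh, else 0."""
    return [1 if px > thresh else 0 for px in row]


def count_runs(bits):
    """Count separate foreground runs (a stand-in for separate letter strokes)."""
    runs, prev = 0, 0
    for b in bits:
        if b and not prev:
            runs += 1
        prev = b
    return runs


# Two 3-pixel strokes separated by a 1-pixel gap at full resolution.
hi_res = [0, 255, 255, 255, 0, 255, 255, 255, 0, 0]
low_res = downscale_area(hi_res, 2)

print(low_res)                                  # gap and edges became mid-gray
print(count_runs(binarize(hi_res, 140)))        # strokes at full resolution
print(count_runs(binarize(low_res, 100)))       # low threshold on the downscaled row
print(binarize(low_res, 140))                   # high threshold on the downscaled row
```

After the downscale the gap and the stroke edges share the same gray value, so no single threshold keeps both: a low one fuses the two strokes into one blob, and a high one keeps them apart but shaves each stroke down to a single pixel.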

Need to look at alternative thresholding methods, perhaps adaptive, instead of a fixed binary threshold.

cc @zzyyyl, since Misteo doesn't appear to be online as much as before, and considering these are pretty deep MAA operations, I'll try to have a look and see if I can somehow implement an alternative to the current binarization, which seems pretty limited (btw all the special params (threshold and trimming) are completely useless most of the time, as use_raw is set to true by default)

I'm very open to suggestions as this seems a pretty big and important "sub project"

Constrat commented 2 months ago

EDIT: nvm, use_raw is not necessarily needed, as the params are still passed to the ocr_analyzer anyway:

    auto config = m_params;
    config.without_det = true;
    ocr_analyzer.set_params(std::move(config));

This makes me ask why this is here in the first place. If we WANT to use RAW, we still use m_params as shown above, but if we go with the modified image, we basically apply the parameters twice(?)

    OCRer ocr_analyzer;
    if (m_use_raw) {
        ocr_analyzer = OCRer(m_image, new_roi);
    }
    else {
        cv::Mat bin3;
        std::array arr_bin3 { bin, bin, bin };
        cv::merge(arr_bin3, bin3);
        ocr_analyzer = OCRer(bin3, bounding_rect);
    }

Constrat commented 2 months ago

The rabbit hole is too deep, and I surrender. Just for fun at this point I've been trying denoising and sharpening, but the image is too small; the biggest problem is the huge drop in resolution, and that can't be fixed as there's literally no space in the image itself. Sure, the difference is a bit noticeable, but OCR-wise the result doesn't change a bit, and image binarization itself is too aggressive (because of the antialiasing generated by the downscale). Before [image] [image] After

    // Apply denoising
    cv::Mat denoised_image;
    cv::fastNlMeansDenoisingColored(img_roi, denoised_image, 3, 3, 7, 21);
    cv::imwrite(debug_path + "2.png", denoised_image);

    // Convert to grayscale
    cv::Mat img_roi_gray;
    cv::cvtColor(denoised_image, img_roi_gray, cv::COLOR_BGR2GRAY);
    cv::imwrite(debug_path + "3.png", img_roi_gray);

    // Apply sharpening
    cv::Mat bin;
    cv::GaussianBlur(img_roi_gray, bin, cv::Size(0, 0), 1.5);
    cv::addWeighted(img_roi_gray, 1.3, bin, -0.3, 0, bin);
    cv::imwrite(debug_path + "4.png", bin);
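
The sharpening step above is an unsharp mask: 1.3 times the original minus 0.3 times a blurred copy. This 1-D sketch uses the same weights, assuming a 3-tap box blur as a simple stand-in for cv::GaussianBlur and clamping to 0..255 the way 8-bit saturation would. The sample values and function names are illustrative only.

```python
def box_blur(row):
    """3-tap mean blur with replicated edges (stand-in for a Gaussian blur)."""
    padded = [row[0]] + row + [row[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3 for i in range(len(row))]


def unsharp(row, amount=0.3):
    """(1 + amount) * original - amount * blurred, clamped to the 8-bit range."""
    blurred = box_blur(row)
    out = []
    for px, bl in zip(row, blurred):
        value = (1 + amount) * px - amount * bl
        out.append(max(0, min(255, round(value))))
    return out


# A soft step from background (60) to text (180).
edge = [60, 60, 60, 180, 180, 180]
sharp = unsharp(edge)
print(sharp)
```

The filter only exaggerates the step that is already there (a dip just before the edge and an overshoot just after it); flat regions come out unchanged. That is consistent with the observation that sharpening is visible to the eye but cannot bring back detail the downscale already removed.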

Constrat commented 2 months ago

After a lot of trial and error, including automatically coloring away the Rhodes Island logo, my conclusion is that the current structure of MAA does not allow good enough OCR for small text while also detecting 3 stars. [images: 2024-08-04_13-34-04-8511, 2024-08-04_13-34-04-8513] Unfortunately MAA reads all kinds of . - and other symbols. Since the OCR itself remains the same, and the stupid interface is made in literally the worst way possible, there's no solution, at least on my end.

Some more OCR retraining might even be necessary, but at this point I don't really see any other solutions. The first problem is the downscale applied to the image itself, especially in an OCR context.

zzyyyl commented 2 months ago

I think an effective solution to this problem would be to OCR the original image, instead of scaling it to 720p and then recognizing it.

But that seems to involve such deep changes that it's not really feasible :(

Constrat commented 2 months ago

I did some tests by tweaking the steps, threshold, etc., but the issue is the 3-star Rhodes Island logo. I was never able to remove it while improving the OCR itself. I can remove the Rhodes Island logo, but with too many compromises: it removes too many white pixels.

[images: 2024-08-04_11-35-49-0721, 2024-08-04_11-35-49-0722, 2024-08-04_11-35-49-0723, 2024-08-04_11-35-49-0724, 2024-08-04_11-35-49-0725, 2024-08-04_11-35-49-0727]

Adaptive thresholding is good, but the text is too small and too tightly spaced for it to work great. This was the last try I did, with these steps: denoise, grayscale, adaptive threshold, contrast, morphological.

The adaptive threshold definitely works, but it's actually too good, as it picks up the RHODES ISLAND
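Why an adaptive threshold picks up the faint logo while a global one does not can be shown on one row of pixels. This sketch follows the mean-based rule used by adaptive thresholding (a pixel is kept when it exceeds the local mean minus a constant C). The 3-pixel window, C = -10 and the sample values are assumptions chosen for illustration, not the settings tried in this thread.

```python
def global_threshold(row, thresh):
    """Fixed threshold: keep pixels strictly above thresh."""
    return [255 if px > thresh else 0 for px in row]


def adaptive_mean_threshold(row, block=3, c=-10):
    """Keep a pixel if it exceeds (mean of its block-sized window - c).

    Edges are handled by replicating the border pixel.
    """
    half = block // 2
    padded = [row[0]] * half + row + [row[-1]] * half
    out = []
    for i, px in enumerate(row):
        local_mean = sum(padded[i:i + block]) / block
        out.append(255 if px > local_mean - c else 0)
    return out


# Dark background (50), one faint logo pixel (70), one bright name pixel (220).
row = [50, 50, 70, 50, 50, 50, 220, 50, 50]

print(global_threshold(row, 140))      # only the bright name pixel survives
print(adaptive_mean_threshold(row))    # the faint logo pixel survives as well
```

Because the adaptive rule compares each pixel with its own neighborhood rather than with a fixed level, anything that stands out locally is kept, including low-contrast background lettering.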

zzyyyl commented 2 months ago

The Rhodes Island logo will disappear after applying the current mask [140, 255]. [image]

Constrat commented 2 months ago

I was trying a lot of different algorithms, so I was trying to change everything. Unfortunately the mask alone doesn't seem to cut it. The Chinese client's English text is much bigger :P. Guess we'll have to go with what we have for now.

zzyyyl commented 2 months ago

I mean, we can do a mask after all the algorithms to hide the Rhodes Island logo :D
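The idea of masking as a final step can be sketched as an inclusive range test followed by zeroing everything outside it. This is a grayscale, dependency-free approximation of the inRange plus bitwise_and pattern; the [140, 255] range comes from the comment above, while the pixel values and helper names are invented for the example.

```python
def in_range(gray, lo, hi):
    """255 where lo <= pixel <= hi (bounds inclusive), else 0."""
    return [[255 if lo <= px <= hi else 0 for px in row] for row in gray]


def apply_mask(gray, mask):
    """Keep pixels where the mask is set, zero the rest."""
    return [[px if m else 0 for px, m in zip(row, mrow)]
            for row, mrow in zip(gray, mask)]


# Hypothetical output of earlier processing: dim logo remnants (95, 110),
# background (30, 35) and bright operator-name pixels (220, 240).
processed = [
    [30, 95, 220],
    [110, 240, 35],
]

mask = in_range(processed, 140, 255)
cleaned = apply_mask(processed, mask)
print(cleaned)
```

Unlike binarizing, the surviving pixels keep their original gray levels, so whatever antialiasing is left above the lower bound is not flattened to pure white.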

Constrat commented 2 months ago

Threshold the threshold? Wouldn't that do the same thing, completely killing the antialiasing? I think that's the main reason it sometimes fails. Especially for the English one, I don't think a pure black-and-white threshold cuts it. It's good for short names and bad for long names, as the characters get mushed together.

zzyyyl commented 2 months ago

I got image

image

Constrat commented 2 months ago

[image] These are all the tests I was making: debug.zip

Constrat commented 2 months ago

I got image

image

Looks pretty good, but I'm not sure about the Alters. If you have a branch, I would like to check it out.

I don't think the OCR likes that there's no space between the letters (the antialiasing gets removed)

zzyyyl commented 2 months ago

I use opencv-python.

[image: debug-save]

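    from pathlib import Path

    import cv2
    import matplotlib.pyplot as plt

    maa_dir = Path(".")  # assumption: MAA checkout path, not defined in the original snippet

    # the first screenshot in #10009; str() because older OpenCV builds reject Path objects
    image = cv2.imread(str(maa_dir / "test" / "dist" / "formation-test-2.png"))
    # pass interpolation by keyword; positionally, INTER_AREA would land in the fy slot
    image = cv2.resize(image, (1280, 720), interpolation=cv2.INTER_AREA)

    # adaptive mean threshold, used as a mask on the color image
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    msk = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
    image = cv2.bitwise_and(image, image, mask=msk)

    # keywords again: the second positional argument of the Python binding is dst, not h
    image = cv2.fastNlMeansDenoisingColored(image, h=10, hColor=10, templateWindowSize=7, searchWindowSize=21)
    # unsharp mask; note the result is written to img_bin, while the steps below keep using `image`
    img_bin = cv2.GaussianBlur(image, (0, 0), 1.5)
    cv2.addWeighted(image, 1.3, img_bin, -0.3, 0, img_bin)

    # final mask: drop everything darker than 60 in any channel
    msk = cv2.inRange(image, (60, 60, 60), (255, 255, 255))
    image = cv2.bitwise_and(image, image, mask=msk)

    cv2.imwrite("debug-save.png", image)
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.show()

Constrat commented 2 months ago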

It's identical to cpp opencv yes? It's a C++ wrapper :)