Constrat opened this issue 2 months ago
The save_img method from VisionHelper.cpp is not enough: as I understand it, it just takes a screencap of the entire screen the instant it's called, so calling it like:
```cpp
save_img(utils::path("debug") / utils::path("other"));
// or
name_analyzer.save_img(utils::path("debug") / utils::path("other"));
```
won't change anything.
(Also this happens when using debug x64 and the copilot tab)
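If the goal is just to inspect the exact crop an analyzer works on, a small helper along these lines might be more useful than a full-screen save_img. This is only a sketch, not existing MAA code; m_image is the analyzer member seen later in this thread, everything else is illustrative:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <string>

// Hypothetical debug helper: write only the ROI the analyzer is looking at,
// instead of a screencap of the whole screen.
inline void save_roi_debug(const cv::Mat& image, const cv::Rect& roi, const std::string& path)
{
    // Clamp the ROI to the image bounds before cropping, then save the crop.
    const cv::Rect clamped = roi & cv::Rect(0, 0, image.cols, image.rows);
    cv::imwrite(path, image(clamped));
}
```

Called as `save_roi_debug(m_image, roi, "debug/other.png")` right where the analyzer runs, it would show exactly what the OCR ends up seeing.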
I FINALLY DID IT
@ajk992 As you can see MAA can't really see a lot lol
Later update: I don't think we can actually reach a useful solution. Since MAA downscales images to 720p, there's a lot of antialiasing which should help but in this case it's actually a pain in the ass as the binarization completely fucks it up.
Need to look at alternative thresholding methods, perhaps adaptive instead of binary.
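For context, a minimal OpenCV sketch (not MAA code) of the difference between a global binary threshold and an adaptive one; the 140 cut-off and the 11 / 2 block size and constant are just placeholder values:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

cv::Mat binarize(const cv::Mat& gray, bool adaptive)
{
    cv::Mat bin;
    if (adaptive) {
        // Threshold chosen per pixel from the local neighborhood mean,
        // which is less sensitive to the antialiasing left by the downscale.
        cv::adaptiveThreshold(gray, bin, 255, cv::ADAPTIVE_THRESH_MEAN_C,
                              cv::THRESH_BINARY, 11, 2);
    }
    else {
        // Current-style global threshold: one fixed cut-off for the whole ROI.
        cv::threshold(gray, bin, 140, 255, cv::THRESH_BINARY);
    }
    return bin;
}
```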
cc @zzyyyl, since Misteo doesn't seem to be around as much as they used to be, and considering these are pretty deep MAA operations, I'll try to have a look and see if I can somehow implement an alternative to the current binarization, which seems pretty limited (btw all the special params (threshold and trimming) are completely useless most of the time, as use_raw is set to true by default).
I'm very open to suggestions as this seems a pretty big and important "sub project"
EDIT: nvm, the use_raw is not necessarily needed as the params are still passed to the ocr_analyzer somehow:
```cpp
auto config = m_params;
config.without_det = true;
ocr_analyzer.set_params(std::move(config));
```
This makes me ask why this is here in the first place. If we WANT to use RAW, we still use m_params as shown above, but if we go with the modified image, we basically apply the parameters twice(?)
```cpp
OCRer ocr_analyzer;
if (m_use_raw) {
    // use_raw: run the OCR directly on the original image / ROI
    ocr_analyzer = OCRer(m_image, new_roi);
}
else {
    // otherwise: replicate the single-channel binarized image into 3 channels
    // and run the OCR on that instead
    cv::Mat bin3;
    std::array arr_bin3 { bin, bin, bin };
    cv::merge(arr_bin3, bin3);
    ocr_analyzer = OCRer(bin3, bounding_rect);
}
```
The rabbit hole is too deep, and I surrender. Just for fun at this point I've been trying denoising and sharpening, but the image is too small; the biggest problem is the huge drop in resolution, and that can't be fixed as there's literally no space in the image itself. Sure, it's a bit noticeable, but OCR-wise the result doesn't change a bit, and the binarization itself is too aggressive (because of the antialiasing generated by the downscale). Before / After screenshots:
```cpp
// Apply denoising
cv::Mat denoised_image;
cv::fastNlMeansDenoisingColored(img_roi, denoised_image, 3, 3, 7, 21);
cv::imwrite(debug_path + "2.png", denoised_image);

// Convert to grayscale
cv::Mat img_roi_gray;
cv::cvtColor(denoised_image, img_roi_gray, cv::COLOR_BGR2GRAY);
cv::imwrite(debug_path + "3.png", img_roi_gray);

// Apply sharpening (unsharp mask: 1.3 * gray - 0.3 * blurred)
cv::Mat bin;
cv::GaussianBlur(img_roi_gray, bin, cv::Size(0, 0), 1.5);
cv::addWeighted(img_roi_gray, 1.3, bin, -0.3, 0, bin);
cv::imwrite(debug_path + "4.png", bin);
```
After a lot of trial and error and automatically colouring away the Rhodes Island logo, the current structure of MAA does not allow OCR that is good enough for small text while also detecting the 3 stars.
Unfortunately MAA reads all kinds of `.`, `-` and other stray symbols, the OCR result stays the same no matter what, and the stupid interface is laid out in literally the worst possible way, preventing a solution, at least on my end.
Some more OCR retraining might even be necessary, but at this point I don't really see any other solutions. The first problem is the downscale applied to the image itself, which hurts especially in an OCR context.
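One cheap idea, separate from retraining: post-filter the recognized text so stray punctuation never reaches the matching step. A rough sketch, purely hypothetical and not existing MAA code:

```cpp
#include <regex>
#include <string>

// Strip stray symbols the OCR tends to emit ('.', '-', etc.) before matching
// operator names. Post-processing idea only; the symbol list is a guess.
std::string strip_ocr_noise(const std::string& text)
{
    static const std::regex noise(R"([.\-_,'`|]+)");
    return std::regex_replace(text, noise, "");
}
```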
I think an effective solution to this problem would be to OCR the original image, instead of scaling it to 720p and then recognizing it.
But that seems to involve such deep changes that it's not really feasible :(
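For what it's worth, the idea could look roughly like this: keep the ROI defined against the 720p layout, but crop from the raw frame by scaling the rectangle back up, so the text never goes through the downscale before OCR. A sketch under the assumption that the raw frame is still available at that point; names are illustrative:

```cpp
#include <opencv2/core.hpp>

// Map a ROI defined in 1280x720 coordinates back onto the full-resolution
// frame and crop there, instead of cropping the downscaled image.
cv::Mat crop_from_raw(const cv::Mat& raw_frame, const cv::Rect& roi_720p)
{
    const double sx = static_cast<double>(raw_frame.cols) / 1280.0;
    const double sy = static_cast<double>(raw_frame.rows) / 720.0;
    cv::Rect scaled(static_cast<int>(roi_720p.x * sx),
                    static_cast<int>(roi_720p.y * sy),
                    static_cast<int>(roi_720p.width * sx),
                    static_cast<int>(roi_720p.height * sy));
    scaled &= cv::Rect(0, 0, raw_frame.cols, raw_frame.rows);
    return raw_frame(scaled);
}
```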
I did some tests by tweaking the steps / thresholds etc., but the issue is the 3-star Rhodes Island logo. I was never able to remove it while also improving the OCR itself. I can remove the Rhodes Island logo, but with too many compromises: it removes too many white pixels.
Adaptive thresholding is good, but the text is too small and has no spaces, so it doesn't work greatly. This was the last try I did, with these steps: denoise, grayscale, adaptive, contrast, morphological.
The adaptive definitely works, but it's actually too good, as it also picks up the RHODES ISLAND logo.
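For reference, roughly what that pipeline looks like in OpenCV; the placement of the contrast step and all parameter values are guesses, not the exact ones used above:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/photo.hpp>

// Rough reconstruction of the "denoise, grayscale, adaptive, contrast,
// morphological" pipeline; every parameter here is a placeholder.
cv::Mat adaptive_pipeline(const cv::Mat& roi_bgr)
{
    cv::Mat denoised, gray, bin;
    cv::fastNlMeansDenoisingColored(roi_bgr, denoised, 3, 3, 7, 21);
    cv::cvtColor(denoised, gray, cv::COLOR_BGR2GRAY);

    // Stretch contrast before thresholding so faint glyph edges survive.
    cv::normalize(gray, gray, 0, 255, cv::NORM_MINMAX);

    cv::adaptiveThreshold(gray, bin, 255, cv::ADAPTIVE_THRESH_MEAN_C,
                          cv::THRESH_BINARY, 11, 2);

    // Light morphological open to drop isolated specks the threshold lets through.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(2, 2));
    cv::morphologyEx(bin, bin, cv::MORPH_OPEN, kernel);
    return bin;
}
```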
The Rhodes Island logo will disappear with the current mask [140, 255].
I was trying a lot of different algorithms, so I was changing everything. Mask only unfortunately doesn't seem to cut it. The Chinese text is much bigger than the English :P. Guess we'll have to go with what we have for now.
I mean, we can do a mask after all the algorithms to hide the Rhodes Island logo :D
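Something like this, as a sketch: once the binarized image is ready, blank out the fixed region where the watermark sits. The rectangle is a placeholder, not the real logo position:

```cpp
#include <opencv2/core.hpp>

// Zero out a fixed rectangle (e.g. where the Rhodes Island logo sits) after
// all other processing, so the OCR never sees it.
void mask_logo(cv::Mat& bin, const cv::Rect& logo_rect)
{
    const cv::Rect clamped = logo_rect & cv::Rect(0, 0, bin.cols, bin.rows);
    bin(clamped).setTo(cv::Scalar(0));
}
```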
Threshold the threshold? Wouldn't that do the same thing, completely killing the antialiasing? I think that's the main reason it sometimes fails. Especially for the English one, I don't think a pure black-and-white threshold cuts it. It's good for short names and bad for long names, as the characters get mushed together.
I got: (result screenshots)
Looks pretty good, but I'm not sure about the Alters. If you have a branch, I would like to check it out.
I don't think the OCR likes the way there's no space in between the letters (the antialiasing gets removed).
I used opencv-python:
```python
import cv2
import matplotlib.pyplot as plt
from pathlib import Path

maa_dir = Path(".")  # adjust to the MAA repository root

image = cv2.imread(str(maa_dir / "test" / "dist" / "formation-test-2.png"))  # the first screenshot in #10009
image = cv2.resize(image, (1280, 720), interpolation=cv2.INTER_AREA)
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
msk = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
image = cv2.bitwise_and(image, image, mask=msk)
image = cv2.fastNlMeansDenoisingColored(image, None, 10, 10, 7, 21)
# unsharp-mask sharpening: blend the image with a negative-weighted blur
img_bin = cv2.GaussianBlur(image, (0, 0), 1.5)
cv2.addWeighted(image, 1.3, img_bin, -0.3, 0, img_bin)
msk = cv2.inRange(image, (60, 60, 60), (255, 255, 255))
image = cv2.bitwise_and(image, image, mask=msk)
cv2.imwrite("debug-save.png", image)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()
```
It's identical to the C++ OpenCV, yes?
It's a C++ wrapper :)
What problems have you encountered?
English OCR needs a different set of values. Currently using `160, 2, 7, 0` instead of `140, 2, 0, 0`. The current set has some downsides, as the top row of operators can't be recognized in Auto Squad, ref #10009 (and many other issues by me).

Logbook:
- #5004
- #5009
- #5235
- #5549
- #5745

I was wondering if there was a way to bypass MAA's tasks and just give OpenCV some image files to see what the best combination of values would be. I tried reading a bit of the documentation, but the operations done on images are stupidly mathematical (makes me remember I still have some exams to take at uni) and there are no examples at all.
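For experimenting outside MAA's task flow, something like the standalone sweep below might do: it loads a saved screenshot (e.g. formation-test-2.png mentioned above) and writes out one binarized image per parameter combination. It assumes the threshold is a grayscale inRange cut-off and that "bin expansion" behaves like a dilation; both are guesses at what OCRerConfig does, not a copy of MAA's implementation:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <string>

int main()
{
    cv::Mat img = cv::imread("formation-test-2.png");
    if (img.empty()) {
        return 1; // screenshot not found
    }
    cv::Mat gray;
    cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);

    // Sweep threshold and "expansion" values, saving each result for comparison.
    for (int threshold = 120; threshold <= 180; threshold += 10) {
        for (int expansion = 0; expansion <= 3; ++expansion) {
            cv::Mat bin;
            cv::inRange(gray, cv::Scalar(threshold), cv::Scalar(255), bin);
            if (expansion > 0) {
                cv::Mat kernel = cv::getStructuringElement(
                    cv::MORPH_RECT, cv::Size(expansion * 2 + 1, expansion * 2 + 1));
                cv::dilate(bin, bin, kernel);
            }
            cv::imwrite("sweep_" + std::to_string(threshold) + "_" +
                        std::to_string(expansion) + ".png", bin);
        }
    }
    return 0;
}
```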
Question: is there a way to use `OCRerConfig::set_bin_threshold` etc. directly (or at least extract what MAA "sees")? Because currently I'm literally just throwing random numbers around the "default" values, and the results seem quite random; there's definitely a correlation between the binarization threshold and the binary expansion, but I just can't see it, which makes it nearly impossible.
https://github.com/MaaAssistantArknights/MaaAssistantArknights/blob/58f44d8fb83a111248a80a7d639cb6d98656348c/src/MaaCore/Vision/Config/OCRerConfig.h#L22-L26