Artikash / Textractor

Extracts text from video games and visual novels. Highly extensible.
GNU General Public License v3.0

Trouble with Paradox/Clausewitz Engine games #410

Open BlindGuyNW opened 3 years ago

BlindGuyNW commented 3 years ago

Hi there,

I'm trying to extract text from Paradox Interactive/Clausewitz Engine games (Crusader Kings, Stellaris, etc.). They use a lot of text for tooltips and such. I'm totally blind, and have been struggling with OCRing these games until recently.

Textractor's hooks don't seem to catch the text at all. I've tried running both as admin and as a regular user, and had more success with the former, though the hooks were not actually catching the text as it appeared in real time. I've had similarly disappointing results with all Paradox games I've tried thus far.

Any help that you could give here would be greatly appreciated. I feel like something obvious is being overlooked, possibly by me :) I tried running a text search as outlined in the FAQ, and got a ton of hits, none of which was entirely what I wanted.

I realize you're not really interested in developing this program any further, but would truly appreciate any help you could provide. The games are a lot of fun, and my situation is kind of unique.

Thanks much for your consideration.

Artikash commented 3 years ago

You can try another attempt or two at searching for hooks, but there likely isn't a good solution here. It sounds like the text is pre-rendered and not processed in any way when activating the tooltip, so there's no way to extract text as the tooltip is displayed.

BlindGuyNW commented 3 years ago

Hi,

Thanks for getting back to me. I'm trying to understand what you mean by pre-rendered in this case. The text definitely is dynamic, there are variable values in the tooltips and so forth.

I will try and fiddle with this some more. I'm baffled that the Textractor hooks only seemed to be into kernel32 functions for character conversion and such, and none into anything in the game binary itself.

Thanks for getting back to me.

On Oct 5, 2020, at 8:40 AM, Akash Mozumdar wrote:

You can try another attempt or two at searching for hooks, but there likely isn't a good solution here. It sounds like the text is pre-rendered and not processed in any way when activating the tooltip, so there's no way to extract text as the tooltip is displayed.


TamaBaka commented 3 years ago

I think the pre-rendering comes in because Textractor isn't designed to catch mouseovers. You can trigger all the mouseover events you want, but the text coming out of a hook isn't going to update unless the hooked function actually processes new text. That happens, for instance, when the text is rendered onto a graphics bitmap once and future calls just reference the cached graphic.

For your kernel32 question: the game binary stores the text strings, but it pushes the text out to the display by calling Windows functions like the kernel32 ones. Hooking those is a common and reliable method that can be automated. It works just fine for sequential text displays like the main text box of visual novels, because the same method is reused for each new grouping of text. You're having problems because your tooltips pop up unpredictably, each in its own text box. You'd basically need a hook for every single one.

Yes, if you want a single hook, you would need to look at the game binary. But then you need to start factoring in edge cases like compression, endianness, character encoding, etc. That's basically something you will have to find yourself, since Textractor was designed as a general-purpose extractor. It doesn't specialize in Paradox games.
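As a rough illustration of the encoding and endianness problem, the Python sketch below searches a byte buffer for the same word under several encodings. This is not how Textractor's own search works; `find_text`, the encoding list, and the sample buffer are all made up for the example, and it ignores compression entirely.

```python
from typing import Dict, List

# Encodings to try. This list is an assumption, not something the game is known to use.
CANDIDATE_ENCODINGS = ["utf-8", "utf-16-le", "utf-16-be"]


def find_text(buffer: bytes, needle: str) -> Dict[str, List[int]]:
    """Return the byte offsets of `needle` in `buffer` for each encoding that matches."""
    hits: Dict[str, List[int]] = {}
    for encoding in CANDIDATE_ENCODINGS:
        try:
            pattern = needle.encode(encoding)
        except UnicodeEncodeError:
            continue  # needle cannot be represented in this encoding
        offsets = []
        start = buffer.find(pattern)
        while start != -1:
            offsets.append(start)
            start = buffer.find(pattern, start + 1)
        if offsets:
            hits[encoding] = offsets
    return hits


# Hypothetical memory dump holding the same word stored two different ways.
dump = b"\x00\x01" + "Gold".encode("utf-16-le") + b"\xff\xff" + "Gold".encode("utf-8")
print(find_text(dump, "Gold"))
```

The same four letters appear as two different byte sequences, so a search that assumes only one encoding would miss one of them. A real game adds more variables on top, such as text that is compressed or assembled from pieces at display time.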