Applying extract_jet.py to ru.dat

Caerbannog / ruzzlebot

A dictionary extractor, solver, and click simulator for Ruzzle on Android

Applying extract_jet.py to ru.dat #1

Open afarber opened 10 years ago

afarber commented 10 years ago

Hello, Martin!

I am trying (using python-2.6.6 on CentOS 6 Linux) to apply your Python script to the Russian dictionary file ru.dat, which I extracted from the RuzzleAdventure.app directory using iFunbox.

The script runs ok, but the resulting text file contains words in English letters (as if some bit has been masked off):

    ABAGUR
    ABAGURA
    ABAGURAM
    ABAGURAMI
    ABAGURAX
    ABAGURF
    ABAGURNAh
    ABAGURNODO
    ABAGURNOF
    ABAGURNOJ
    ABAGURNOM
    ABAGURNOMU
    ABAGURNOg
    ABAGURNUg
    ABAGURNdF
    ABAGURNdJ
    ABAGURNdM
    ABAGURNdMI
    ABAGURNdX
    ...and so on (1152959 lines in total)...

Do you happen to have an idea how to get proper Russian output?

Best regards

Caerbannog commented 10 years ago

When I coded it last year I noticed the same problem with Nordic languages. You should just change line 45:

    letter = chr(input_f[index * 4])

Letters are coded as a single byte in the .jet file, and I assume they are ASCII extended by a code page that depends on the selected language. For Russian you could try something like:

    letter = chr(input_f[index * 4]).decode('cp866')

Alternatively, guess the Cyrillic letters corresponding to ASCII forms such as "ABAGURNdX" and hard-code a translation table.
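To make the code page suggestion concrete, here is a minimal Python 3 sketch (not part of extract_jet.py, which targets Python 2) that decodes a single byte under three common Cyrillic code pages. The helper name `decode_everywhere` and the choice of sample bytes are assumptions made for illustration. It shows that a code page only makes a difference for bytes of 0x80 and above; bytes in the 7-bit ASCII range decode to the same character in all three.

```python
# Python 3 sketch: decode one byte under several Cyrillic code pages.
# CODECS and decode_everywhere are illustrative names, not from extract_jet.py.
CODECS = ("cp866", "cp1251", "koi8-r")


def decode_everywhere(byte_value):
    """Return {codec name: decoded character} for a single byte value."""
    raw = bytes([byte_value])
    return {codec: raw.decode(codec) for codec in CODECS}


# 0x41 is in the 7-bit ASCII range: every code page agrees on it.
print(decode_everywhere(0x41))

# 0xE0 is in the upper half: each code page assigns it a different letter.
print(decode_everywhere(0xE0))
```

So if the bytes stored in the dictionary file are all below 0x80, switching between these code pages cannot change the output, and a hand-made translation table is the remaining option.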

afarber commented 10 years ago

Thank you for your suggestion, but unfortunately "cp866", "cp1251" and "koi8-r" all produced exactly the same output file...

Caerbannog commented 10 years ago

All right, they probably use a custom mapping. Here is a dirty solution; I haven't fully tested it. Edit out_str so that it contains the letters you want.

At the top of the file:

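    import string
    # Letters as they currently appear in the output.
    in_str  = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    # Replacement letters, position by position; must be exactly as long as in_str.
    out_str = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    trans_table = string.maketrans(in_str, out_str)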

At line 45:

    letter = chr(input_f[index * 4])
    letter = string.translate(letter, trans_table)
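One caveat with the snippet above: Python 2's string.maketrans works byte for byte and requires both arguments to have the same length, so Cyrillic letters typed as UTF-8 (two bytes each) would not fit into out_str. A dict-based table avoids that. The following is a minimal Python 3 sketch of the same idea, not the script's actual code. The names `GUESSED_PAIRS` and `translate_word` are hypothetical, and the five letter pairs are an unverified guess, made by assuming that the sample word "ABAGUR" stands for the Russian word "абажур"; the real mapping has to be worked out from the full word list.

```python
# Python 3 sketch of a hard-coded translation table.
# GUESSED_PAIRS is an UNVERIFIED guess: it assumes "ABAGUR" means "абажур".
# Extend or correct it once more letters have been identified.
GUESSED_PAIRS = {
    "A": "а",
    "B": "б",
    "G": "ж",
    "U": "у",
    "R": "р",
}

# str.maketrans accepts a dict that maps single characters to replacements.
TRANS_TABLE = str.maketrans(GUESSED_PAIRS)


def translate_word(word):
    """Replace every mapped letter; unmapped letters pass through unchanged."""
    return word.translate(TRANS_TABLE)


print(translate_word("ABAGUR"))
print(translate_word("ABAGURAX"))
```

Mapping to lowercase Cyrillic is a deliberate choice for this sketch: any letter still missing from the table stays as an uppercase Latin character and is easy to spot in the output.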