Leseratte10 / acsm-calibre-plugin

Calibre plugin for ACSM->EPUB and ACSM->PDF conversion.
https://www.mobileread.com/forums/showthread.php?t=341975
GNU General Public License v3.0
600 stars 23 forks source link

Move all win32 logic to wine executable in linux ADE extraction #26

Closed melvyn2 closed 1 year ago

melvyn2 commented 2 years ago

Actions like reading the registry and serial number were being done in python even though the final decryption was done in wine. This commit moves all windows logic except architecture detection into the exe ran under wine to simplify the architecture.

There's also a commit which half-fixes the old python script, which still didn't work on my machine - it didn't detect the disk serial.

I've confirmed that the executables build and the plugin runs as expected on my machine.

Leseratte10 commented 2 years ago

Thanks for that PR. The reason why I had almost everything in Python code was that I first wanted to implement everything in Python code to extract the needed information from the WINEPREFIX. That way you could have extracted the key data from a WINEPREFIX even if it was corrupted or even if wine wasn't installed on the machine.

However, I was unable to properly get CryptUnprotectData to work in python, so that was done with a Windows executable, and I only did in C what wasn't possible in Python.

Looking at your PR, doing everything in the executable instead seems to be way cleaner (and is probably what I should have done in the first place). I'll have to check if that doesn't cause more AVs to falsely detect that executable as malicious - I've already had to modify the code a bunch of times to get fewer AVs to complain, and I'd rather have more complicated Python code instead of having the plugin marked as a virus by AVs.

I'm not sure what caused the issues with serial detection in the python code (it worked for me, and I believe I also tested this in Python 2, but I might be wrong), but if that ends up being done in the executable instead then that's not really something I need to spend time analyzing.

melvyn2 commented 2 years ago

Your code comments explained the situation pretty well and I understood why the code ended up as it did. You did note yourself that the serial sometimes wasn't detected right, in those comments, so it seems like it was a known issue. As for AV detection, VirusTotal shows no real issues so far: https://www.virustotal.com/gui/file/23d8f17d0788da109950ec1c3bbd8b6d43e9386fc379247df8278cbf9ed5217f https://www.virustotal.com/gui/file/75f6bcd0e2c66a130876c1a85099549681a9eddecbb985bd45ae6d8b7f0b7d9d Both files have 1 false positive but it doesn't look problematic as they aren't popular consumer AV's.

melvyn2 commented 1 year ago

Ooops, I did test the whole program, and it worked despite this snippet not doing it's job because the default assignment works. Fixed now. On Jul 13 2022, at 10:22 am, Florian Bach @.***> wrote:

@Leseratte10 commented on this pull request. In calibre-plugin/getEncryptionKeyLinux.py (https://github.com/Leseratte10/acsm-calibre-plugin/pull/26#discussion_r920329603):

@@ -332,27 +34,21 @@ def CryptUnprotectDataExecuteWine(wineprefix, data, entropy): if not line: break

  • stuff = re.match(r'#arch=(win32|win64)', line)
  • if (stuff):
  • winearch = stuff.groups()[0]
  • archkey = re.match(r'#arch=(win32|win64)', line)
  • if (archkey):
  • archkey = stuff.groups()[0] This still seems to access the variable with its old name, so I don't think that that's going to work. Was this a last-minute change or did you not test that part?
Leseratte10 commented 1 year ago

Thanks, but that fix seems to be incomplete. You're still accessing stuff which doesn't exist anymore since you renamed it. And the username part (see my other comment) also needs to be updated, unless I'm interpreting your code wrong.

melvyn2 commented 1 year ago

Fixed the wrong variable, but I didn't see any comments about the username? What about it needs to be changed?

Leseratte10 commented 1 year ago

Huh, I started a normal Github review for your PR and left these two comments. EDIT: I'm a dumbass and didn't properly save it that's why it didn't show up.

Copy-pasting from the review:

Where does this get the user name from? Some kind of Windows API? Then that's going to fail if you have a user "A", then authorize ADE (so the key is encrypted with user "A"), then you rename your user account to "B". The program needs to read the user name from the Windows registry, how it was during the authorization.

Or do I misunderstand what GetUserName is/does?

Leseratte10 commented 1 year ago

Thanks. Another thing I'd like to see tested is what happens if the Windows username (both for the registry key set by ADE and for the Windows API fallback) contains special characters like "脢" (extended ascii) or "投" (16-bit Unicode) or "馃憦" (32-bit unicode). I remember that it took me quite a while to get my Python code to work properly with usernames like this. I'm assuming it's probably easier in C because everything's a "char" (= byte) and C doesn't really care about Unicode or encoding, but still.

Unfortunately I don't remember how I did that test in the past. I would say just change username to "123脢投馃憦", authorize ADE, then see if the code works, but I think the Wine username is taken straight from Linux and I don't know if Linux likes emoji in usernames :P

I'll have to think about that and see if I can figure out how I did these tests when I implemented the Python-C-mixed version.

melvyn2 commented 1 year ago

Extended ascii will very likely work - the program can read and print the username if I set the adobe username key to n1茅脗, and so I think that it'll work to decrypt too (haven't tested it). However, it probably won't work with multi-byte characters/unicode, because when I try any (both 投 or 馃憦), they render as ? and ?? when printed. 123脢投馃憦 is represented as 313233ffffffca3f636c617 when I print it as hex. I'll see what happens using the W version of the apis.

melvyn2 commented 1 year ago

I don't think wide strings will work, since every character is 2 bytes and makes the decryption fail.

Leseratte10 commented 1 year ago

The Windows-native variant is implemented like this (see getEncryptionKeyWindows.py):

def GetUserNameWINAPI():
    from ctypes import windll, c_wchar_p, c_uint, POINTER, byref, create_unicode_buffer
    advapi32 = windll.advapi32
    GetUserNameW = advapi32.GetUserNameW
    GetUserNameW.argtypes = [c_wchar_p, POINTER(c_uint)]
    GetUserNameW.restype = c_uint

    buffer = create_unicode_buffer(32)
    size = c_uint(len(buffer))
    while not GetUserNameW(buffer, byref(size)):
        buffer = create_unicode_buffer(len(buffer) * 2)
        size.value = len(buffer)

    # Yes, it's actually implemented like that. Encode in UTF16 but only take the lowest byte of each character.
    return buffer.value.encode('utf-16-le')[::2]

def GetUserNameREG():
    try:
        import winreg
    except ImportError:
        import _winreg as winreg

    try: 
        DEVICE_KEY_PATH = r'Software\Adobe\Adept\Device'
        regkey = winreg.OpenKey(winreg.HKEY_CURRENT_USER, DEVICE_KEY_PATH)
        # Yes, it's actually implemented like that. Encode in UTF16 but only take the lowest byte of each character.
        userREG = winreg.QueryValueEx(regkey, 'username')[0].encode('utf-16-le')[::2]
        return userREG
    except: 
        return None

So, reading from the registry or the Windows API, encoding as UTF16 Little Endian, then only taking the lowest byte of each codepoint. I have definitely tested this with characters from the 16-bit range (like "投"), but I'm not sure if this implementation is correct for 32-bit characters like "馃憦".

EDIT: On Windows I tested this by actually renaming my user to a string that contained "投", and then authorizing ADE.

melvyn2 commented 1 year ago

This should mirror that decoding strategy now. I wasn't able to test it with anything other than my normal ASCII username, however.

Leseratte10 commented 1 year ago

Thanks for all the fixes. Now you're using sizeof(key_size) for the call to CryptUnprotectData which is going to be sizeof(DWORD) which is going to be 4 (or 8), not the actual key size - or am I missing something. It needs to be key_size - doesn't it?

melvyn2 commented 1 year ago

Nope, just me trying to fix things too quickly :/

melvyn2 commented 1 year ago

A better-fitting solution might be the zugbruecke package, which allows using ctypes in a wine environment. It hasn't updated in half a year but claims to be compatible with recent cpython and wine versions. This would potentially allow reusing the same python code that is used on native windows.

Leseratte10 commented 1 year ago

That sounds interesting. I tried to implement something like that a while ago; but I don't remember if it was Zugbr眉cke or another similar library. One issue I see with this solution would be that it requires Python to be installed in the WINEPREFIX, which may not be the case for everyone; and with all the necessary Python libraries I don't think that would be easy for everyone to get set up.

If it works properly it would be great, but I don't think it would work out-of-the-box for everyone; so the solution in this PR is probably still better.

melvyn2 commented 1 year ago

From what I can tell, the package actually sets up its own temporary python environment in the wineprefix. Unfortunately, this does mean it either has to download it on each use, or carry around a cached download which poses the same problem as distributing this binary (but larger).

Leseratte10 commented 1 year ago

Setting up its own Python environment sounds like something that could introduce even more errors. I think for now I'm going to merge this PR as-is, and take a look at Zugbr眉cke at a later time to see if that's something that could be useful.

I have just tested this PR in Wine and it works just fine. I did not explicitly test Unicode usernames in Wine as that would require me to rename my Linux user (or set up a whole new VM with Linux & Wine), but I'm assuming that if that part works on Windows it'll work on Wine, too.

I'm going to merge this PR and include it in 0.0.16 then. Thanks!

s-m-e commented 1 year ago

Hi, developer of zugbruecke here. I branched out the stuff that you were interested in, Python on Wine, into a separate package called wenv. zugbruecke delegates the management of Python on Wine to wenv almost entirely. wenv creates its own WINEPREFIX at a location of the user's choice, by default underneath share within the currently activated (Unix) Python virtual environment. This is not temporary - it is merely created on first use if the user does not command its creation explicitly. It is by default intentionally separated from any other WINEPREFIX the user might have configured. As part of the first-use setup, wenv also downloads a Windows build of Python and puts it into this dedicated WINEPREFIX. The download can be skipped if the package is pointed to a locally cached version. Since there is now a second Python interpreter, this time for Windows, wenv must keep things (binaries, packages, etc) for Unix and Windows/Wine separate. The is were the Windows Python environment underneath the WINEPREFIX kicks in. From a developer's perspective, the Unix Python interpreter can not see anything of what the Python interpreter on Wine is doing. The Python interpreter on Wine is configured with a simple pth file telling it where to find its own packages, yet also unable to see what the Unix Python interpreter is doing. From a user's perspective, if you run commands like python or pip, you are dealing with the Unix side of things. If you run commands prefixed with wenv, i.e. wenv python or wenv pip, you are dealing with the Windows/Wine side (i.e. Python for Windows and its own packages).

Leseratte10 commented 1 year ago

Thanks for your comment. Is there a way, or is it even a good idea, to have wenv use the user's already existing WINEPREFIX rather than having it create a new one?

The reason why we want to run Windows Python code on Linux through WINE for this plugin is to extract data from another Windows application running through WINE that lives in the user's WINEprefix. If wenv really needs its own prefix then that wouldn't work.

In the documentation I found the config parameter wineprefix to change wenv's wine prefix path, but I'm not sure if that can be used to make it use an existing wineprefix or if it's only intended to move the custom wineprefix to another location.

Also, it looks like wenv requires Python >= 3.7, that'd be old enough for Calibre 5.0 but it would mean I'd need to drop support for Calibre 4 and below. Wenv looks like a pretty cool think and I might use it for other projects, but for this plugin I think I'm going to stick with what's been implemented in this PR - a single EXE file that does all the key extraction so there's no need to run any Python through WINE.

s-m-e commented 1 year ago

Is there a way, or is it even a good idea, to have wenv use the user's already existing WINEPREFIX rather than having it create a new one? [...] In the documentation I found the config parameter wineprefix to change wenv's wine prefix path, but I'm not sure if that can be used to make it use an existing wineprefix or if it's only intended to move the custom wineprefix to another location.

It is perfectly possible. The default behavior of creating its own prefix is merely a safety precaution - not knowing what the user might have messed up in the existing prefix.

Also, it looks like wenv requires Python >= 3.7 [...]

This is mainly imposed by CI/testing and issues with TLS/SSL/libssl. Older versions of Python want older versions of libssl causing problems due to outdated certificates amongst other issues. The common workaround is to use certifi or to simply not request any stuff via HTTP(s). I have done this before. As far as the code goes, it should work down to Python 3.5. It is syntactically incompatible to 3.4.