apprenticeharper / DeDRM_tools

DeDRM tools for ebooks

Python3 support #979

Open norbusan opened 4 years ago

norbusan commented 4 years ago

Kovid Goyal announced that Calibre will switch to Python3 in the near future, and plugins need to be adjusted to work with Python 3: https://www.mobileread.com/forums/showthread.php?t=325721

Are there any plans or activities to port this plugin to Python 3?

j-howell commented 4 years ago

I see you have already produced a pull request related to this. (Py3 conversion #931)

I think that the bigger issue here is the lack of any involvement in the project by @apprenticeharper since last spring. There are currently more than a dozen open pull requests addressing various problems with DeDRM. I am concerned that this project may have been abandoned.

norbusan commented 4 years ago

Indeed, I realized that I had worked on that, but with the recent switch to Py3 it is getting more urgent for me, so I have restarted working on it. For now I am trying to get obok working; later I will look into the others, as far as my needs go.

j-howell commented 4 years ago

Thanks for your efforts.

norbusan commented 4 years ago

Oh how I hate Python2 -> Python3, the worst event in computing history, thanks Python devs.

ElDavoo commented 4 years ago

It's been years since Python devs started encouraging porting stuff to py3 yet people still use py2 (like the main calibre dev), so you should thank those people instead.

norbusan commented 4 years ago

I disagree, @ElDavoo, but this is not the right place to discuss this.

ElDavoo commented 4 years ago

You're right. I was originally going to write about py3 support for dedrm, but I only saw the new branch after writing the comment here lol

yeupou commented 4 years ago

Any news? Calibre based on Python 3 is now distributed in major distros, so DeDRM is no longer installable.

lalmeras commented 4 years ago

I did some work on this topic here: https://github.com/lalmeras/DeDRM_tools/tree/Python3

With this code I can perform:

The changes I made are:

I think it still needs a lot of work:

Some questions on my side to help me continue this work:

innir commented 4 years ago

Works for me for ADE files. I had to install python3-crypto and I had to remove these three lines:

https://github.com/lalmeras/DeDRM_tools/blob/Python3/dedrm_src/zipfilerugged.py#L280-L282

Running file type plugin DeDRM failed with traceback:
Traceback (most recent call last):
  File "calibre_plugins.dedrm.__init__", line 209, in ePubDecrypt
  File "calibre_plugins.dedrm.zipfix", line 146, in fix
  File "calibre_plugins.dedrm.zipfix", line 41, in __init__
  File "calibre_plugins.dedrm.zipfilerugged", line 280, in __init__
TypeError: must be str, not bytes

madhatter0 commented 4 years ago

I'm keen to test the obok plugin, and am running calibre 4.13.0 on Fedora 32. In your opinion, should it currently be ready to try?

lalmeras commented 4 years ago

I have never used Obok, and I can't test it because the Windows Kobo client is needed to use it (Linux desktop here).

I just checked my branch, and it seems that the obok_src source files are Python 3 compliant, but some byte/string related issues may remain. I think it is worth a try.

ElleKayEm commented 4 years ago

I think Obok also works with Kobo ereader devices. Haven't tried it myself.

madhatter0 commented 4 years ago

Obok certainly does work with Kobo devices, and (until the calibre upgrade) I used it that way myself. I am keen to do so again!

I've tested as follows, and please let me know if this seems stupid: I downloaded the zipfile of DeDRM_tools-Python3.zip from https://github.com/lalmeras/DeDRM_tools/tree/Python3, unpacked it, went into DeDRM_tools-Python3/obok_src/, zipped up ., launched calibre, installed plugin from the zipfile I'd just made, restarted calibre (v4.13.0).

Errors logged:

Traceback (most recent call last):
  File "/usr/lib64/calibre/calibre/gui2/ui.py", line 156, in __init__
    ac = self.init_iaction(action)
  File "/usr/lib64/calibre/calibre/gui2/ui.py", line 170, in init_iaction
    ac = action.load_actual_plugin(self)
  File "/usr/lib64/calibre/calibre/customize/__init__.py", line 613, in load_actual_plugin
    ac = getattr(importlib.import_module(mod), cls)(gui,
  File "/usr/lib64/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "/usr/lib64/calibre/calibre/customize/zipplugin.py", line 184, in load_module
    exec(compiled, mod.__dict__)
  File "calibre_plugins.obok_dedrm.action", line 24, in <module>
  File "/usr/lib64/calibre/calibre/customize/zipplugin.py", line 184, in load_module
    exec(compiled, mod.__dict__)
  File "calibre_plugins.obok_dedrm.dialogs", line 30, in <module>
  File "/usr/lib64/calibre/calibre/customize/zipplugin.py", line 184, in load_module
    exec(compiled, mod.__dict__)
  File "calibre_plugins.obok_dedrm.utilities", line 10, in <module>
ModuleNotFoundError: No module named 'StringIO'

norbusan commented 4 years ago

I have been working a bit on obok py3 support, and got it running, but the decryption does not work. The problem is bytes versus strings and one needs to be very careful, since Py3 is very picky about that.

norbusan commented 4 years ago

Concerning my changes, see the py3 branch: https://github.com/norbusan/DeDRM_tools/tree/py3

Back then it did load into Calibre and could be executed, but decryption failed:

Running Obok DeDRM v6.5.4
DEBUG:   13.7 get_device_settings - device_path= /media/norbert/KOBOeReader/
Obok v3.2.4
Copyright © 2012-2016 Physisticated et al.
/tmp/tmplnonkwo7
DEBUG:   14.1 got kobodir /media/norbert/KOBOeReader/.kobo
Error parsing Kobo plist: no legacy user key found.
/bin/sh: 1: ipconfig: not found
Trouble retrieving keys with newer obok method.
Traceback (most recent call last):
  File "calibre_plugins.obok_dedrm.action", line 120, in launchObok
  File "calibre_plugins.obok_dedrm.obok.obok", line 412, in userkeys
  File "calibre_plugins.obok_dedrm.obok.obok", line 488, in __getuserkeys
TypeError: Unicode-objects must be encoded before hashing

There are other problems (ipconfig missing on newer systems etc), though...
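
For reference, the `TypeError: Unicode-objects must be encoded before hashing` in the log above is what Python 3's `hashlib` raises when it is handed a `str`. A minimal sketch of the usual fix follows; `sha256_hex` is a hypothetical helper for illustration only, not the actual Obok key derivation, and the choice of UTF-8 is an assumption.

```python
import hashlib


def sha256_hex(value, encoding="utf-8"):
    """Hash str or bytes input (hypothetical helper, not Obok code).

    hashlib only accepts bytes-like objects in Python 3, so text is
    encoded first. The UTF-8 default is an assumption.
    """
    if isinstance(value, str):
        value = value.encode(encoding)
    return hashlib.sha256(value).hexdigest()
```

With this, `sha256_hex("abc")` and `sha256_hex(b"abc")` give the same digest, whereas passing a `str` directly to `hashlib.sha256` raises the error shown in the traceback.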

lalmeras commented 4 years ago

The error encountered by @madhatter0 is in the dedrm part and is linked to the byte/string/StringIO/cStringIO changes in Python 3.

As I have already fixed some of these issues in the Adobe DRM decryption part, I think I can rework my code to fix it.

I have already fixed this first issue; can you give it a try, @madhatter0?

If anyone provides an obok-targeted tutorial link, I can use a VM or my e-reader (if I can fix my USB connector problem) to work on this point.

I am also interested if anyone has a clue about test automation.
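
The `No module named 'StringIO'` error reported earlier arises because the Python 2 `StringIO` and `cStringIO` modules no longer exist in Python 3, where `io.StringIO` (text) and `io.BytesIO` (binary) take their place. A minimal sketch, assuming a hypothetical helper `make_buffer` that is not part of the plugin:

```python
import io


def make_buffer(initial):
    """Return an in-memory buffer matching the type of the initial data.

    Hypothetical helper: io.BytesIO for bytes, io.StringIO for str.
    Mixing the two (for example writing str to BytesIO) raises TypeError.
    """
    if isinstance(initial, bytes):
        return io.BytesIO(initial)
    return io.StringIO(initial)
```

Code that handles encrypted or compressed payloads generally wants `io.BytesIO`; `io.StringIO` is only appropriate for decoded text.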

madhatter0 commented 4 years ago

Plugin now installs and launches cleanly, button is shown in main UI. With kobo connected and button pressed, I get this error:

calibre 4.13  embedded-python: False is64bit: True
Linux-5.5.10-100.fc30.x86_64-x86_64-with-glibc2.2.5 Linux ('64bit', 'ELF')
('Linux', '5.5.10-100.fc30.x86_64', '#1 SMP Wed Mar 18 14:34:46 UTC 2020')
Python 3.8.2
Interface language: en_GB
Successfully initialized third party plugins: Obok DeDRM (6, 5, 4)
Traceback (most recent call last):
  File "calibre_plugins.obok_dedrm.action", line 96, in launchObok
  File "calibre_plugins.obok_dedrm.obok.obok", line 384, in __init__
  File "/usr/lib64/python3.8/tempfile.py", line 474, in func_wrapper
    return func(*args, **kwargs)
TypeError: a bytes-like object is required, not 'str'

I hope that's of some help. Let me know if I can test anything else.
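
The `func_wrapper` frame in `tempfile.py` suggests a write on a temporary file object. `tempfile.NamedTemporaryFile` opens its file in binary mode (`'w+b'`) by default, so Python 3 rejects `str` data with exactly this message. A minimal sketch of one way to handle it, assuming a hypothetical helper `write_temp` and UTF-8 encoding; it is not the actual Obok code path:

```python
import tempfile


def write_temp(data, encoding="utf-8"):
    """Write str or bytes to a temporary file and return the bytes read back.

    Hypothetical helper for illustration. The default mode of
    NamedTemporaryFile is 'w+b', so str input must be encoded first.
    """
    if isinstance(data, str):
        data = data.encode(encoding)
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(data)
        tmp.flush()
        tmp.seek(0)
        return tmp.read()
```

The alternative is to open the temporary file in text mode (`mode="w+"`), which is only suitable when the content really is text.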

lalmeras commented 4 years ago

Not a surprise. This is the kind of error @norbusan talked about. I know how to fix it, but I don't know how to detect such errors in advance, as I'm not fluent with dedrm internals.

At work now. I'll work on a fix this evening, and as tomorrow is a holiday, I'll try to set up an environment on a Windows VM so I can investigate other issues.

madhatter0 commented 4 years ago

Thank you so much! I'm at your disposal when you're ready.

norbusan commented 4 years ago

So yes, that is the problem. I'm only testing it with a real device, but now with all this Py3 fumbling...

lalmeras commented 4 years ago

« Good news »: it seems I solved my USB connector issue, and I can reproduce the issue with my Kobo e-reader. And I just realized that Python 3 + Windows + Kobo PC app is not really an issue, as I think that calibre is still released with Python 2 on Windows.

ElleKayEm commented 4 years ago

I believe all versions at calibre-ebook.com are still Python 2. Windows one definitely is anyway.

lalmeras commented 4 years ago

@madhatter0 can you try my updated branch? It works for me with my ebook reader (27 books imported successfully).

norbusan commented 4 years ago

@lalmeras is your version supposed to work with Kindle books? I tried it and found a lot of Python2-only code and byte/str confusion. Could you let me know what the status is, so that I can see what needs fixing? Thanks

norbusan commented 4 years ago

@lalmeras I have rebased your work onto DeDRM 6.7.0 (which brings fixes for Obok), and I can confirm that I can import my books from my eReader on Linux. Yeah!

norbusan commented 4 years ago

Here is the link to my changes: rebase your branch on top of current release, fix parts of Mobi, fix release script https://github.com/norbusan/DeDRM_tools/tree/python3-rebased

madhatter0 commented 4 years ago

I'm pleased to report that norbusan's version works, and de-DRMs the Kobo books I bought during COVID lockdown. lalmeras, would you like me to explicitly test your version also? I'm happy to do so if it will give you useful information.

lalmeras commented 4 years ago

@norbusan thanks for your rebase and your tests. I checked your branch and synced to it. I confirm that mobidedrm.py is in a bad state and needs work.

If anyone else works on it (or on another decryption process), here is my insight from my work on obok: mobidedrm.py can be called directly, as it is a « main » program. SafeUnbuffered must be fixed, as it breaks stdout output (replace isinstance(data,unicode): with isinstance(data,bytes):). This makes it possible to test decryption without the tedious calibre extension uninstall/install/test process. For mobi, an ebook file and PIDs are needed.
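
As an illustration of the stdout problem, here is a minimal Python 3 sketch of a flushing wrapper in the spirit of SafeUnbuffered. It is an assumption about how such a class could look, not the code in mobidedrm.py: rather than the one-line `isinstance` swap, it decodes any `bytes` to text before writing, because Python 3 text streams only accept `str`.

```python
import io
import sys


class SafeUnbuffered:
    """Wrap a text stream so every write is flushed immediately.

    Illustrative sketch only. The Python 2 name `unicode` does not exist
    in Python 3, so the type check is done against `bytes` instead.
    """

    def __init__(self, stream):
        self.stream = stream
        # Fall back to UTF-8 when the stream reports no encoding.
        self.encoding = getattr(stream, "encoding", None) or "utf-8"

    def write(self, data):
        if isinstance(data, bytes):
            # Text streams reject bytes, so decode before writing.
            data = data.decode(self.encoding, "replace")
        self.stream.write(data)
        self.stream.flush()

    def __getattr__(self, attr):
        # Delegate everything else to the wrapped stream.
        return getattr(self.stream, attr)
```

A command-line entry point could then install it with `sys.stdout = SafeUnbuffered(sys.stdout)` before running the decryption.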

@norbusan Did you succeed in fixing mobi?

lalmeras commented 4 years ago

@madhatter0 no need to test my version; I switched to the updated @norbusan branch myself. I can test the Adobe DRM part on my side.

norbusan commented 4 years ago

Hi @lalmeras, thanks! I didn't have more time to work on it today; I hope that I can do a bit more testing tomorrow. If you can make some of the necessary changes, that would be great.

Maybe we should work on the same branch? Either I give you write access or you give me write access to our respective dedrm repos? Or we work in our branch and sync up regularly.

I think having a working obok is a great start, and with your knowledge and a bit of work from our sides we can get the mobi/kindle working, too.

Thanks

madhatter0 commented 4 years ago

And a big thank you to you both from the peanut gallery, on behalf of all kobo users: this work is very much appreciated.

lalmeras commented 4 years ago

@norbusan I just checked your branch for both Adobe DRM and obok, and it's OK. I synced my branch to your work and invited you to my repository.

I'll check mobidedrm.py/k4mobidedrm.py for obvious fixes. I don't have any Kindle or mobi books, so I need some preparation to work on it (get an encrypted book, understand how to obtain the key, ...). I'll report any work on it here.

norbusan commented 4 years ago

@lalmeras Thanks a lot. I pushed preliminary work going through the code, testing each step and fixing bytes versus strings versus integers...

How do you use the mobi decryption on the command line?

norbusan commented 4 years ago

In particular, what about this Unbuffered you mentioned?

norbusan commented 4 years ago

With the last commit just now at least it runs through, but ends with "cannot decrypt", but not due to an exception:

===== DEBUG: bookPID =  b'EFyg6BBX' <class 'bytes'>
===== DEBUG: bookPID2 =  b'EFyg6BBXUP' <class 'bytes'>
===== DEBUG: kindlePID =  b'HEXSTR*' <class 'bytes'>
Found 2 keys to try after 0.0 seconds
Crypto Type is: 2
======= DEBUG good pids =  [b'EFyg6BBX', b'HEXSTR*']
DeDRM v6.7.0: Failed to decrypt with error: No key found in 2 keys tried.

lalmeras commented 4 years ago

SafeUnbuffered is a class (at the beginning of the file) used only when the file is launched from the command line (python mobidedrm.py <file> <pid>). If it is not fixed as proposed, errors cannot be viewed, as stdout is broken. It is not used when running as a calibre plugin.

You should obfuscate your pid in your previous messages.

On my side, I found somebody who can give me Kindle books so that I can work on the decryption process, but there is no .mobi file on his e-reader. Where are the files to retrieve on a Kindle device? What file extension should I search for?

norbusan commented 4 years ago

Thanks, I will check how to fix the SafeUnbuffered so that I can debug from the command line.

Obfuscate done, but well, it is my hardware device, you cannot do anything with the pid anyway - unless you break into my kindle account and get the books from there.

ametzler commented 4 years ago

lalmeras wrote

On my side, I found somebody who can give me Kindle books so that I can work on the decryption process, but there is no .mobi file on his e-reader. Where are the files to retrieve on a Kindle device? What file extension should I search for?

.kfx for the new less useful format, azw3 for KF8 (epubish), azw for mobi.

Newer Kindles will have .kfx for almost all books, but one can get the .azw3 version by downloading the (intended for transfer by USB) file from amazon.com.

lalmeras commented 4 years ago

Yes, I obtained an azw3 file, but it seems I cannot compute the PID from the Kindle serial number with the kindlepid.py script.

I have had no success with the following (either with python2 + master or with python3 + the updated branch):

python DeDRM_plugin/mobidedrm.py <encryptedfile> <outfile> <pid>

If someone has success with this command on python2 + master and can provide me with a file and PID, it would be a big help so I can work on a fix.

lalmeras commented 4 years ago

Thanks to @madhatter0, who provided me with working items for a python2 command-line installation, I pushed a fixed Python 3 version.

It works for the command-line invocation python3 from.azw to.mobi <pid>.

It also works as a calibre extension for this use case: configure the PID in « Mobipocket ebooks » and import the azw file into the calibre library.

Both the AlfCrypto and the pure-Python decryption implementations work.

@norbusan can you check if this update fixes your issue?

norbusan commented 4 years ago

Hi @lalmeras thanks for the fixes, but unfortunately it doesn't work in Calibre. It seems the PID computation is wrong. How do you compute the pid argument?

norbusan commented 4 years ago

OK, I have fixed it now, see my last push. I had to install Calibre Python2 and DeDRM Python2, debug what was going on there, and find what the problem was ;-) It was more bytes/str incompatibilities. Anyway, with the current code I can use the plugin in Calibre/Python3 and can remove DRM from mobi/azw3 books. As before, I think Obok also works.

Thanks!

norbusan commented 4 years ago

Maybe you could make a PR against the upstream Python3 branch so that these changes are fed back to upstream?

akaihola commented 4 years ago

It looks like apprenticeharper/master is based on the apprenticeharper/Python3 branch, but commit 92bf51bc8f201a2d5b1e8b90b8dc033606dbcfb0 seems to break things – it renames the /dedrm_src/ directory to /DeDRM_plugin/ but at the same time reverts Python 3 compatibility fixes done to .py files inside it.

Am I correct to assume that the Git history is now in a sense corrupt, and it takes some manual work to construct a good new Python 3 compatibility branch on top of master?

Also @norbusan the norbusan/python3-rebased branch you mentioned doesn't seem to exist anymore, but I see norbusan/Python3-norbert instead.

And both norbusan/master and norbusan/Python3-norbert seem to sit at the same commit as apprenticeharper/master, which does not include most of the Python 3 fixes already merged into apprenticeharper/Python3.

Is a working Python 3 compatibility branch still to be found somewhere else?

During past years, at my day job I've worked on making and keeping a few 100k lines of legacy Python code py2/3 compatible, so let me know if you've hit tricky problems, I may be able to help.

norbusan commented 4 years ago

Hi @akaihola, yes, things were in some kind of flux. Now everything is concentrated here: https://github.com/lalmeras/DeDRM_tools/tree/Python3

This branch has working DeDRM for Kindle (mobi) and Kobo, but others might not work (quite surely).

So, if you need Py3, for now use the branch above.

lalmeras commented 4 years ago

@akaihola history summary

So for now, from my point of view, lalmeras/Python3 is the up-to-date branch on this topic, and it is based on master. About the conflict you noticed, I think @norbusan already solved it when he performed his rebase.

I don't know if there is a DeDRM maintainer here? I think apprenticeharper/Python3 should be replaced by our work (lalmeras/Python3), as it is a much better baseline for any work on Python 3 compatibility issues.

For the work that remains to be done, from my point of view:

Example for what is already known:

If anyone has some advice: my problem now is that it is very time-consuming to investigate a new DeDRM service provider. As an example, I worked almost a day on mobidedrm, but most of my time was wasted trying to fix it without a working Python 2 example and searching for how to reproduce a working Python 2 case... Actually, I did not succeed (my Amazon Kindle Reader uses KFX even though I applied the advice from the DeDRM documentation), and it was @madhatter0 (thanks again) who sent me files so that I could work on the issue.

Once I got working resources for Python 2, it took me less than 30 minutes to fix the code.

For now, I choose to focus only on the Python 3 rewrite, and to work on py2/py3 compatibility once all the fixes have been found. It is good news that you may be able to help on this point, @akaihola.

On my side, I'll read all the readme, FAQ, and header comments of the repository to get a better general view of the dedrm architecture and features. Not sure when I will have time to do it.

innir commented 4 years ago

@lalmeras No, Adobe book decryption does not work for me, I just checked with your Python3 branch, I still get this error:

AlfCrypto not found. Using python PC1 implementation.
DeDRM v6.7.0: Trying to decrypt Brot backen, wie es nur noch wenige können.epub
DeDRM v6.7.0: Verifying zip archive integrity
DeDRM v6.7.0: Error 'must be str, not bytes' when checking zip archive

Running file type plugin DeDRM failed with traceback:
Traceback (most recent call last):
  File "calibre_plugins.dedrm.__init__", line 209, in ePubDecrypt
  File "calibre_plugins.dedrm.zipfix", line 146, in fix
  File "calibre_plugins.dedrm.zipfix", line 41, in __init__
  File "calibre_plugins.dedrm.zipfilerugged", line 280, in __init__
TypeError: must be str, not bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/customize/ui.py", line 172, in _run_filetype_plugins
    nfp = plugin.run(nfp) or nfp
  File "calibre_plugins.dedrm.__init__", line 642, in run
  File "calibre_plugins.dedrm.__init__", line 212, in ePubDecrypt
Exception: must be str, not bytes

Removing lines https://github.com/lalmeras/DeDRM_tools/blob/Python3/DeDRM_plugin/zipfilerugged.py#L280-L282 still fixes the problem for me.

lalmeras commented 4 years ago

@innir My bad, I misread your feedback. I'll update my branch to fix this issue.

lalmeras commented 4 years ago

@innir I just dug into your issue; it needs a bit more work to fix, and I think I understand why I did not encounter it.

I understand why removing the lines as proposed fixes your issue, but it cannot be considered a good fix, as it removes an intended filename cleanup.

Issue explained

zinfo.filename can be set with either str or bytes, as we can see in the _decodeFilename and _encodeFilenameFlags [1] methods. It seems to be related to a flag in zipinfo.flag_bits.

Behavior is inconsistent in these methods:

I think I can reproduce the issue by manipulating flag_bits for one of my EPUBs. The methods must be fixed to ensure zipinfo.filename type consistency.
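
One way to get a consistent filename type can be sketched as follows. This is a minimal illustration under stated assumptions, not the fix applied in zipfilerugged.py: `normalize_filename` is a hypothetical helper, and the cleanup is assumed to be the null-byte truncation that the standard library's `zipfile` module performs. Bit 11 (`0x800`) of the general purpose flags marks a UTF-8 filename in the ZIP format; names without it are treated here as cp437, as `zipfile` does.

```python
UTF8_FLAG = 0x800  # general purpose bit 11: filename is UTF-8 encoded


def normalize_filename(filename, flag_bits):
    """Return the filename as str, truncated at the first null byte.

    Hypothetical helper: accepts str or bytes so callers never have to
    care which type the archive parser produced.
    """
    if isinstance(filename, bytes):
        encoding = "utf-8" if flag_bits & UTF8_FLAG else "cp437"
        filename = filename.decode(encoding)
    # Assumed cleanup step: drop anything after an embedded null byte.
    null_index = filename.find("\0")
    if null_index >= 0:
        filename = filename[:null_index]
    return filename
```

Because the value is always `str` by the time `find` is called, the `must be str, not bytes` mismatch cannot occur, and the cleanup is kept rather than removed.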