LuteOrg / lute-v3

LUTE = Learning Using Texts: learn languages through reading. Python/Flask.
MIT License

Add .mobi support #306

Open Adriann25 opened 4 months ago

Adriann25 commented 4 months ago

Aside from .epub and .pdf, .mobi is also a popular and more compact format for e-books.

It would be great if .mobi were supported as an alternative to .epub: quite a few books do not work in .epub, and converting them can have unwanted effects, especially for languages that use ruby characters.

One Python library that supports unencrypted .mobi files is https://pypi.org/project/mobi/

Alternatively, there is this project, which is somewhat older: https://pypi.org/project/mobi-python/
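
For anyone evaluating these libraries, it may help to confirm that a file really is a MOBI container before handing it to a parser. Below is a minimal sketch, assuming the usual Palm Database layout in which the 8 bytes at offset 60 (the type and creator fields) read `BOOKMOBI`. The name `looks_like_mobi` is hypothetical and is not part of either library or of Lute. The check says nothing about whether the file is encrypted.

```python
from pathlib import Path

# Palm Database header: the 4-byte type and 4-byte creator fields start at offset 60.
PDB_TYPE_OFFSET = 60
MOBI_SIGNATURE = b"BOOKMOBI"


def looks_like_mobi(path):
    """Return True if the file carries the BOOKMOBI type/creator signature."""
    with Path(path).open("rb") as f:
        f.seek(PDB_TYPE_OFFSET)
        # A file shorter than the header yields fewer bytes, so the comparison fails.
        return f.read(len(MOBI_SIGNATURE)) == MOBI_SIGNATURE
```

This only inspects the header, so it is a cheap pre-check and not a substitute for actually parsing the book.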

jzohrab commented 4 months ago

Hi @Adriann25 -- can you provide a very small .mobi file we could use as a test case? Like a file with a single sentence.

jzohrab commented 4 months ago

Dev size est:

jzohrab commented 4 months ago

Need to find a Python .mobi parser with an MIT license; the library linked in the issue is GPL.

Adriann25 commented 4 months ago

> Need to find a Python .mobi parser with an MIT license; the library linked in the issue is GPL.

Hello. Have you checked the alternative link I provided? According to its description, it should be MIT-licensed: https://pypi.org/project/mobi-python/

There are also several forks to look into if the base one is not working properly: https://github.com/kroo/mobi-python/network/members

jzohrab commented 3 months ago

Merged into develop, thank you @imamcr 👍

jzohrab commented 3 months ago

I've backed this out of the develop branch due to concerns about the lxml library dependency this introduces. Comments in the git commit log:

commit d0267959bcd9fcc69dd278c7a26de20a81ae03b2 (HEAD -> develop, origin/develop)
Merge: 9fc5af8 bc552b1
Author: Jeff Zohrab <jzohrab@gmail.com>
Date:   Fri Mar 15 11:20:57 2024 +0700

    Merge branch 'remove_mobi_support' into develop

    Reverts the commits of PR 338.  That PR introduced a dependency on lxml,
    which I don't understand enough about to make a good decision.
    Pip install of lxml works on my mac, but it was slow, and so am
    concerned about issues with other users' systems.

I backed it out by reverting the individual commits:

commit bc552b139285df9f28b4221afbc2bf632cb24fa8 (remove_mobi_support)
    Revert "Update requirements.txt and pyproject.toml"
    This reverts commit 60b0db7198b0e728d216d306b85706fbd2382c09.

commit 83fa4812474d7c9b5e9698fd0de49b80d558da13
    Revert "Add tests"
    This reverts commit 6267243bd17ea7017164f71ffc34deefbeac8e26.

So this will go back into the "to-do" part of the backlog. :-/

In the meantime, I think that users can at least convert their mobi files to text or epub or similar using online services.

jzohrab commented 3 months ago

On hold, low priority. I still have to find a decent library that is MIT-licensed and doesn't pull in lxml (that's my preference; I have read too many posts about it being problematic).
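
The two criteria above (MIT license, no lxml) could be screened roughly as follows. This is a sketch only, assuming the candidate library is already installed in a scratch virtualenv; `requirement_names` and `screen_candidate` are hypothetical helper names, not part of Lute. It looks only at direct, declared requirements, so a transitive lxml dependency would not show up, and some packages declare their license only in classifiers, in which case the `License` field may be empty.

```python
import re
from importlib import metadata


def requirement_names(requirements):
    """Extract lowercase project names from strings like 'lxml>=4.0; extra == "x"'."""
    names = set()
    for req in requirements or []:
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match:
            names.add(match.group(1).lower())
    return names


def screen_candidate(dist_name):
    """Report the declared license and whether lxml is a direct declared dependency."""
    meta = metadata.metadata(dist_name)  # raises PackageNotFoundError if not installed
    direct = requirement_names(metadata.requires(dist_name))
    return {"license": meta.get("License"), "requires_lxml": "lxml" in direct}
```

For example, calling `screen_candidate` with the distribution name of an installed candidate returns a small dict that can be compared across libraries.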