gvellut / jncep

Command-line tool to generate EPUB files for J-Novel Club pre-pub novels
GNU General Public License v3.0

Downloaded EPUB files are partially corrupted #61

Closed jzkmath closed 5 days ago

jzkmath commented 1 week ago

Describe the bug

After version 48, downloaded EPUB files contain errors that prevent Google Play Books from processing them. Newly uploaded books sit in the "processing" phase for an extended period and then ultimately fail. The current workaround is to use calibre to convert the file to MOBI and then back to EPUB, which fixes the corruption.

Expected behavior

EPUB files downloaded with jncep should process without errors when uploaded to Google Play Books.

To Reproduce

Use jncep to download an EPUB file, such as Demon Lord, Retry Volume 9 Part 2, then attempt to upload that EPUB to Google Play Books.

Debug trace

N/A

Running an epub checker I get the following information:

epubcheck Demon_Lord_Retry_Volume_9_Part_2.epub 
Validating using EPUB version 3.2 rules.
ERROR(OPF-029): Demon_Lord_Retry_Volume_9_Part_2.epub(-1,-1): The file "EPUB/i_cdn_j_novel_club_pub_img_1200_webp_01J_A_B_3681VEX0ECHF4280GB14B.jpg" does not appear to match the media type image/jpeg, as specified in the OPF file.
ERROR(PKG-021): Demon_Lord_Retry_Volume_9_Part_2.epub/EPUB/i_cdn_j_novel_club_pub_img_1200_webp_01J_A_B_3681VEX0ECHF4280GB14B.jpg(-1,-1): Corrupted image file encountered.
ERROR(OPF-029): Demon_Lord_Retry_Volume_9_Part_2.epub(-1,-1): The file "EPUB/i_cdn_j_novel_club_pub_img_1200_webp_01J_A_B_3770C0N8B6RK4EPPCP9AC.jpg" does not appear to match the media type image/jpeg, as specified in the OPF file.
ERROR(PKG-021): Demon_Lord_Retry_Volume_9_Part_2.epub/EPUB/i_cdn_j_novel_club_pub_img_1200_webp_01J_A_B_3770C0N8B6RK4EPPCP9AC.jpg(-1,-1): Corrupted image file encountered.
ERROR(OPF-029): Demon_Lord_Retry_Volume_9_Part_2.epub(-1,-1): The file "EPUB/i_cdn_j_novel_club_pub_img_1200_webp_01J_A_B_380PX4S5ZZYXRCPB7ZKEM.jpg" does not appear to match the media type image/jpeg, as specified in the OPF file.
ERROR(PKG-021): Demon_Lord_Retry_Volume_9_Part_2.epub/EPUB/i_cdn_j_novel_club_pub_img_1200_webp_01J_A_B_380PX4S5ZZYXRCPB7ZKEM.jpg(-1,-1): Corrupted image file encountered.

Check finished with errors
Messages: 0 fatals / 6 errors / 0 warnings / 0 infos

EPUBCheck completed

Given this information, I am guessing that jncep is generating EPUB files that contain WebP images instead of JPG or PNG, which is what breaks the EPUB file.
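For anyone who wants to confirm the guess above without running epubcheck, here is a minimal Python sketch that compares each image's file extension with its leading magic bytes. The function names `sniff_image_type` and `find_mismatched_images` are made up for this example and are not part of jncep; only the standard library is used, and only JPEG, PNG and WebP signatures are recognized.

```python
import posixpath
import zipfile


def sniff_image_type(data: bytes) -> str:
    """Guess an image format from its leading magic bytes (hypothetical helper)."""
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return "unknown"


# Format each extension is expected to contain.
EXPECTED_BY_EXTENSION = {
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".png": "png",
    ".webp": "webp",
}


def find_mismatched_images(epub):
    """Return (name, expected, actual) for images whose bytes disagree
    with their extension. `epub` is a path or a binary file-like object."""
    mismatches = []
    with zipfile.ZipFile(epub) as archive:
        for name in archive.namelist():
            ext = posixpath.splitext(name)[1].lower()
            expected = EXPECTED_BY_EXTENSION.get(ext)
            if expected is None:
                continue  # not an image type this sketch knows about
            with archive.open(name) as fh:
                actual = sniff_image_type(fh.read(12))
            if actual != expected:
                mismatches.append((name, expected, actual))
    return mismatches
```

Calling `find_mismatched_images("Demon_Lord_Retry_Volume_9_Part_2.epub")` on an affected file would be expected to list the same three `.jpg` entries that epubcheck flags, each reported as `webp`.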


gvellut commented 1 week ago

Thank you for the bug report.

I think you are right: the image has a .jpg extension but is now actually WebP (as with https://cdn.j-novel.club/pub/img/1200/webp/01J/A/D/21WRCKSKBFEVTGR3GCF43.jpg). So I think a conversion will be needed (it seems my Kobo cannot display them, and the same is possibly true for other devices).

gvellut commented 1 week ago

There is a link to some documentation about the image URLs: https://github.com/gvellut/jncep/issues/54#issuecomment-2408226450. It seems one can use a crafted URL to get the image in JPEG format instead of WebP. For example, for the URL above:

WebP (inside the original content): https://cdn.j-novel.club/pub/img/1200/webp/01J/A/D/21WRCKSKBFEVTGR3GCF43.jpg

JPEG (by slightly changing the URL): https://cdn.j-novel.club/pub/img/1200/jpg/01J/A/D/21WRCKSKBFEVTGR3GCF43.jpg
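The URL change above can be sketched in Python as follows. This is only an illustration of the rewrite, assuming the format appears as a path segment that is exactly `webp`; the function name `to_jpeg_url` is hypothetical, and this is not necessarily how jncep implements the fix.

```python
from urllib.parse import urlsplit, urlunsplit


def to_jpeg_url(url: str) -> str:
    """Rewrite a CDN image URL so the format path segment is "jpg"
    instead of "webp" (hypothetical helper, not taken from jncep)."""
    parts = urlsplit(url)
    # Replace only whole path segments equal to "webp", so an image ID
    # that merely contains those letters is left untouched.
    segments = ["jpg" if seg == "webp" else seg for seg in parts.path.split("/")]
    return urlunsplit(parts._replace(path="/".join(segments)))


webp_url = "https://cdn.j-novel.club/pub/img/1200/webp/01J/A/D/21WRCKSKBFEVTGR3GCF43.jpg"
print(to_jpeg_url(webp_url))
```

URLs that have no `webp` segment pass through unchanged, so the helper can be applied to every image URL in the content without checking first.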

gvellut commented 5 days ago

Version 49 (just released) should fix the issue. Let me know if that is not the case. Commit: 043a0459748437100205055be3f25c17266fa032