RicterZ / nhentai

nhentai doujinshi downloader
http://nhentai.net
MIT License

Bug fixes & improve duplicate checks #342

Closed. normalizedwater546 closed this 2 months ago

normalizedwater546 commented 2 months ago

    Traceback (most recent call last):
      File "\?\A:\nhentai.venv\Scripts\nhentai-script.py", line 33, in <module>
        sys.exit(load_entry_point('nhentai==0.5.7', 'console_scripts', 'nhentai')())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "A:\nhentai.venv\Lib\site-packages\nhentai-0.5.7-py3.12.egg\nhentai\command.py", line 114, in main
        generate_pdf(options.output_dir, doujinshi, options.rm_origin_dir, options.move_to_folder)
      File "A:\nhentai.venv\Lib\site-packages\nhentai-0.5.7-py3.12.egg\nhentai\utils.py", line 243, in generate_pdf
        pdf_f.write(img2pdf.convert(full_path_list, rotation=img2pdf.Rotation.ifvalid))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "A:\nhentai.venv\Lib\site-packages\img2pdf.py", line 2733, in convert
        ) in read_images(
             ^^^^^^^^^^^^
      File "A:\nhentai.venv\Lib\site-packages\img2pdf.py", line 1829, in read_images
        raise ImageOpenError(
    img2pdf.ImageOpenError: cannot read input image (not jpeg2000). PIL: error reading image: cannot identify image file <_io.BytesIO object at 0x0000026F22120720>


- If a `.cbz` file already exists, a repeat download is skipped even when a different output flag is given (e.g. `--pdf` instead of `--cbz`).
  - This was caused by a [hard-coded `.cbz` file extension check](https://github.com/RicterZ/nhentai/blob/dec3f44542d601641f88cf6a642f6d653eaffba6/nhentai/downloader.py#L125-L128).
- Consolidated the duplicated file-existence checks into a helper method for more consistent behavior.
  - The log message is emitted in the calling function rather than in the helper, to preserve logger context.
- [Removed the warning shown when the folder already exists.](https://github.com/RicterZ/nhentai/blob/dec3f44542d601641f88cf6a642f6d653eaffba6/nhentai/downloader.py#L137-L138)
  - Nothing is wrong in this case, so the download proceeds silently.
- Simplified the path construction used when generating file names at [1](https://github.com/RicterZ/nhentai/blob/b51e812449c7fed53e8dfa0da31a1729f6db7b64/nhentai/utils.py#L176) and [2](https://github.com/RicterZ/nhentai/blob/b51e812449c7fed53e8dfa0da31a1729f6db7b64/nhentai/utils.py#L215-L218).
  - `os.path.join(doujinshi_dir, '..')` was essentially equivalent to `output_dir`.
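To illustrate the first and last points, here is a minimal sketch rather than the code in this PR. The helper name `output_file_exists` and its parameters are hypothetical. It shows an existence check keyed on the requested extension instead of a hard-coded `.cbz`, and it checks that the parent of `doujinshi_dir` normalizes to `output_dir`, assuming `doujinshi_dir` is a direct child of `output_dir` and no symlinks are involved.

```python
import os
import tempfile


def output_file_exists(output_dir, filename, file_type):
    """Hypothetical helper: True if <output_dir>/<filename>.<file_type> is a file."""
    return os.path.isfile(os.path.join(output_dir, f"{filename}.{file_type}"))


with tempfile.TemporaryDirectory() as output_dir:
    # Simulate an earlier run that already produced a .cbz archive.
    open(os.path.join(output_dir, "example.cbz"), "wb").close()

    # The check depends on the requested type, so an existing .cbz
    # does not make a --pdf run look like a duplicate.
    cbz_present = output_file_exists(output_dir, "example", "cbz")
    pdf_present = output_file_exists(output_dir, "example", "pdf")

    # Lexically, the parent of a direct child directory is output_dir itself.
    doujinshi_dir = os.path.join(output_dir, "example")
    parent = os.path.normpath(os.path.join(doujinshi_dir, ".."))
    same_parent = parent == os.path.normpath(output_dir)
```

Keeping the helper free of logging matches the note above that the log message belongs in the calling function.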

I'm not sure what the expected behavior [should be here](https://github.com/normalizedwater546/nhentai/blob/a05a308e71f6a6532331128418591343fc622422/nhentai/downloader.py#L137-L139). I assumed it should continue with the rest of the process.

RicterZ commented 2 months ago

Some problems still exist:

  1. If both `--pdf` and `--cbz` are specified, only the cbz file is generated.
  2. If a cbz file already exists, the `--pdf` option is ignored.
  3. The `start_download` method is coupled to too many options, such as `regenerate_cbz` and `file_type`; some refactoring may be needed.
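One possible way to address points 1 and 2 is to decide per requested format whether its output is still missing, instead of stopping at the first existing archive. The following is only a sketch under that assumption. The function `formats_to_generate` and its `regenerate` parameter are hypothetical and not part of nhentai.

```python
import os
import tempfile


def formats_to_generate(output_dir, filename, requested, regenerate=False):
    """Hypothetical helper: return the requested formats that still need generating."""
    pending = []
    for file_type in requested:
        target = os.path.join(output_dir, f"{filename}.{file_type}")
        # Each format is checked on its own, so an existing .cbz
        # never hides a missing .pdf.
        if regenerate or not os.path.exists(target):
            pending.append(file_type)
    return pending


with tempfile.TemporaryDirectory() as output_dir:
    # Simulate a previous run that only produced the .cbz file.
    open(os.path.join(output_dir, "example.cbz"), "wb").close()

    only_missing = formats_to_generate(output_dir, "example", ["cbz", "pdf"])
    forced = formats_to_generate(output_dir, "example", ["cbz", "pdf"], regenerate=True)
```

Passing a list of formats rather than separate boolean options could also reduce the coupling mentioned in point 3, though that is a design choice for the maintainers.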