[Feature Request] Lazier import

bmfrosty commented 3 months ago

Request for an even lazier import: this time, the initial import would grab only the kind of information that the stat command reports, for example:

  File: comicbook.cbz
  Size: 158935131   Blocks: 310424     IO Block: 4096   regular file
Device: 33h/51d Inode: 12666373952055318  Links: 1
Access: (0666/-rw-rw-rw-)  Uid: (99)   Gid: (100)
Access: 2023-12-08 07:02:51.571126986 -0800
Modify: 2022-01-11 03:45:55.633090223 -0800
Change: 2024-02-23 12:57:24.215549009 -0800
 Birth: -

The idea is that import would be very quick, and further information would be gathered later, either while browsing (the way covers are) or upon hitting the generate covers button. The initial import would gather only the filename, size, and whichever date makes sense.
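To make the request concrete, here is a minimal sketch of a stat-only pass. It is not Codex code: `StatRecord`, `lazy_scan`, and the `COMIC_SUFFIXES` list are hypothetical names chosen for illustration, and the modification time stands in for "whatever date makes sense".

```python
import os
from dataclasses import dataclass
from pathlib import Path

# Assumed set of comic file extensions, for illustration only.
COMIC_SUFFIXES = {".cbz", ".cbr", ".cb7", ".cbt", ".pdf"}


@dataclass
class StatRecord:
    """Hypothetical record holding only what stat() can provide."""
    path: str
    size: int
    mtime: float


def lazy_scan(root):
    """Walk root and collect filename, size and mtime; archives are never opened."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if Path(name).suffix.lower() not in COMIC_SUFFIXES:
                continue
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            records.append(StatRecord(path=full, size=st.st_size, mtime=st.st_mtime))
    return records
```

Because nothing is decompressed or parsed, the cost per file is one directory read plus one stat call; page counts, tags and covers would have to be filled in later.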

ajslater commented 3 months ago

I'm unsure of the use case here. How often would someone need to re-import a very large collection where calculating just page size would be a burden? Surely even with half a million comics it would only take a couple minutes.

bmfrosty commented 3 months ago

Several hours is my finding.

2024-03-21 12:33:17 PDT DEBUG   Read Book Tags: 0/116254
2024-03-21 12:33:22 PDT INFO    Read Book Tags: 116/116254
2024-03-21 14:56:02 PDT INFO    Read Book Tags: 116136/116254

2024-03-20 15:28:01 PDT DEBUG   Read Book Tags: 0/159874
2024-03-20 15:28:06 PDT INFO    Read Book Tags: 17/159874
2024-03-21 12:08:04 PDT INFO    Read Book Tags: 159855/159874
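For reference, the throughput implied by these two log excerpts can be worked out from the first and last timestamps of each run. This is only arithmetic on the lines quoted above; the helper name `rate` is made up for the example, and the last logged count is used as the number of books read.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def rate(start, end, books):
    """Return (elapsed seconds, books per second) between two log timestamps."""
    elapsed = (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()
    return elapsed, books / elapsed


# (first timestamp, last timestamp, last logged book count) for each run above
runs = [
    ("2024-03-21 12:33:17", "2024-03-21 14:56:02", 116136),
    ("2024-03-20 15:28:01", "2024-03-21 12:08:04", 159855),
]

for start, end, books in runs:
    elapsed, per_second = rate(start, end, books)
    print(f"{books} books in {elapsed / 3600:.1f} h = {per_second:.1f} books/s")
# prints:
# 116136 books in 2.4 h = 13.6 books/s
# 159855 books in 20.7 h = 2.1 books/s
```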


ajslater commented 3 months ago

Yeah, you're right, of course. I did some tests of my own and:

Scan All Metadata:

Imported 17876 comics at 60.1 comics per second.

Scan No Metadata, but get page_count and file_type:

Imported 17876 comics at 272.3 comics per second.

Scan No Metadata at all:

Imported 17875 comics at 523.4 comics per second.
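To put the three rates side by side, the sketch below computes the speedup relative to a full metadata scan and a projected import time for a library of 159874 comics (the size of the larger run reported earlier in this thread). The projection assumes the measured rates carry over unchanged to a different machine and collection, which may well not hold.

```python
# Rates measured in the three tests above, in comics per second.
rates = {
    "all metadata": 60.1,
    "page_count and file_type only": 272.3,
    "no metadata": 523.4,
}

LIBRARY_SIZE = 159874  # assumed library size, taken from the log earlier in the thread

baseline = rates["all metadata"]
for mode, per_second in rates.items():
    minutes = LIBRARY_SIZE / per_second / 60
    speedup = per_second / baseline
    print(f"{mode}: {speedup:.1f}x, about {minutes:.1f} min")
# prints:
# all metadata: 1.0x, about 44.3 min
# page_count and file_type only: 4.5x, about 9.8 min
# no metadata: 8.7x, about 5.1 min
```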

Codex v1.5.8 now does the "Scan No Metadata At All" version when the Import Metadata Admin Flag is disabled. page_count & file_type are necessary for the reader to function correctly, so if the reader finds that they are empty, it scans those comics (previous book, current book, next book) inline for just those attributes. It also kicks off a full metadata scan of those comics so it never has to do that again.
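The inline backfill described above can be pictured roughly as follows. This is a sketch under stated assumptions and not the actual Codex implementation: `Book`, `backfill_reader_fields`, `prepare_for_reader` and `IMAGE_SUFFIXES` are hypothetical names, only zip-based (CBZ) archives are handled, and the follow-up full metadata scan is represented simply by returning the books that would need it.

```python
import zipfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Assumed image extensions that count as pages, for illustration only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


@dataclass
class Book:
    """Hypothetical stand-in for a library record."""
    path: str
    page_count: Optional[int] = None
    file_type: Optional[str] = None


def backfill_reader_fields(book):
    """Fill page_count and file_type if missing; return True if the archive was opened."""
    if book.page_count is not None and book.file_type:
        return False
    with zipfile.ZipFile(book.path) as archive:
        book.page_count = sum(
            1 for name in archive.namelist()
            if Path(name).suffix.lower() in IMAGE_SUFFIXES
        )
    book.file_type = "CBZ"  # this sketch only handles zip archives
    return True


def prepare_for_reader(prev_book, current_book, next_book):
    """Backfill the three books the reader touches; return those needing a full scan later."""
    needs_full_scan = []
    for book in (prev_book, current_book, next_book):
        if book is not None and backfill_reader_fields(book):
            needs_full_scan.append(book)
    return needs_full_scan
```

A second call for the same books does no archive work, which mirrors the point that the scan only has to happen once per comic.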

ajslater / codex

[Feature Request] Lazier import #349