Kareadita / Kavita

Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Setup your own server and share your reading collection with your friends and family.
http://www.kavitareader.com
GNU General Public License v3.0

Series change to `(` after each scan #3013

Closed brettcol closed 5 days ago

brettcol commented 4 months ago

What happened?

Kavita is displaying a series with the title "(". During a scan, the files that it links to, and the cover, rapidly change. Each scan appears to result in a different set of files being linked to from the "(" 'series'.

What did you expect?

A consistent series.

Kavita Version Number - If you do not see your version number listed, please update Kavita and see if your issue still persists.

0.8.1 - Stable

What operating system is Kavita being hosted from?

Linux

If the issue is being seen on Desktop, what OS are you running where you see the issue?

Windows 11

If the issue is being seen in the UI, what browsers are you seeing the problem on?

Chrome

If the issue is being seen on Mobile, what OS are you running where you see the issue?

Android

If the issue is being seen on the Mobile UI, what browsers are you seeing the problem on?

Chrome

Relevant log output

https://drive.google.com/file/d/12yetzedm2BZB1xe6abRFs6fRMoRxe9mH/view?usp=sharing

Additional Notes

Kavita Info

Version 0.8.1.0 Install ID f9ee05ae

Machine Info

Linux 5.10.0-30-cloud-amd64 #1 SMP Debian 5.10.218-1 (2024-06-01) x86_64

brettcol commented 4 months ago

Reading through the logs, Kavita appears to keep changing the root folder of the `(` series over and over, instead of creating new series based on the folder/file names.

brettcol commented 4 months ago

Added debug: reinstalled Kavita from scratch and still had this result.

brettcol commented 4 months ago

Attempted debug by removing all non-ASCII characters from series folder titles and removing the series ( from the SQLite db. This allowed more folders to be found and properly turned into series, but at some point it remade the ( series and started putting things in there again.
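For anyone repeating this debug step, here is a minimal sketch for listing which series folders contain non-ASCII characters before renaming anything. It assumes each series lives in its own folder directly under the library root; the function name `non_ascii_folders` is made up for illustration and is not part of Kavita.

```python
import os

def non_ascii_folders(root_dir):
    """Return the names of immediate subfolders whose names contain non-ASCII characters."""
    flagged = []
    with os.scandir(root_dir) as entries:
        for entry in entries:
            # str.isascii() is available from Python 3.7 onwards
            if entry.is_dir() and not entry.name.isascii():
                flagged.append(entry.name)
    return sorted(flagged)

if __name__ == "__main__":
    for name in non_ascii_folders(os.getcwd()):
        print(name)
```

This only reports names; it does not rename or delete anything.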

majora2007 commented 4 months ago

Sorry to just get to this. So I see Konya mo Nemurenai v03 c14.zip is mapped to the ( folder. Can you tell me if there is any metadata (comicinfo) inside the first issue or even this file?

To also confirm, you are using LibraryType: Manga and a folder root of /home/koyomi/library/library/manga/ correct?

brettcol commented 4 months ago

Yes, library type Manga, folder root /home/koyomi/library/library/manga/

Konya mo Nemurenai v03 c14.zip:

| Property | Value |
| --- | --- |
| Name | Konya mo Nemurenai v03 c14.zip |
| Type | zip Archive |
| File location | /home/koyomi/library/library/manga/ |
| Size | 10595145 bytes |
| Entries | 32 |

brettcol commented 4 months ago

image

Paramestus commented 3 months ago

Kavita is displaying a series with the title "(". During a scan, the files that it links to, and the cover, rapidly change

Hi guys, I used Kavita before, so I know I shouldn't keep any files directly in the root folder. But before I remembered that, I had all my files inside the root, and only 2 series were inside their own folders. When I scanned for the first time, it only picked up those two series. Then I put another one inside its own folder and it scanned that too. After that, every new file I moved out of the root into a folder was re-scanned as if it belonged to that same third folder, as if Kavita were overwriting that folder with a new series. This is not normal behavior for the Kavita scanner. The root is D:\DOU. This was scanned normally: D:\DOU\Mating with Oni, and so was D:\DOU艶がり村. But things like D:\DOU(C77) [Hito no Fundoshi] Admired Beautiful Flower Extra (Princess Lover!) (English) keep getting overwritten.
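To check a library for the layout described above, a minimal sketch like the following lists files that sit directly in the library root instead of inside a series folder. The function name `loose_files` is hypothetical, and nothing in this thread confirms that loose files are the cause of the `(` series; this only makes the layout easy to inspect.

```python
import os

def loose_files(root_dir):
    """Return the names of non-hidden files located directly in root_dir."""
    with os.scandir(root_dir) as entries:
        return sorted(
            entry.name
            for entry in entries
            if entry.is_file() and not entry.name.startswith('.')
        )

if __name__ == "__main__":
    for name in loose_files(os.getcwd()):
        print(name)
```

Run it from the library root (for example D:\DOU); an empty result means every file already lives inside some folder.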

https://github.com/user-attachments/assets/0481b11b-840d-4a03-aabf-e25b926ee083

Paramestus commented 3 months ago

What operating system is Kavita being hosted from? Windows 11

If the issue is being seen on Desktop, what OS are you running where you see the issue? Windows 11

If the issue is being seen in the UI, what browsers are you seeing the problem on? Edge

Capitiano commented 3 months ago

I am also experiencing this issue. I'll try to give as much detail as possible. Kavita is being hosted on TrueNas Scale. It's accessing files through a read-only SMB share. Previously I was using Kavita with no issues. I had everything set up with .png/.jpg files. I wanted to change everything to .cbz files so that I could use that system for tagging. As I was setting it up I noticed it would, seemingly randomly, not present some files in the Kavita library.

Things I tried to troubleshoot:

- Renaming the .cbz files to exactly match the (series) folder
- Renaming the .cbz files to any combination of [File Name] + Chapter 01, v1 c1, etc.
- Turning on all file types under [Library Name] -> Settings -> Advanced -> File Types
- Re-adding the .png/.jpg files (would cause this series to re-appear, but only the images folder and not the .cbz)
- Removing a series and re-scanning the library (would not re-appear)
- Renaming the folder to just ( (would add it right next to the other one)
- Changing the Library Type to Images under [Library Name] -> Settings -> Advanced -> File Types (the ( thing would disappear but the missing ones would not re-appear)
  - Doing this created a new error: There was an issue writing to the DB for Series (. Probably about 100 errors in the default log

DieselTech commented 2 weeks ago

The scanner has had a lot of work done around it with the latest update that is in the nightly channel now. It will be slower, but it should also be more accurate. Can everyone who was having this problem try out the nightly branch to see if the issue is still occurring?

Capitiano commented 2 weeks ago

Using the same files and naming as when I mentioned above, the series still get merged into ( but the behavior is slightly different. It is picking out random works and assigning them chapter and volume names. There are still ones missing.

That being said I was messing with it further and it does seem like the parser is getting stuck on the folder and file names. I managed to replicate the creation of this bug by slowly adding titles to a fresh install. Cleaning up the file names seems to help but if any 1 .cbz is incorrect it ends up hiding others inside of this ( bug.

majora2007 commented 2 weeks ago

I have a Python script that can map out a library so I can reproduce this. Can you run it on the smallest example possible and give me the JSON output (you can email/DM me on Discord if sensitive)?


import os
import json

def map_files(root_dir):
    files_map = []

    for dirpath, dirnames, filenames in os.walk(root_dir):
        # Skip directories that start with "."
        dirnames[:] = [d for d in dirnames if not d.startswith('.')]

        for filename in filenames:
            # Skip files that start with "."
            if not filename.startswith('.'):
                # Get the relative path of the file
                relative_path = os.path.relpath(os.path.join(dirpath, filename), root_dir)
                files_map.append(relative_path)

    # Export the map to a JSON file
    with open('files_map.json', 'w') as outfile:
        json.dump(files_map, outfile, indent=4)

if __name__ == "__main__":
    root_dir = os.getcwd()
    map_files(root_dir)
    print("File map generated and saved to files_map.json.")
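Going the other way, a file map like this can be turned back into a tree of placeholder files for testing. The following is only a sketch of one way to do that, not the maintainer's actual tooling: the function name `rebuild_tree` and the `target_dir` argument are assumptions. The placeholders are zero-byte files, so only folder and file names are preserved, not archive contents or metadata.

```python
import json
import os

def rebuild_tree(map_path, target_dir):
    """Create an empty placeholder file for every relative path listed in the JSON map."""
    with open(map_path, encoding='utf-8') as infile:
        relative_paths = json.load(infile)

    created = []
    for relative_path in relative_paths:
        # The map may have been generated on Windows, so normalise the separators
        parts = relative_path.replace('\\', '/').split('/')
        destination = os.path.join(target_dir, *parts)
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        # Zero-byte placeholder: keeps the name, not the content
        open(destination, 'w').close()
        created.append(destination)
    return created

if __name__ == "__main__":
    # "repro_library" is an arbitrary output folder name chosen for this example
    rebuild_tree('files_map.json', 'repro_library')
```

Only run this on a map from a trusted source, since the sketch does not reject paths containing `..`.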

majora2007 commented 5 days ago

I was able to reproduce this in a unit test. From the filenames I've gotten, it seems to be related to the order of the () and things like C tricking the regex for one of the patterns Kavita is coded against.
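To illustrate the kind of failure described here, the sketch below uses a made-up pattern, not one of Kavita's actual regexes. It assumes a parser that captures a lazy series name followed by a `c<number>` chapter marker; with a filename starting with `(C77)`, the `C77` is read as the chapter and the series name collapses to `(`. The names `LOOSE`, `STRICT`, and `parse_series` are all hypothetical.

```python
import re

# Hypothetical pattern, NOT taken from Kavita: lazy series name, then "c<number>"
LOOSE = re.compile(r'^(?P<series>.+?)\s*c(?P<chapter>\d+)', re.IGNORECASE)

# Stricter hypothetical variant: the chapter marker must follow whitespace
STRICT = re.compile(r'^(?P<series>.+?)\s+c(?P<chapter>\d+)', re.IGNORECASE)

def parse_series(filename, pattern):
    """Return the captured series name, or None if the pattern does not match."""
    match = pattern.match(filename)
    return match.group('series') if match else None

if __name__ == "__main__":
    tagged = "(C77) [Hito no Fundoshi] Admired Beautiful Flower Extra (Princess Lover!) (English)"
    plain = "Konya mo Nemurenai v03 c14.zip"

    print(parse_series(tagged, LOOSE))   # (
    print(parse_series(plain, LOOSE))    # Konya mo Nemurenai v03
    print(parse_series(tagged, STRICT))  # None
    print(parse_series(plain, STRICT))   # Konya mo Nemurenai v03
```

Requiring whitespace before the marker is just one possible guard; the real fix depends on the patterns Kavita actually uses.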

image

Scan Library Parses as ( - Manga.json