Kareadita / Kavita

Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Setup your own server and share your reading collection with your friends and family.
http://www.kavitareader.com
GNU General Public License v3.0

Scan can't handle 2 series with {omnibus edition} edition difference #1513

Closed ocgineer closed 1 month ago

ocgineer commented 2 years ago

Describe the bug

It seems Kavita can no longer handle two copies of the same series that differ only by an {omnibus edition} tag, which was added to the filename to denote an edition so that both editions can co-exist in Kavita.

|- Magic Knight Rayearth {Omnibus Edition}
|    |- Magic Knight Rayearth {Omnibus Edition} v01 (2011) (Digital) (BlurPixel-Empire).cbz
|    |- Magic Knight Rayearth {Omnibus Edition} v02 (2012) (Digital) (BlurPixel-Empire).cbz
| 
|- Magic Knight Rayearth
     |- Magic Knight Rayearth v01 (2019) (Digital) (danke-Empire).cbz
     |- Magic Knight Rayearth v02 (2019) (Digital) (danke-Empire).cbz
     |- Magic Knight Rayearth v03 (2019) (Digital) (danke-Empire).cbz

After a scan, Magic Knight Rayearth {Omnibus Edition} shows up as Magic Knight Rayearth instead of the expected Magic Knight Rayearth { Edition}, similar to how {Special Edition} works. Also, the 2 volumes have the wrong covers, taken from the non-omnibus edition!

I renamed this to Magic Knight Rayearth (Omnibus Edition), then ran a scan so that the non-omnibus edition shows up properly as Magic Knight Rayearth. I didn't do anything about the wrong covers on the omnibus edition yet.

After the scan, I tried refreshing the covers of the omnibus edition but Kavita promptly deleted the whole series!

I scanned the library again. It removed volume 3 under Magic Knight Rayearth and replaced v1 and v2 with the 2 omnibus versions, again with the wrong covers (I checked the details and could tell by the filename that they are the omnibus versions). Refresh Covers worked this time, since the covers were still wrong.

So to work around this issue we have to add the {omnibus edition} first, scan, rename, add the other edition, and scan.

Expected behavior

Both series can be added correctly at the same time before a scan, and the {omnibus edition} series name will still include the {edition} (or part of it if it contains a keyword such as omnibus like in this case), similar to using {Special Edition}.

ocgineer commented 2 years ago

I also noticed that other series already in the library with an {edition} tag are duplicated as well. I have to remove the old one and rename the newly added one. If I remove the new one, it just keeps showing up again on each scan. In my case I have a series The Ghost in the Shell {HumbleBundle}, which is the only Ghost in the Shell series in that library and which I renamed to The Ghost in the Shell (HumbleBundle); that is where I noticed this behavior (the other Ghost in the Shell series with the same name is in another library).

ocgineer commented 1 year ago

Just for more info, I also have the issue on Neon Genesis Evangelion, but this time I had them grouped. Here it ignored the non-{} series completely and added the {} series without the {} at all. The workaround described in the first post works somewhat, but the covers are incorrect and it lists some books as merged while also still showing separate series.

image

root
|- Neon Genesis Evangelion
    |- Neon Genesis Evangelion {Omnibus Edition}
    |    |- Neon Genesis Evangelion {Omnibus Edition} v01.cbz
    |    |- Neon Genesis Evangelion {Omnibus Edition} v02.cbz
    |
    |- Neon Genesis Evangelion
    |    |- Neon Genesis Evangelion v01.cbz
    |    |- Neon Genesis Evangelion v02.cbz
    |
    |- Neon Genesis Evangelion - other series

majora2007 commented 1 year ago

So I just tried this out and was not able to reproduce. image

What I did was:

|- BTOOOM! {Special Edition}
|    |- BTOOOM! {Special Edition} v01 (2011) (Digital) (BlurPixel-Empire).cbz
| 
|- BTOOOM!
     |- BTOOOM! v01 (2019) (Digital) (danke-Empire).cbz
     |- BTOOOM! v02 (2019) (Digital) (danke-Empire).cbz

I put BTOOOM! {Special Edition} with a ComicInfo that has Series set to BTOOOM! {Special Edition}.

Without a ComicInfo in the Special Edition, the Scanner still gave me 2 series: BTOOOM! and BTOOOM! { Edition}.

ocgineer commented 1 year ago

Very strange, hmm. I just did a new test on a completely new library with only the following:

|- Magic Knight Rayearth {Omnibus Edition}
|    |- Magic Knight Rayearth {Omnibus Edition} v01 (2011) (Digital) (BlurPixel-Empire).cbr
|    |- Magic Knight Rayearth {Omnibus Edition} v02 (2012) (Digital) (BlurPixel-Empire).cbr
| 
|- Magic Knight Rayearth
     |- Magic Knight Rayearth v01 (2019) (Digital) (danke-Empire).cbz
     |- Magic Knight Rayearth v02 (2019) (Digital) (danke-Empire).cbz
     |- Magic Knight Rayearth v03 (2019) (Digital) (danke-Empire).cbz

(I noticed that the omnibus edition was cbr, not sure if that could make a difference.) None of the files have a ComicInfo.xml.

After adding the library and it scanned, this was the result.

image image image

{ Edition} not in the name, and the wrong covers again, and the non-edition is nowhere to be seen. I renamed the series in Kavita to Magic Knight Rayearth (Omnibus Edition), and then ran a library scan again.

image

Now it shows up correctly, but the covers on the omnibus version are still incorrect. Tried scanning the series, resulting in a scan error.

image

Refresh Covers worked and updated to the correct covers. I can reproduce this every time 🤔

I removed both series and ran another scan, with the same result as the initial creation of the library. However, this time I checked the database result before making any changes.

image image

As can be seen the path is correct, but localized name and names have {} completely stripped.

majora2007 commented 1 year ago

I have identified the underlying issue. The TrackSeries code uses normalization to strip out {} and perform merging with other series that might need to match (like those with a LocalizedSeries tag). There isn't a logic change from the new scan loop, but due to how we scan, this issue is surfacing.

In order to address this, I will likely need to use the Edition field (we parse this but I've never used it) to ensure that a merge doesn't happen.
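A minimal sketch of the collision described above, written in Python for brevity (Kavita itself is written in C#). The `normalize`, `parse_edition` and `group_series` helpers are hypothetical illustrations, not Kavita's actual TrackSeries code. The only assumption is that normalization drops the `{...}` tag and all non-alphanumeric characters, so both folder names reduce to the same key unless the edition is made part of that key.

```python
import re
from collections import defaultdict


def parse_edition(name: str) -> str:
    """Return the text inside {...}, or '' when there is no tag (hypothetical helper)."""
    match = re.search(r"\{([^}]*)\}", name)
    return match.group(1).strip().lower() if match else ""


def normalize(name: str) -> str:
    """Drop the {...} tag, then keep only lowercase letters and digits (assumed behaviour)."""
    without_tag = re.sub(r"\{[^}]*\}", "", name)
    return re.sub(r"[^a-z0-9]", "", without_tag.lower())


def group_series(folder_names, use_edition):
    """Group folders by normalized name, optionally adding the edition to the key."""
    groups = defaultdict(list)
    for name in folder_names:
        key = (normalize(name), parse_edition(name) if use_edition else "")
        groups[key].append(name)
    return dict(groups)


folders = ["Magic Knight Rayearth {Omnibus Edition}", "Magic Knight Rayearth"]
merged = group_series(folders, use_edition=False)    # one key: the two folders collide
separate = group_series(folders, use_edition=True)   # two keys: the editions stay apart
```

With the edition in the key, the two folders no longer share a key, which is the effect the Edition field is meant to have here.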

tjarls commented 1 year ago

There is a dedicated processing for Omnibus Edition that strips it entirely as part of MangaEditionRegex. That'll strip any of these:

Omnibus
OmnibusEdition
Omnibus_Edition
Omnibus Edition

It should not have resulted in Magic Knight Rayearth { Edition} but in Magic Knight Rayearth {}. @ocgineer can you confirm the title?

Indeed, the new normalization code and the new scan loop would result in those two series having the same normalized title while being in different top-level directories, hence not mergeable and colliding.

Does it make sense to strip Omnibus Edition from the title, and subsequently have additional logic to parse that edition and use that as a differentiator? That sounds overengineered. We should discuss that use case.

Anyway, @ocgineer, in the short term a workaround to get an approximation of the old behaviour is to use Magic Knight Rayearth {Omnibus-Edition}, for example. The title would end up being Magic Knight Rayearth {-Edition}, a distinct series that can have its displayed name changed to fit expectations.
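To make the stripping behaviour concrete, here is a small Python sketch. `EDITION_RE` is a hypothetical stand-in written for this illustration, not the actual MangaEditionRegex from Kavita's parser. It assumes the pattern matches "Omnibus" optionally followed by "Edition", with a space, an underscore, or nothing in between.

```python
import re

# Hypothetical stand-in for the edition regex: "Omnibus", optionally followed by
# "Edition" with a space, an underscore, or nothing in between.
EDITION_RE = re.compile(r"\bOmnibus(?:[ _]?Edition)?\b", re.IGNORECASE)


def strip_edition(title: str) -> str:
    """Remove the matched edition keyword and leave everything else untouched."""
    return EDITION_RE.sub("", title)


variants = ("Omnibus", "OmnibusEdition", "Omnibus_Edition", "Omnibus Edition", "Omnibus-Edition")
for tag in variants:
    # The triple braces produce a literal "{", the tag, and a literal "}".
    print(strip_edition(f"Magic Knight Rayearth {{{tag}}}"))
```

Under this assumed pattern, the four listed variants leave an empty `{}`, while the hyphenated form only loses "Omnibus" and keeps `{-Edition}`, because the hyphen is not an accepted separator.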

ocgineer commented 1 year ago

I have double-checked and yes, I was confused with Special Edition as I have used it before, knowing it would strip Special but not Edition, so thinking that Omnibus Edition would be the same, thus stripping omnibus but leaving Edition. I have revised the original post to reflect this.

But then comes the question: why is "Omnibus Edition" stripped in full while "Special Edition" is not? What would be the main differences and uses for this? So far, I have seen plenty of 'omnibus' in filenames, and it is right to strip them. However, adding 'edition' denotes a different edition that is meant to be in the filename, I would assume, same as 'Special Edition'. The bigger consideration: does 'Omnibus Edition' or 'Special Edition' need to be (partially) stripped in the first place (when it has Edition appended)?

So far I have used {Omnibus Edition} on Magic Knight Rayearth and Neon Genesis Evangelion, as I have both editions. Adding {omnibus edition} is something I did myself; I could just as well have used the publisher, the year, or, as others have done, x-in-1 Edition. I have tested the trick using Neon Genesis Evangelion {Omnibus-Edition} and it works, but this is something the user would need to know.

tjarls commented 1 year ago

I only meant that trick as a temporary workaround. Currently nothing guarantees that the trick will keep working through future changes. We could indeed envision a series of canonical recipes that would help users work around edge cases. If we decide to do so, we'd have to make sure we have unit tests validating those in addition to documentation. Another such recipe could be how to handle numbers in titles without using ComicInfo metadata. The one I currently use is to tack the number onto a word. For example, I have a series called Agent 212. In order to have it handled the way I want, I have renamed both the files and the directory to agent212. The directory also needs to be renamed because it is used as the series name by special issues. Next I changed the displayed name of agent212 to Agent 212 and voilà.

majora2007 commented 1 year ago

Moving this over to discussion and out of the release since we have a path forward with changes in Parser code.

zeedif commented 1 year ago

I have a similar problem with epubs. 2 or more books from the same SERIES in different folders, which I would like to merge into one when scanning, but Kavita only shows me the SERIES with the epubs from a single folder and seems to ignore the others. For example, I have the Konosuba series translated by two different groups each in their respective folders:

|- Konosuba [Darkness]
|    |- [Darkness] Kono-Suba! - V05.epub
|    |- [Darkness] Kono-Suba! - V06.epub
|    |- [Darkness] Kono-Suba! - V07.epub
|    |- [Darkness] Kono-Suba! - V08.epub
|    |- [Darkness] Kono-Suba! - V09.epub
|    |- [Darkness] Kono-Suba! - V10.epub
|    |- [Darkness] Kono-Suba! - V11.epub
|    |- [Darkness] Kono-Suba! - V12.epub
| 
|- Konosuba [Gustang]
|    |- [Gustang] Kono-Suba! - V01.epub
|    |- [Gustang] Kono-Suba! - V02.epub
|    |- [Gustang] Kono-Suba! - V03.epub
|    |- [Gustang] Kono-Suba! - V04.epub

In its content.opf file all files have: <meta content="Konosuba: God's Blessing on This Wonderful World! [NL]" name="calibre:series" /> As you can see, it does not show me the first 4 volumes of the other folder and does not generate another series.

This is an image

majora2007 commented 1 year ago

@zeedif this is due to epub not using folders for parsing. You need to update the series name in each epub's metadata, so they differ.
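To check which series name each file actually carries, the `calibre:series` entry can be read from the OPF. The Python sketch below is only an illustration using the standard library. `SAMPLE_OPF` is a made-up minimal document built around the meta line quoted above, and `calibre_series` is a hypothetical helper, not part of Kavita or Calibre.

```python
import xml.etree.ElementTree as ET

# Made-up minimal OPF document containing the meta line quoted in the thread.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata>
    <meta content="Konosuba: God's Blessing on This Wonderful World! [NL]" name="calibre:series" />
  </metadata>
</package>"""


def calibre_series(opf_xml: str):
    """Return the calibre:series value from an OPF document, or None if absent."""
    root = ET.fromstring(opf_xml)
    for element in root.iter():
        # Compare the local tag name so the check works with or without a namespace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "meta" and element.get("name") == "calibre:series":
            return element.get("content")
    return None


print(calibre_series(SAMPLE_OPF))
```

Running this over the content.opf of each epub would show whether the files in the two folders report the same series name or different ones.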

zeedif commented 1 year ago

I also have a problem where I have different translations of the same series in the same folder. For example:

|- Goblin Slayer. Side Story II
|    |- Goblin slayer. Dai Katana - V01 [CanisLycaon].epub
|    |- Goblin slayer. Dai Katana - V01 [R-Fiction].epub
|    |- Goblin slayer. Dai Katana - V02 [CanisLycaon].epub
|    |- Goblin slayer. Dai Katana - V02 [R-Fiction].epub

In its content.opf file all files have: <meta content="Goblin Slayer: Side Story II - Dai Katana [NL]" name="calibre:series" /> & <meta id="serie" property="belongs-to-collection">Goblin Slayer: Side Story II - Dai Katana [NL]</meta>

It seems to detect them but it only shows me one edition of the epubs, is there a way to choose which one to read? This is an image This is an image

zeedif commented 1 year ago

> @zeedif this is due to epub not using folders for parsing. You need to update the series name in each epub's metadata, so they differ.

I know it does not merge them by directory. The problem is that they are in different folders but all have the same series name in their epub metadata, and it still doesn't merge them.

majora2007 commented 1 year ago

@zeedif please open another issue or join discord and I can support you.

GlassedSilver commented 1 year ago

Back to some manga CBZ concerns:

Sometimes single chapters receive some community love and get fan-colored or edited to let's say remove some excess bathing soap.

If a given chapter has two or more archives, would this also be part of this scope? The main concern here is that oftentimes only fragments of a series get alt archives, which makes a second series entry as an Edition improper for obvious reasons. Being able to pick à la carte the specific release for ambiguous chapters on demand seems like a good middle path, and indicating that there are multiple releases available on the chapter or volume (wherever applicable) could be very beneficial as well.

majora2007 commented 1 year ago

@GlassedSilver that is not what this issue is about. You would need to open a feature request for that on feats.kavitareader.com

DieselTech commented 1 month ago

This issue is for a very old version of Kavita that is out of date now. A lot of work has been done on the scanner since then and if you are still having this problem, please open a new issue.