Kareadita / Kavita

Kavita is a fast, feature-rich, cross-platform reading server, built with the goal of being a full solution for all your reading needs. Set up your own server and share your reading collection with your friends and family.
http://www.kavitareader.com
GNU General Public License v3.0

Scanning Library after update 6.0.0 broke some of my series into individual series for each tome. #1629

Closed Draknars closed 2 years ago

Draknars commented 2 years ago

Describe the bug: After update 6.0.0, many of my series have split into individual series per tome, instead of staying under the same series. These series were scanned correctly before the update. Overall I went from ~600 series to more than 900, meaning that a lot of my series have split up.

To Reproduce Steps to reproduce the behavior:

  1. Updated to 6.0.0
  2. Perform a Scan Library

Expected behavior: All tomes in one series, not one series per tome.

Screenshots: An example of the issue: "In the Land of Leadale" is now 6 "series" of one volume each. [two screenshots]

The file directory. As a test I also tried removing the " [Yen Press][Kobo] " at the end and rescanning, but the issue persists. These files were fine before 6.0.0. [screenshot]

Other examples: [two screenshots]

Server / Host :

Additional context Let me know if you need anything else

majora2007 commented 2 years ago

I have noticed this on mine as well and I believe it is due to https://github.com/Kareadita/Kavita/commit/a1c3f43656ffbaf884af204ed138bcb4bafb3953

That commit changed book handling to align more closely with the title set in the metadata, so the series is now taken as-is from the book, rather than the previous halfway approach, which grouped based on filename.

@tjarls If you want to jump in and give some thoughts on this. I'm torn as I believe auto-grouping for the user is beneficial, but do agree the changes made bring a much more flexible solution as we are using the metadata as is (and the user can correct by using series group with series index in Calibre).

Draknars commented 2 years ago

@majora2007 So is that intended behaviour, or is it because I am not doing something properly?

majora2007 commented 2 years ago

It's a change that we made last release and yes it is intended behavior.

The reason why I commented that, was because it might be nice to automatically apply the volume grouping, however technically, if you use the series and series index tags correctly, it'll do it for you.

I will be looking into this to see if there's an enhancement to be made, but I would highly suggest that you just use the series index as that is its intended purpose.

tjarls commented 2 years ago

I'm trying to understand how the series managed to get grouped before as I have not observed that issue. Is the epub setting a series name in the metadata? In that example, "In the Land of Leadale, Vol. 4" is a value from the epub metadata? Is there a way you could extract the metadata from that epub?

Draknars commented 2 years ago

> It's a change that we made last release and yes it is intended behavior.

Understood, thanks for clarifying.

> The reason why I commented that, was because it might be nice to automatically apply the volume grouping, however technically, if you use the series and series index tags correctly, it'll do it for you.

Which, to be fair, I didn't do. When you have close to a thousand series, doing all that is a big investment of time, and it would also be needed for all content added in the future. At the time, I didn't see much value in doing it. The only work I was doing was naming folders and following the naming convention, which until now worked for me.

> I will be looking into this to see if there's an enhancement to be made, but I would highly suggest that you just use the series index as that is its intended purpose.

Yeah it has been on my todo for a while...

> I'm trying to understand how the series managed to get grouped before as I have not observed that issue. Is the epub setting a series name in the metadata? In that example, "In the Land of Leadale, Vol. 4" is a value from the epub metadata? Is there a way you could extract the metadata from that epub?

Sure thing, I can spin up a Calibre container and look at getting more info

tjarls commented 2 years ago

> Sure thing, I can spin up a Calibre container and look at getting more info

Alternatively you can try opening the .epub file with a zip tool, extract the .opf file and open that file with a text editor. The section metadata in that file is the part of interest.
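The manual steps above can also be scripted. Below is a minimal Python sketch using only the standard library. It assumes the EPUB follows the usual layout, where `META-INF/container.xml` points at the .opf package file; the function name `read_opf_metadata` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = "http://www.idpf.org/2007/opf"


def read_opf_metadata(epub_path):
    """Return (tag, attributes, text) tuples from the OPF <metadata> section."""
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml names the package (.opf) file inside the archive.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.attrib["full-path"]
        package = ET.fromstring(zf.read(opf_path))

    metadata = package.find(f"{{{OPF_NS}}}metadata")
    entries = []
    for child in metadata:
        # Strip the "{namespace}" prefix ElementTree puts on tag names.
        tag = child.tag.split("}", 1)[-1]
        entries.append((tag, dict(child.attrib), (child.text or "").strip()))
    return entries
```

Calling `read_opf_metadata("book.epub")` (a placeholder path) and printing each tuple shows the title, creator, and any `meta` entries, without unpacking the archive by hand.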

majora2007 commented 2 years ago

> I'm trying to understand how the series managed to get grouped before as I have not observed that issue. Is the epub setting a series name in the metadata? In that example, "In the Land of Leadale, Vol. 4" is a value from the epub metadata? Is there a way you could extract the metadata from that epub?

Yes, the title of the epubs is usually "In the Land of Leadale, Vol. 4". I believe Kavita previously also parsed that from the filename and did some merging, whereas now I have made it respect the metadata more (thinking about it now, it may not actually be your code change).

I think this was my change from another issue where I changed how we merge data for epubs vs parsed info. I will investigate a potential enhancement here, but I do believe respecting the metadata is the correct path.
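To illustrate the kind of title- or filename-based grouping discussed here, below is a rough Python sketch. It is not Kavita's parser: the pattern and the function name `split_series_and_volume` are assumptions chosen for illustration, and it only handles titles shaped like "Series, Vol. N" with nothing after the volume number.

```python
import re

# Hypothetical pattern: "<series>, Vol. <number>" or "<series> Volume <number>".
VOLUME_PATTERN = re.compile(
    r"^(?P<series>.+?)[,\s]+vol(?:ume)?\.?\s*(?P<volume>\d+(?:\.\d+)?)\s*$",
    re.IGNORECASE,
)


def split_series_and_volume(title):
    """Return (series, volume) if the title ends in a volume marker, else (title, None)."""
    cleaned = title.strip()
    match = VOLUME_PATTERN.match(cleaned)
    if match is None:
        # No recognizable volume marker: treat the whole title as the series name.
        return cleaned, None
    return match.group("series").strip(), match.group("volume")
```

With this sketch, every volume whose title follows that shape yields the same series name, so the volumes could be grouped on it. Titles with trailing tags such as " [Yen Press][Kobo]" would not match and would fall back to the full title.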

Draknars commented 2 years ago

> Sure thing, I can spin up a Calibre container and look at getting more info

> Alternatively you can try opening the .epub file with a zip tool, extract the .opf file and open that file with a text editor. The section metadata in that file is the part of interest.

In the Land of Leadale, Vol. 4 Ceez and Tenmaso aut Ceez and Tenmaso 1 Yen On © Ceez 2020 2021-11-17 2021-12-14T14:29:56Z en 9781975322199

Here you go, indeed this was much faster :)

> I think this was my change from another issue where I changed how we merge data for epubs vs parsed info. I will investigate a potential enhancement here, but I do believe respecting the metadata is the correct path.

Out of curiosity, and for my own selfish reasons: would it be a big burden to consider adding a setting that lets the end user decide whether to put more weight on the metadata or on the naming structure?

majora2007 commented 2 years ago

> Out of curiosity, and for my own selfish reasons: would it be a big burden to consider adding a setting that lets the end user decide whether to put more weight on the metadata or on the naming structure?

It's not something I am interested in developing or supporting. I want Kavita to be opinionated, rather than having a config option for every bell and whistle.

tjarls commented 2 years ago

Thanks @Draknars for providing that metadata. It confirms that the file metadata does not provide any series information, so the title is used instead. While it's a pain to update a large library, Calibre does offer bulk-edit functionality that somewhat eases such a task.
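A quick way to check many files for missing series information is to look for series entries in the OPF metadata. The sketch below assumes the Calibre-style convention of `<meta name="calibre:series" .../>` and `<meta name="calibre:series_index" .../>` elements. Whether Kavita reads exactly these tags is not stated in this thread, and the function name `find_series_info` is made up for illustration.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def find_series_info(opf_xml):
    """Return (series, series_index) from Calibre-style <meta> tags, or (None, None)."""
    package = ET.fromstring(opf_xml)
    series = index = None
    for meta in package.iterfind("opf:metadata/opf:meta", OPF_NS):
        name = meta.get("name")
        if name == "calibre:series":
            series = meta.get("content")
        elif name == "calibre:series_index":
            index = meta.get("content")
    return series, index
```

A file whose metadata only carries a title, like the one posted above, would return `(None, None)` here, which marks it as a candidate for a bulk edit.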

Draknars commented 2 years ago

@tjarls Thanks for the confirmation. I’m looking into Calibre as well; I’ve used it in the past when I was actively using an e-ink reader. I’m just not a fan of this solution, since by default it organizes its library location by author, meaning I would have to import my books separately, edit the metadata, then re-export them to their original location. So I will research whether there is another solution available to me.

@majora2007 I thought as much. I completely respect your decision, as it does make sense; it's just unfortunate in my scenario, as it completely broke Kavita for me. For now, I ended up rolling back to 5.6.

I hope you can find some enhancement, but I understand I have to get around to my 5k books eventually.

Thanks to both of you for the assistance, much appreciated!

majora2007 commented 2 years ago

I ended up tagging this as a bug. I have implemented a fix that allows grouping without breaking existing metadata. Again, the correct solution is to use series and series_index, but this should hold you over.