gotson / komga

Media server for comics/mangas/BDs/magazines/eBooks with API, OPDS and Kobo Sync support
https://komga.org
MIT License

[Bug] ALL BOOK TAGS LOST after power-loss #390

Closed VegethB closed 3 years ago

VegethB commented 3 years ago

Komga environment

Describe the bug

I started Komga after a power loss and ALL tags have disappeared from the books. Furthermore, the tag list looks as if it had been reset to its very first state (with only 10-15 selectable tags). If I add new tags they are NOT saved, and the selectable ones have moved from the "genre" list to the normal Tag list. The HDD where Komga resides (.sqlite etc.) has NO problems.

Log file

https://mega.nz/folder/Yp90lCoD#_blS-pbWggrMv-Bhp5gkGg

gotson commented 3 years ago

Your logs don't show anything abnormal.

Is the drive where you store your libraries a network drive, or an external drive?

VegethB commented 3 years ago

Is the drive where you store your libraries a network drive, or an external drive?

Internal SATA 2.5". The point is that if the HDD were the problem, I would have problems all over the server (that disk holds all the server and app data needed to run the server, e.g. Plex, Emby, Sonarr, Minecraft, IIS, WebMail etc.). If it were broken or had bad sectors, I would notice immediately (the people who use these services would tell me, and the RAID software would immediately report the damage).

My only hypothesis, as someone who knows nothing about the internals: the power went out while Komga was working on the tags table, and the table was seen as corrupt and reset (opening the .sqlite, the book_tags table is empty ... in fact there is not a single tag in my library). What I don't understand is why all the saved tags have disappeared and some have ended up in the list of actual tags (when they are genre tags). I had also activated the backup, but the folder is empty (I probably had to remove "").

The fact is that tags are now not saved, and in any case I have to redo all the tags of each book from scratch (I have lost not only tags but also the descriptions and the in progress / concluded status). In practice, it is as if I had added new books without metadata. If the DB had been reset, I would no longer have users and libraries, which, together with the collections, are the only things left.

Config:

    komga:
      libraries-scan-cron: "* */15 * * * ?" #periodic scan every 15 minutes
    #  libraries-scan-cron: "-" #disable periodic scan
      libraries-scan-startup: false #scan libraries at startup
      libraries-scan-directory-exclusions: #patterns to exclude directories from the scan
        - "#recycle" #synology NAS recycle bin
        - "@eaDir" #synology NAS index/metadata folders
      remember-me:
        key: changeMe! #required to activate the remember-me auto-login via cookies
        validity: 2592000 #validity of the cookie in seconds, here 1 month
      database:
        file: B:\SERVER\Komga\DB\database.sqlite
      database-backup:
        path: B:\SERVER\Komga\BK\database-backup.zip
        schedule: 0 0 */6 * * ? #every 6 hours
        enabled: true
        startup: true
    server:
      port: 🙃

![image](https://user-images.githubusercontent.com/17975351/105579957-d4619600-5d89-11eb-8649-5e4c1d21b6d6.png)

Edit:

After a restart, new tags are now saved ... but in the end I still lost descriptions, tags and read statuses. More old logs: https://mega.nz/folder/YotSnZ7C#MZ1crLnVInPurRWJ1FtiVA

gotson commented 3 years ago

To come back to your initial problem:

In addition, the backup entry in the configuration has been deprecated since v0.48.0. If you need backups, you can use whatever software you like to back up the database.

I understand that you now have normal behavior, even though you lost some of the data you added. There's nothing that can be done about that, unless you made some backups of the database yourself.

VegethB commented 3 years ago
  • logs don't show any issues, so there's no way to know what happened to your book tags

But I saw the logs and they are full of spam from the "delete library of files no longer present" (that should be reduced or removed from the info level; in my opinion it should be reworked and moved to a weekly scheduler where, for example, every Sunday from 3 to 7 all the tasks that clean / optimize the database are run. As with Plex, maintenance tasks should not spam every 3 hours but run periodically, every 7 days or however many you like, configurable from the config). This is why I said before that, in my opinion, the table in use was dropped / reset by the power loss. But checking the DB, some metadata (only 4-6 out of 320 books) remained (such as author / description), as if the DB had regressed to the first few times I used Komga, with those 20-30 manga in the library. In short, something happened but we do not understand how it happened 😵🙃😥.

  • the tag list is generated on the fly from tags on the items (books or series). So if no books have tags anymore, the list of tags will subsequently only show existing tags.

DB:

![image](https://user-images.githubusercontent.com/17975351/105608098-3c23db00-5da2-11eb-8292-e050a3397c96.png) ![image](https://user-images.githubusercontent.com/17975351/105608145-873dee00-5da2-11eb-8ff7-9d5a3d63e529.png) ![image](https://user-images.githubusercontent.com/17975351/105608163-9f157200-5da2-11eb-8634-ddf3e51cd77e.png) ![image](https://user-images.githubusercontent.com/17975351/105608169-a76dad00-5da2-11eb-9e47-4e16b84b8f4f.png) ![image](https://user-images.githubusercontent.com/17975351/105608178-b2c0d880-5da2-11eb-8131-166c8ba42b61.png)

### There:

![image](https://user-images.githubusercontent.com/17975351/105608197-ce2be380-5da2-11eb-981c-11f8dd8b7107.png)

They are considered as "series" by the DB. I noticed the loss of metadata only from the entries now marked as "ENDED" (the last 4 additions). In fact, the marked ones are the ones I edited, which is how I discovered the problem (the only one). I am not asking to get the metadata back (it is evident that it has been lost forever, and without a backup nothing can be done), but at least to understand how this could happen and how the few remaining tags got mixed up.

![image](https://user-images.githubusercontent.com/17975351/105608312-91142100-5da3-11eb-84f2-7ed8a7cdc861.png) ![image](https://user-images.githubusercontent.com/17975351/105608327-a6894b00-5da3-11eb-9c08-9d47849c26a4.png)

There must have been 70-90 tags here ... why were only these saved? (Those below the selected one are new.) I don't understand why the server has to delete tags if they are not associated with anything. They are not like thumbnails, which just waste space; if in the future I add something with that tag, it would bother me to type it in by hand again. Also, the function that spams the logs with:

![image](https://user-images.githubusercontent.com/17975351/105608651-84dc9380-5da4-11eb-943d-f0246ee30559.png)

it does not make sense. Because if I move things, it instantly deletes any associated metadata.

It would be enough to do as Plex does: the file / book / series enters a "trash" state, and its metadata is deleted from the database only if the files have not reappeared by the time the scheduled task runs (for example, every night from 3 to 7: check the library and clean up whatever is no longer there). This system should also solve the problem of changing the root path of a library, because the scanner would just need to change the root_path associated with all the IDs that used that root_path.

Ex: virtualize the FS with IDs: `Root_path1` = `C:\myComics\` (MyComics Library 1). All files inside `Root_path1` would be saved like this: "`\One Piece\Volume1.zip`" or "`\One Piece\Chapter1.zip`" (since you can't use subfolders, it should be even easier to manage this way), with a unique random ID that refers to the One Piece series and a unique random ID that refers to the Volume1.zip file. This is instead of saving them as "`C:\MyComics\OnePiece\Volume1.zip`", where if you change `root_path` you cannot update it automatically but only by modifying each entry of the DB yourself. The server then joins the path of `root_path1` with all the paths associated with it. At this point the metadata for OnePiece volume1.zip are associated with the ID of "OnePiece" and the ID of "Volume1.zip", which have not changed, so you won't lose anything. If one of the two IDs changes or is deleted, then the associated metadata are also deleted.

If I renamed "volume1.zip" in "onePiece", I would inevitably lose all its metadata. It is not like a TV series, where you know it officially has 12 episodes, so the DB creates 12 independent IDs that are then filled in because the file name contains "S01E01"; regardless of the file name and path, the server knows that file is "E01" and that the new path / name must be updated on that specific unique ID for episode 1.
In other words: suppose manga also followed a renaming rule: "`Series Name - Volume Number + Chapter Number - What you want / release group name / language / any other useful parameter - whatever, eg chapter title / volume`" = "`One Piece 1997 - v98c986 - Translation Team1 - ENG - Luffy vs All part 1.zip`". When creating the One Piece series, the server would also create a unique ID that corresponds to chapter 986 of that series. That ID is chapter 986 of the One Piece series; even if I delete the chapter, the ID remains in the DB because it is chapter 986. Simply, since the physical file can no longer be found, it ends up in a "trash" state, and when the maintenance / cleaning schedule runs (every week from 3 to 7, because I set it that way in the Komga server), everything concerning it (including the ID) is removed from the DB.

Ex:

  • `Root_path1` ID = 5 (`C:\MyComics\`)
  • `One Piece` ID = 20 (`\One Piece\`)
  • ID `Chapter 986 associated with ID 20` = 9999 (`One Piece - v98c986 - anything.zip`)
  • `5 + 20 + 9999` = `C:\MyComics\` + `One Piece\` + `One Piece - v98c986 - anything.zip`

What happens if I change `root_path1`? Nothing impossible: it stays `5 + 20 + 9999`, except that the new path for `root_path1` will be "`D:\Manga\`"; the ID is 5 and remains 5. This is the mystery behind the mystical way in which Plex / Emby let you change the root_path of a library or the filename without losing the associated metadata.

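
As a rough illustration of the ID-based layout proposed above, here is a minimal Python sketch. The `Library`, `Series` and `Book` classes and the `resolve` helper are hypothetical names invented for this example; they do not describe how Komga, Plex or Emby actually store paths. The absolute root is stored once per library, and everything else is stored relative to it.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass
class Library:
    id: int
    root_path: str  # the only place where an absolute path is stored


@dataclass
class Series:
    id: int
    library_id: int
    relative_dir: str  # folder name relative to the library root


@dataclass
class Book:
    id: int
    series_id: int
    file_name: str


def resolve(book, series_by_id, library_by_id):
    """Join library root + series folder + file name into an absolute path."""
    series = series_by_id[book.series_id]
    library = library_by_id[series.library_id]
    return PureWindowsPath(library.root_path) / series.relative_dir / book.file_name


# IDs taken from the example above: 5 + 20 + 9999.
libraries = {5: Library(5, "C:\\MyComics")}
series = {20: Series(20, 5, "One Piece")}
book = Book(9999, 20, "One Piece - v98c986 - anything.zip")

before = str(resolve(book, series, libraries))

# Moving the library only touches one field; the IDs stay the same.
libraries[5].root_path = "D:\\Manga"
after = str(resolve(book, series, libraries))
```

Because metadata would hang off the IDs (20 and 9999 here) rather than the absolute path, changing the root would not orphan it in this model. Renaming a file inside the library is a different problem that this sketch does not address.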
  • you mention tags moving from Genre to Tags, but Genre is a Series metadata, and you mentioned books. Can you clarify that part?

Simply, some of the genre tags (action, manga, fantasy) ended up in the list of tags (gore, only male protagonist, swordplay, magic etc.), but by now I have solved it by cleaning the .sqlite by hand and starting practically from 0.

I want to specify one thing: It does not want to be a comment of insults / outbursts but only of clarification and possible feature request.

gotson commented 3 years ago

But I saw the logs and they are full of spam from the "delete library of files no longer present"

Indeed, that was in your "old" log. Next time try to pinpoint logs that are meaningful, I don't necessarily have the time to skim through thousands of lines of log.

I think there is a regression introduced from that change, where before the scanner would throw an exception if the library path was not reachable, while now it swallows the exception and returns 0 Series, hence deleting the whole library.
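
The failure mode described here can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Komga's actual scanner code: `scan_swallowing`, `scan_strict`, `reconcile` and `LibraryUnreachable` are hypothetical names made up for this example.

```python
import os
import tempfile


class LibraryUnreachable(Exception):
    """Raised when the library root cannot be listed at all."""


def scan_swallowing(root):
    """Regressed behaviour: an unreachable root looks like an empty library."""
    try:
        return sorted(os.listdir(root))
    except OSError:
        return []


def scan_strict(root):
    """Earlier behaviour: an unreachable root aborts the scan."""
    try:
        return sorted(os.listdir(root))
    except OSError as exc:
        raise LibraryUnreachable(root) from exc


def reconcile(known_series, scan, root):
    """Keep only the known series that the scan still reports."""
    found = set(scan(root))
    return {name: meta for name, meta in known_series.items() if name in found}


# Simulate a drive that is offline: a directory that no longer exists.
missing_root = tempfile.mkdtemp()
os.rmdir(missing_root)

db = {"One Piece": {"tags": ["swordplay"]}}

wiped = reconcile(db, scan_swallowing, missing_root)  # every series is dropped

try:
    reconcile(db, scan_strict, missing_root)
    aborted = False
except LibraryUnreachable:
    aborted = True  # nothing is reconciled, so nothing is deleted
```

With the swallowing variant, "0 series found" is indistinguishable from "the drive is offline", so the reconcile step removes everything. With the strict variant the scan stops before any deletion can happen.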

I have created #392 to track that problem.

As to what happened, clearly your folders were not accessible when the scan happened:

2021-01-16 16:45:06.099  WARN 16224 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.FileSystemScanner   : Could not access: G:\folder\IMG\Manga-Hentai\RaMa

The deletion is a normal process, but there is a feature request for a trash bin, to handle such cases. See #217.
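
For illustration only, the trash-bin idea discussed in this thread could look roughly like the Python sketch below. It is not the design tracked in #217; the `Catalog` class, its methods and the 7-day grace period are assumptions made up for this example.

```python
from datetime import datetime, timedelta


class Catalog:
    """Keeps metadata for missing files until a grace period has passed."""

    def __init__(self, grace):
        self.grace = grace
        self.metadata = {}    # path -> metadata
        self.trashed_at = {}  # path -> time the file was first seen missing

    def add(self, path, meta):
        self.metadata[path] = meta

    def scan(self, present_paths, now):
        """Mark missing files as trashed and restore the ones that came back."""
        for path in self.metadata:
            if path in present_paths:
                self.trashed_at.pop(path, None)
            else:
                self.trashed_at.setdefault(path, now)

    def purge(self, now):
        """Scheduled cleanup: delete only what has been missing long enough."""
        expired = [p for p, t in self.trashed_at.items() if now - t >= self.grace]
        for path in expired:
            del self.metadata[path]
            del self.trashed_at[path]
        return expired


path = "One Piece/Volume1.zip"
catalog = Catalog(grace=timedelta(days=7))
catalog.add(path, {"tags": ["swordplay"]})

t0 = datetime(2021, 1, 16, 16, 45)
catalog.scan(set(), t0)                                  # drive offline
purged_early = catalog.purge(t0 + timedelta(days=1))     # still within grace
catalog.scan({path}, t0 + timedelta(days=1))             # drive is back
restored = path not in catalog.trashed_at

catalog.scan(set(), t0 + timedelta(days=2))              # file really removed
purged_late = catalog.purge(t0 + timedelta(days=10))
```

In this model a temporarily unreachable drive costs nothing, because the metadata is only removed once the file has stayed missing for the whole grace period.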

Why was not everything wiped? In the logs I can see that only 6 of your libraries were not accessible, and wiped. Those libraries seemed to be on your G:\ drive. The ones on your I:\ and H:\ drives were accessible, and not wiped. That explains why some data was still there.

I don't understand why the server has to delete tags if they are not associated with anything

Tags are not stored in a tags table. They are denormalized and stored per book or series. When you delete a book or series, the associated tags with that book/series are deleted, as any other metadata, and thumbnail.
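
To make the consequence concrete, here is a small Python sketch using the standard `sqlite3` module. The `book` and `book_tag` tables are a simplified, hypothetical schema written for this example and are not Komga's real table definitions. The list of available tags is just a `SELECT DISTINCT` over the per-book rows, so it shrinks as soon as books are deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Hypothetical layout: one row per (book, tag), no standalone tag list.
    CREATE TABLE book_tag (
        book_id INTEGER NOT NULL REFERENCES book(id) ON DELETE CASCADE,
        tag     TEXT    NOT NULL
    );
""")


def available_tags(conn):
    """The tag list is derived on the fly from the tags still in use."""
    rows = conn.execute("SELECT DISTINCT tag FROM book_tag ORDER BY tag")
    return [row[0] for row in rows]


conn.execute("INSERT INTO book VALUES (1, 'Volume 1'), (2, 'Volume 2')")
conn.executemany(
    "INSERT INTO book_tag VALUES (?, ?)",
    [(1, "gore"), (1, "magic"), (2, "magic"), (2, "swordplay")],
)
tags_before = available_tags(conn)

conn.execute("DELETE FROM book WHERE id = 2")  # its tag rows go with it
tags_after = available_tags(conn)

conn.execute("DELETE FROM book")  # e.g. a whole library is wiped
tags_empty = available_tags(conn)
```

A tag used by a single book ("swordplay" here) disappears from the suggestions as soon as that book is removed, and wiping every book leaves no tags at all.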

it does not make sense. Because if I move things, it instantly deletes any associated metadata.

See #217.

It does not want to be a comment of insults / outbursts but only of clarification and possible feature request.

No problem, thanks for clarifying. English is not my primary language, and it doesn't seem to be yours either. Hope I managed to clarify what happened.

VegethB commented 3 years ago

Indeed, that was in your "old" log. Next time try to pinpoint logs that are meaningful,

ALL logs are like this. That's why I sent them all. I too lost 4 hours reading them all in Visual Studio. What I wanted to point out is that the "scan" message is spammed to the point that it fills 95% of the logs. I think it would be better to find a way to reduce that output. For example, if the scan finds no problems (0 changes), nothing is written to the output. Obviously, thanks to Visual Studio it was easy for me to strip all that spam from the logs. That's all.

Tags are not stored in a tags table. They are denormalized and stored per book or series. When you delete a book or series, the associated tags with that book/series are deleted, as any other metadata, and thumbnail.

And that's a problem though ... Why is there no tags table? I don't see the disadvantage of a tags table compared to storing tags on an arbitrary book, where if one day I delete that book I lose the tag too. In fact, combined with problem #392, here is the devastating combo: 70-90 lost tags.

In this case I open a new issue as a feature request to ensure that the tags are saved in a dedicated table and that table is used by komga when it has to suggest the tags. I assume the advantage of the current system (I haven't tried, so I might be wrong) is that you show all tags used only in that library (so you don't see tags used in other libraries). Otherwise, honestly, I don't understand the benefit.

One solution is to create 2 tables for the 2 Genres:

  • Table1 Book:
      • Table1-1 Genres
      • Table1-2 Tags
  • Table2 Series:
      • Table1-1 Genres
      • Table1-2 Tags

and then populate the tables when I add the tags in the various sections in the meta. Example: I add Fantasy and action in the genres of a series and swordplay, male protagonist in its tags: These tags are added with a unique ID (which will be used to recall them in the future and associated with the metadata of the books / series) to their respective tables.

Table 1-1 Genres (in Table2 Series table)

| ID | Name |
| --- | --- |
| 55 | Fantasy |
| 56 | action |

and so when in a new book / series I go into the genre tags, if I click on the suggestion "Fantasy", that metadata is added in the tags_generi category ID 55 (which komga will then display as Fantasy).
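
A minimal sketch of this proposal, again using Python's standard `sqlite3` module. The `series`, `genre` and `series_genre` tables are hypothetical names chosen for this example and are not part of Komga's schema. Genres live in their own table with stable IDs, and a link table associates them with series.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE series (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Dedicated lookup table: a genre exists independently of any series.
    CREATE TABLE genre (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE series_genre (
        series_id INTEGER NOT NULL REFERENCES series(id),
        genre_id  INTEGER NOT NULL REFERENCES genre(id),
        PRIMARY KEY (series_id, genre_id)
    );
""")


def suggestions(conn):
    """Suggestions come from the lookup table, not from what is in use."""
    return [row[0] for row in conn.execute("SELECT name FROM genre ORDER BY id")]


def genres_of(conn, series_id):
    rows = conn.execute(
        "SELECT g.name FROM genre g "
        "JOIN series_genre sg ON sg.genre_id = g.id "
        "WHERE sg.series_id = ? ORDER BY g.id",
        (series_id,),
    )
    return [row[0] for row in rows]


conn.execute("INSERT INTO series VALUES (20, 'One Piece')")
conn.executemany("INSERT INTO genre VALUES (?, ?)", [(55, "Fantasy"), (56, "action")])
conn.executemany("INSERT INTO series_genre VALUES (20, ?)", [(55,), (56,)])
linked = genres_of(conn, 20)

# Deleting the series removes only the links; the genre rows survive.
conn.execute("DELETE FROM series_genre WHERE series_id = 20")
conn.execute("DELETE FROM series WHERE id = 20")
remaining = suggestions(conn)
```

The trade-off of this layout is that unused genres or tags stay in the suggestions until someone removes them explicitly, which is the behaviour requested in this comment.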

It's a guess, surely you will know how to use a better method.

Anyway, thank you for the support and patience to keep up with me 👍