maforget / ComicRackCE

A Community Edition for the legendary Comic Book Manager ComicRack. ComicRack is back from the dead.
GNU General Public License v2.0
213 stars 20 forks source link

Option: Book files are updated automatically --> Superfluous "update non-DB files" dialog #45

Closed flaser closed 2 months ago

flaser commented 4 months ago

Describe the bug When the option Book files are updated automatically is enabled, all changes are automatically committed to the files. Thus the Updates non-DB files Dialog is superfluous as changes are already committed. Worse, it may mislead the User, giving the impression that the option normally doesn't affect Non-DB files.

This behavior was discovered while investigating XML deserializaiton issues. If deserializaiton fails, ComicInfo is marked as dirty. Together with the option above CR will flush the ComicInfo.xml. Behaviour was triggered by intentionally mangling XML opening/closing tags so they're unpaired to check error handling.

~CR will always updates malformed ComicInfo.xml, even if the user chooses not update files outside the DB.~

Exact Steps to Reproduce

  1. Enabled option: Book files are updated automatically
  2. Make changes to Non-DB files
  3. (Optional - check with 7-zip that ComicInfo.xml is updated whenever a file's ComicInfo is marked dirty - whether in-/non-DB)
  4. Close CR, answer Updates non-DB files Dialog Yes or No
  5. Check with 7-zip that ComicInfo.xml was already updated earlier (older time-stamp) regardless Dialog choice

Version/Commit (check the about page, next to the version, for the string between brackets):

maforget commented 4 months ago

Are those malformed tags, Unknown tags are does it do it will regular tags?

flaser commented 4 months ago

It happens with both as far as I can tell, e.g. it seems any unclosed tag can break parsing.

maforget commented 4 months ago

How can you tell it updates the file, if you haven't done any changes to it? Is it not just the same as it was?

flaser commented 4 months ago

Here are my test steps:

  1. I open the CBZ file in 7-zip and edit it's ComicInfo.xml
    • I rename an opening or closing tag
    • E.g. <Writer> ... <Writer> ==> <Writer> ... <Write>
  2. I don't add the CBZ file to the DB, merely open the file with CR. (Double-click in explorer)
  3. I try and open the "Info" tab and can confirm no metadata is there (only stuff guessed from file-name).
  4. I make sure not to click Apply (haven't checked yet whether that triggers update), but click Cancel instead.
  5. I close CR
  6. I open the CBZ file with 7-zip and check the ComicInfo.xml.

Every time parsing fails - due to mismatch of XML tags - CR ends up flushing the file and replacing its contents with whatever it could gleam from the filename, even when I explicitly choose not to update non DB files.

I've just tested and can confirm it happens with both Parsed, and Unknown tags.

flaser commented 4 months ago

UPDATE:

maforget commented 4 months ago

What is happening is that the deserialization catches an error so it throws an exception so it just returns a null and then the auto discovered data makes that there was some change to the ComicInfo setting it to Dirty and prompting the message. You also need to have the Setting to Update files to on.

What I don't get is that the file isn't updated when you press No. It stays like it was malformed.

So the first part about the malformed XML, Maybe there is some kind of option we can pass to it to maybe ignore it. If it is possible I don't know if it is a good idea to allow that. The XmlUtility system is used by the Database and settings too. https://github.com/maforget/ComicRackCE/blob/8da6126c30fbee5039035aafed2b291992364940/ComicRack.Engine/ComicInfo.cs#L1461-L1471

But the second part about it being updated even when telling know I can't reproduce, the file is untouched and the code goes over the part about writing to the file. Now if someone would update multiple file and say yes the malformed on would be lost, It might be inevitable. https://github.com/maforget/ComicRackCE/blob/8da6126c30fbee5039035aafed2b291992364940/ComicRack/MainForm.cs#L1079-L1094

Maybe you have something that locked the file in the failed state? The fact you mentionned a temp file next to it, might be a cue.

flaser commented 4 months ago

I tried to isolate the issue using VS debugger. This is the situation when I navigated to the file in the folder view, right after Deserialize was called on the file.

Already, CR has created a .tmp file with a stipped ComicInfo.xml inside.

Screenshot 2024-02-22 122243

Inside CR, the file is marked as having changes to commit to file:

Screenshot 2024-02-22 122400

Closing the app in debug mode did not overwrite the file, but left the .tmp one behind. (Ah, is this a Heisenbug?) It does do this in release though.


Hmm... maybe it's this setting that's causing issues? Book files are updated automatically

Screenshot 2024-02-22 122550

YES!

With the option disabled the file is not updated. I think what's happening here is this option will force-update files no matter what even while CR is running. (I think the presence of .tmp file and whether it replaces the original or not is down to timeout issues I may be causing in debug).

Therefore the Save.... not in database? dialog on exit is moot as the damage was already done. I think we need to clarify how the option Book files are updated automatically treats non-DB files:

  1. Don't automatically update them (chaged) - ask whether User wants to update (current behavior)
    • Would just need an extra check in the update logic to see if the file is in DB
  2. Update them like other files (current behavior) - don't ask the User whether to update them (changed)
    • Save.... not in database? dialog should be disabled based on the Book files are updated automatically option
  3. Split Book files are updated automatically option for DB and on-DB files
    • Most work (redesign winform, update logic checks, wire-up to UI, etc)
    • Is this a worthwhile User Story? (e.g. do people want to automatically update files not in their DB?)

I'm tempted to go with Option 1. as it seems to restore the intent behind the Dialog and and is less work.

flaser commented 4 months ago

It seems the setting is checked only here:

https://github.com/maforget/ComicRackCE/blob/e1b9a0ec3581952315126cea727c4d88b196dba7/ComicRack.Engine/QueueManager.cs#L487-L507

I'm trying to figure out whether it's better to do amend the 1st or the 2nd check.

flaser commented 4 months ago

OK, I'm starting to understand the behavior. I think the UI design is a bit inconsistent:

Book files are updated automatically ON:

Book files are updated automatically OFF:

Having thought about it more, I think it makes more sense to go with Option 2: disable the dialog if the option is ON. I've updated the Opening post to reflect my findings and clarify what I think the issue is.

I think failed deserialization behavior should still be addressed, but elsewhere. I moved the relevant data to this new issue: https://github.com/maforget/ComicRackCE/issues/47#issue-2149089206

maforget commented 4 months ago

Ok I will explain how the app is suppose to work, I don't believe there is any problem here:

Now the reason when you stop the debug it doesn't asks, is that it isn't the same as closing the program normally. It's like force closing it, so it never executes it's closing tasks, like saving the settings, db, etc.

So for files in the database:

For files NOT is the database:

The program asks "Save changed information for Books that are not in the database?" What it means is: You made changes to the data but these were not saved in the database and if I don't update these files your changes will be lost. AKA: I got data that might be lost since it wasn't saved anywhere else.

As for the AutoUpdateComicsFiles, I have never had it on, but based on your observations, what I believe happens is that it will create a temp file. Then maybe at regular interval, it will update the actual file. At the end it prompts to save, because it detects a book asDirty (unsaved changes). The piece of code you posted uses a delegate that means is that it's not executed right away but somewhere down the line by the QueueManager. And as you can see if the option has AutoUpdateComicsFiles then the file is updated and Dirty is turned off.

So when it asks you at the end, it only does if there are Dirty files, that is the place you should be checking. Why does it report the file as being dirty if it was updated? So if we merge your change in #48 all those that have changes and have the AutoUpdateComicsFiles option turned on will always loose the changes they made.

flaser commented 4 months ago

TL;DR:

those that have changes and have the AutoUpdateComicsFiles option turned on will always loose the changes they made.

It's actually the opposite:

What's wrong with the Save.... not in the database? modal dialog

Change #48


On DB vs Non-DB files

Yes, I do understand we only ever deal with the ComicInfo.xml and nothing else. However it's Modal Dialog that hints at different behavior for files not added to the library. You're correct that the primary purpose is to let the User know that for the latter their changes will be lost unless committed to the file.

The problem is that the behavior hinted by the Dialog is not honored if AutoUpdateComicsFiles is enabled.


On AutoUpdateComicsFiles

AutoUpdateComicsFiles does not create .tmp files. Instead ut does exactly what it says:

Now the reason when you stop the debug it doesn't asks, is that it isn't the same as closing the program normally. It's like force closing it, so it never executes it's closing tasks, like saving the settings, db, etc.

I actually properly closed CR every time, (e.g. I did not use the red square to stop execution) to trigger OnFormClosing. I was using break-points though, which halt execution and could easily cause time-outs: like 7-zip waiting to get a write handle on the CBZ file. (I assume created the updated archive is using a read handle).


On delegates

The piece of code you posted uses a delegate that means is that it's not executed right away but somewhere down the line by the QueueManager.

Now this one I have to argue: Delegate does not guarantee async execution.

It can be can be a means of registering a callback and callbacks invocation can be distinct from assignment. (E.g. the callback executes when the encapsulating function does, not when we assign the pointer). https://learn.microsoft.com/en-us/dotnet/standard/asynchronous-programming-patterns/calling-synchronous-methods-asynchronously

However, a lot of the time delegates are invoked synchronously when used as an alternative to a lambda expression: Unlike the older, C# one, this guide best explains : https://learn.microsoft.com/en-us/dotnet/standard/delegates-lambdas

Here's that's exactly what's happening, we supply the definition of an anonymous function that the AutomaticProgressDialog will invoke when it fires. Just because QueueManager.WriteInfoToFileWithCacheUpdate is invoked here (within this delegate) doesn't mean it will be executed at a later time. In fact stepping through execution in Debug shows that's it's being directly called.

maforget commented 4 months ago

I really don't want to argue and really don't have the time for that. here are the bullet points:


Also I didn't mean that the delegate is like async, what I mean is that it will me called when it's invoked. This is a QueueManager, so it queues updates. It can be done later. If you follow the execution path, it will sleep in a while loop for 1 second until the images pool is done, before invoking the delegate. And since it's a queue it can be done hundreds of files later.

I will give you an example I once created a scheduler that updated every day at a specific time, here the delegate (or lambda, they are basically the same) is ReloadData(), so at 15:05 that method is called. It isn't async, it just registers something to do at that time and the delegate is just what to do.

this.NewDailyScheduler(() => ReloadData(), 15, 05, DailyScheduler.Scheduler.EveryDay);

Also I rarely seen 7-zip .tmp file like that, when they do, the disappear almost instantly. If you have it open while CR is open it will lock the file and prevent any changes. Are you certain this isn't related to the dirty flag not being removed?

I think the proper test procedure would be: