maforget / ComicRackCE

A Community Edition for the legendary Comic Book Manager ComicRack. ComicRack is back from the dead.
GNU General Public License v2.0
213 stars 20 forks source link

Damaged/malformed XML contents flushed by CR w/o notice #47

Open flaser opened 4 months ago

flaser commented 4 months ago

Describe the bug Currently if XML deserialization fails, CR will attempt to re-popoulate metadata from the file-name alone. Since this counts as "new" metadata the ComicInfo is marked as dirty. Depending on settings, the User can then commit this to the file w/o being aware of the potential data-loss.

Exact Steps to Reproduce

  1. I open a CBZ file in 7-zip and edit it's ComicInfo.xml
    • I rename an opening or closing tag
    • E.g. <Writer> ... <Writer> ==> <Writer> ... <Write>
  2. In CR I open the file and check the "Info" tab and can confirm no metadata is there (only stuff guessed from file-name).
  3. I save the data to file in some fashion
  4. I open the CBZ file with 7-zip and check the ComicInfo.xml - it only has data gleamed from the filename

Version/Commit (check the about page, next to the version, for the string between brackets):

Additional context The User should be made explicitly aware of XML parsing issues. I propose the following behavior:

maforget commented 4 months ago

You will need to catch the correct exception, right now it only throws a InvalidOperationException not really descriptive. The Inner Exception is better. Maybe instead of returning a null, create an empty ComicInfo, with a property that states the Error Message and an IsError or something.

You also need to differentiate, between the other catch a level up that also returns null or if the file doesn't exists at all. Then probably catch the SetInfo to set something there also, so it doesn't just return null or set all the data. Then also the check if there is the info saved as a sidecar (.xml beside the file) or in the NTFS alternate stream. Then prevent any changes to the file for this exact reason , in all the place where files are written (On Update, On Exit, On Export, etc). Then it might just be for files not in the DB, because if they are then it would ask you to update the file and problem would be fixed. Wire all this up in the UI also to prevent any changes.

All that for an hypothetical malformed XML to not loose any changes. There are also big chances that any software will also discard it. It's malformed, we shouldn't jump through hoops to prevent some data from being lost.

Even if we use something like XmlSerializer.CanDeserialize, the ComicInfo would be null also, so back to the original problem.

It's a valid issue, so I will keep this up, but I see it as low priority, because the chances it happens are low and if it does you loose a little bit of data. You also have the NTFS as a backup in that case. I will personally not check it more than needed.