TagStudioDev / TagStudio

A User-Focused Photo & File Management System
GNU General Public License v3.0

[Bug]: Library sometimes deleted when application closed #343

Open · Nividica opened this issue 1 month ago

Nividica commented 1 month ago


TagStudio Version

Alpha v9.3.2

Operating System & Version

Windows 10

Description

Occasionally, when closing Tag Studio via the X button, it wipes the ts_library.json file, leaving it at zero bytes, despite the logs claiming the library was saved.

I spent some time fiddling around with the code and stumbled upon a potential solution: move the call to self.to_json() outside of the with open(...) block in the save_library_to_disk function.

Example

        # If `to_json` is called in the `with` block, the file will sometimes fail to write
        jsonLibrary = self.to_json()

        with open(
            self.library_dir / TS_FOLDER_NAME / filename, "w", encoding="utf-8"
        ) as outfile:
            outfile.flush()
            ujson.dump(
                jsonLibrary,
                outfile,
                ensure_ascii=False,
                escape_forward_slashes=False,
            )

Note that having jsonLibrary = self.to_json() inside the with open(...) block did not fix the issue.

I stumbled onto this while logging the size of jsonLibrary to see if it was actually being populated, and comparing the sizes at different places in the function. (It was consistently the same size, regardless of whether the bytes were written to disk or not.)

With that change in place I was able to go through over a dozen open/close cycles, and it saved the library each time, whereas previously it would fail to save by the time I got to close number 5.

I can't pretend to understand why this seems to solve the issue, so I cannot guarantee the robustness of the proposed fix.

Other Things I Tried

Workaround

During my testing I discovered that if I manually save the library and then immediately close the app, the bug never seems to happen.

Additional Info

Expected Behavior

Data should be written to the library file on exit

Steps to Reproduce

  1. Open Tag Studio
  2. Create a new library
  3. Manually save
  4. Note the size of the save file (mine is ~340KB)
  5. Close Tag Studio
  6. Open Tag Studio
  7. Wait for it to load the library
  8. Close Tag Studio
  9. Check the size of the save file
  10. If the save file is not 0 bytes, repeat steps 6-9

Logs

Log output when it fails to save

Saving Library...
[LIBRARY] Saving Library to Disk...
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Library saved to disk in 0.008 seconds
[SHUTDOWN] Ending Thumbnail Threads...

Log output when it successfully saves

Saving Library...
[LIBRARY] Saving Library to Disk...
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Library saved to disk in 0.031 seconds
[SHUTDOWN] Ending Thumbnail Threads...
Nividica commented 1 month ago

I've continued trying to track down the root cause, and it looks like a timing issue, potentially one outside of the app.

Starting with the original code, I added a print for the file path. Then, when I print the size of the file before and after a call to sleep, the two sizes disagree with one another:

jsonPath = self.library_dir / TS_FOLDER_NAME / filename
print(f"[LIBRARY] Save path: {jsonPath}")
with open(
    jsonPath, "w", encoding="utf-8"
) as outfile:
    outfile.flush()
    ujson.dump(
        self.to_json(),
        outfile,
        ensure_ascii=False,
        escape_forward_slashes=False,
    )
    # , indent=4 <-- How to prettyprint dump
print(f"[LIBRARY] Pre-sleep Size: {jsonPath.stat().st_size} bytes")
time.sleep(0.1)
print(f"[LIBRARY] Post-sleep Size: {jsonPath.stat().st_size} bytes")

On my 3rd close, the bug happened:

[LIBRARY] Saving Library to Disk..
[LIBRARY] Save path: E:\Screenshots\.TagStudio\ts_library.json
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Pre-sleep Size: 348937 bytes
[LIBRARY] Post-sleep Size: 0 bytes
[LIBRARY] Library saved to disk in 0.140 seconds

Then I added a call to sleep just before ujson.dump, and the bug magically went away:

with open(
    jsonPath, "w", encoding="utf-8"
) as outfile:
    outfile.flush()
    time.sleep(0.1)
    ujson.dump(
        self.to_json(),
        outfile,
        ensure_ascii=False,
        escape_forward_slashes=False,
    )
    # , indent=4 <-- How to prettyprint dump
print(f"[LIBRARY] Pre-sleep Size: {jsonPath.stat().st_size} bytes")
time.sleep(0.1)
print(f"[LIBRARY] Post-sleep Size: {jsonPath.stat().st_size} bytes")
[LIBRARY] Saving Library to Disk..
[LIBRARY] Save path: E:\Screenshots\.TagStudio\ts_library.json
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Pre-sleep Size: 348937 bytes
[LIBRARY] Post-sleep Size: 348937 bytes
[LIBRARY] Library saved to disk in 0.227 seconds

Ran this test 10 times, and no bug.

As a final set of tests, I went back to the original code with just the sleep inside the with-block:

with open(
    self.library_dir / TS_FOLDER_NAME / filename, "w", encoding="utf-8"
) as outfile:
    outfile.flush()
    time.sleep(0.1)
    ujson.dump(
        self.to_json(),
        outfile,
        ensure_ascii=False,
        escape_forward_slashes=False,
    )
    # , indent=4 <-- How to prettyprint dump

Ran this test 10 times, and no bug.

So there seem to be two independent "fixes" for the bug:

  1. Move self.to_json() outside of the with-block
  2. Add time.sleep(0.1) just before ujson.dump

I can't wrap my head around what is going on here. Option 2 would suggest there isn't enough time between opening the file and writing to the file. Why would there need to be an artificial wait added between open and write?

If I drastically increase the sleep time for Option 2 to several seconds, I can see that when the file is opened for writing, the existing contents are, as expected, cleared (file size 0); then the data is saved and the file size goes to ~340 KB.

But none of that explains why Option 1 would have any effect at all, and yet I can't repro the bug when self.to_json() is outside of the with-block.

If the source of the bug really is some kind of race condition, then I'm starting to wonder if a true fix for this would be to save a new library file, delete the old library file, then rename the new file.

Nividica commented 1 month ago

I ran with the temp-file idea and have not run into the bug since:

        tempJsonPath = self.library_dir / TS_FOLDER_NAME / "temp.json"
        jsonPath = self.library_dir / TS_FOLDER_NAME / filename

        # Delete temp file if exists
        tempJsonPath.unlink(missing_ok=True)

        # Write to a temp file
        with open(
            tempJsonPath, "w", encoding="utf-8"
        ) as outfile:
            ujson.dump(
                self.to_json(),
                outfile,
                ensure_ascii=False,
                escape_forward_slashes=False,
            )

        # Delete the existing library
        jsonPath.unlink(missing_ok=True)

        # Rename temp file
        tempJsonPath.rename(jsonPath)

In my opinion, of the three solutions, this one feels like the winner.


thrynae commented 2 weeks ago

Just a minor note (which you have probably already thought of yourself): at some point there will be a user who decides to name their database temp, at which point you will have a name collision. Perhaps deriving the temp name from the library filename would be a better choice? Still not guaranteed to be unique, but if you want a stable name, you have to stop somewhere:

    tempJsonPath = self.library_dir / TS_FOLDER_NAME / (filename + ".temp.json")

Or compute a hash of the file name and use that (+ ".json") as the temp file name.
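For illustration, a quick sketch of the hash variant; hashlib and the helper name are my additions, not existing TagStudio code:

    import hashlib

    # Hypothetical helper: derive a stable temp-file name from a digest of
    # the library filename, so a collision with a user's own files is
    # practically impossible (though, as noted above, not strictly guaranteed).
    def temp_json_path(library_dir, ts_folder_name, filename):
        digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
        return library_dir / ts_folder_name / (digest + ".json")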

Leseratte10 commented 2 days ago

Also, looking at the code: if the application crashes between the "unlink" and the "rename" call, there would again be data loss (unless, as suggested, code is added to recover from the temp file in that case).

Maybe instead of unlink and rename (which first deletes the database and then renames the temp one to the real database name) it'd be better to use os.replace.

That way you could call os.replace(temp_db, actual_db) without first deleting the actual database file, and the OS will guarantee (as of Python 3.3) that the move will either fail entirely (keeping the old DB) or succeed entirely (and you'll have the new file saved). That means that there's no need for extra code to recover from a temp file - the database will always be there no matter when the process was interrupted.

Also, at least to me it looks like the code currently in save_library_to_disk could fail on any OS, not just Windows: if the "open" call succeeds (and truncates the existing file) but the JSON dump then fails for whatever reason, the file will probably be corrupted.
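For illustration, a minimal sketch of a save routine along those lines, assuming json_path is a pathlib.Path; the function name is hypothetical and the ujson arguments are copied from the snippets above:

    import os
    import ujson

    def save_library_atomically(json_path, json_library):
        # Dump to a sibling temp file first, so a failed open or dump can
        # never truncate the real library file.
        temp_path = json_path.with_suffix(".temp.json")
        with open(temp_path, "w", encoding="utf-8") as outfile:
            ujson.dump(
                json_library,
                outfile,
                ensure_ascii=False,
                escape_forward_slashes=False,
            )
            outfile.flush()
            os.fsync(outfile.fileno())  # force the bytes to disk before the swap
        # Atomic on both POSIX and Windows (Python 3.3+): either the old file
        # survives untouched or the new one fully replaces it.
        os.replace(temp_path, json_path)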