monstermuffin / mergerfs-cache-mover

Python script for moving files on a cached disk to a backing pool with mergerFS
GNU General Public License v3.0
15 stars 5 forks source link

[Bug] Couple issues to fix #5

Closed tigattack closed 1 month ago

tigattack commented 1 month ago

As with #4 I'm happy to do these myself, just noting so that at least one of us will remember.

Ordered by priority:

monstermuffin commented 1 month ago

On the todo for sure.

  1. Yeah, this was accidental as I just implemented the Python method for moving files, good catch.

  2. Indeed has been a problem for a while, has to do with the threading. There are several ways to fix this that I've been thinking about:

    • Implement a rate limiter for logging cache usage, perhaps only logging every X seconds or after Y files processed.
    • Use a separate thread for logging that checks and logs the cache usage at regular intervals, independent of the file gathering process.
    • Only log cache usage at the start and end of the file gathering process, with perhaps one or two intermediate checks.

None of them really seem like good options to me, open to your thoughts?

On the topic of this issue though, there is also an 'issue' wherein the active threads will spam the log if they have finished their file moves and there are none left, but other threads are still active. This spams Reached target percentage in the log.

Again, I'm not entirely sure how to fix this right now, the current train of thought is to trigger a global flag once the first thread reaches the target, which signals the other threads once they are complete, killing them. This should work but I haven't had a chance to implement and test.

  1. Will add a check for this, cheers.
tigattack commented 1 month ago

Re. logging cache usage, I'm not particularly experienced with threading and wouldn't trust my knowledge of best practices/common gotchas around this, but my instinct is to go for the second option (logging at intervals from a dedicated thread). I imagine there are better options, but nothing comes to mind right now without the source fresh in mind.

Re. logging Reached target percentage, a global, whilst likely not the perfect solution, sounds like it would do the job fine. I'd say probably go ahead with that and I can take a deeper look to see if there's a better way of handling it when I have some time to dedicate to this and get on with some other bits I had in mind (which, typically, I've now forgotten, but iirc I made some notes for myself).

monstermuffin commented 1 month ago

All these issues should be fixed in the latest dev commit cd09fb208596bfbcf97ad99b6d2020c1b100e3b1.

I'll await your confirmation 😄

tigattack commented 1 month ago

Looking good on all counts, ty! ❤️