jellyfin / jellyfin

The Free Software Media System
https://jellyfin.org
GNU General Public License v2.0
35.5k stars 3.22k forks source link

nfo saver makes unneccesary disk writes during library scan #12978

Open srosorcxisto opened 3 weeks ago

srosorcxisto commented 3 weeks ago

This issue respects the following points:

Description of the bug

When re-scanning a library using the .nfo saver, all .nfo files are updated even if there were no changes.

While mostly benign, this does result in unnecessary disk writes and cause some backup software and software-based raid systems (like SnapRAID) to treat the file as new because of the changed timestamp. I am not sure if this is a bug for feature request, so apologies if it is misfiled.

I believe this operation is being handled by SaveToFileAsync () on line 204 of jellyfin/MediaBrowser.XbmcMetadata/Savers/BaseNfoSaver.cs

I am not a C# programmer, nor am I familiar with the codebase and am therefore uncomfortable submitting a pull request on my own, but believe that this could be improved without noticeably impacting scan performance by adding a simple CRC32 hash to compare the stream with the existing file before writing.

Here is my attempt at a patch if helpful, I am not confident that it will work as written, but hate pointing out problems without at least suggesting a solution:

         private async Task SaveToFileAsync(Stream stream, string path)
         {
             var directory = Path.GetDirectoryName(path) ?? throw new ArgumentException($"Provided path ({path}) is not valid.", nameof(path));
             Directory.CreateDirectory(directory);
+       // Compare metadata hashes before proceeding
+       if (File.Exists(path))
+       {
+           byte[] existingFileHash;
+           using (var existingFileStream = File.OpenRead(path))
+           {
+               existingFileHash = new Crc32().ComputeHash(existingFileStream);
+           }
+
+           byte[] newContentHash;
+           using (var crc32 = new Crc32())
+           {
+               newContentHash = crc32.ComputeHash(stream);
+           }
+
+           stream.Position = 0;
+
+           if (existingFileHash.SequenceEqual(newContentHash))
+           {
+               return; // Don't save since .nfo is unchanged
+           }
+       }
+
            // On Windows, saving the file will fail if the file is hidden or readonly
            FileSystem.SetAttributes(path, false, false);

            var fileStreamOptions = new FileStreamOptions()
            {
                Mode = FileMode.Create,
                Access = FileAccess.Write,
                Share = FileShare.None,
                PreallocationSize = stream.Length,
                Options = FileOptions.Asynchronous
            };

            var filestream = new FileStream(path, fileStreamOptions);
            await using (filestream.ConfigureAwait(false))
            {
                await stream.CopyToAsync(filestream).ConfigureAwait(false);
            }

            if (ConfigurationManager.Configuration.SaveMetadataHidden)
            {
                SetHidden(path, true);
            }
        }

Reproduction steps

  1. Monitor the filesystem for changes (for example using auditd or inotifywait on Linux). Alternatively, just look at the timestamps for .nfo files before and after a library scan.
  2. Ensure that .nfo saver is for the library and that it has previously saved files
  3. Wait for a scheduled library scan or manually kickoff the job
  4. Observe that all .nfo get re-saved with new timestamps

What is the current bug behavior?

After each library scan with nfo saver on, all .nfo files are recreated regardless of changes to the metadata.

What is the expected correct behavior?

That only new and changed .nfo files are added to the filesystem.

Jellyfin Server version

10.10.0+

Specify commit id

No response

Specify unstable release number

No response

Specify version number

No response

Specify the build version

10.10.1

Environment

Jellyfin logs

None relevant

FFmpeg logs

None relevant

Client / Browser logs

None relevant

Relevant screenshots or videos

None relevant

Additional information

felix920506 commented 3 weeks ago

Please submit a PR if you have a fix.

srosorcxisto commented 3 weeks ago

I have the outline of a fix, but am not familiar enough with C# to test or fully implement in a PR.