mtkennerly / ludusavi

Backup tool for PC game saves
MIT License
2.33k stars 54 forks source link

Backup and restore does not record / recreate file modification times #132

Closed sluedecke closed 1 year ago

sluedecke commented 1 year ago

Ludusavi version

v0.13.1

Operating system

Linux

Installation method

Other

Description

Factorio offers the option to continue the game with the last saved game. It identifies it by looking at the file modification. Ludusavi does not record the file modification time during backup and does not set it during restore and Factorio offers a seemingly randomly selected savegame to continue.

E.g. that means

$ cd Factorio
$ ls saves
...
-rw-r----- 1 MYUSER MYUSER  6176539 Sep 30 07:38 atomic 01.zip
-rw-r--r-- 1 MYUSER MYUSER 22911720 Sep 30 07:29 free 02.zip
...
$ ludusavi backup Factorio
$ ludusavi restore Factorio
$ ls saves
...
-rw-r----- 1 MYUSER MYUSER  6176539 Sep 30 07:39 atomic 01.zip
-rw-r--r-- 1 MYUSER MYUSER 22911720 Sep 30 07:39 free 02.zip
...

Logs

No response

mtkennerly commented 1 year ago

Thanks for catching this. Interestingly, on Windows, it does preserve the times, for both simple and zip backups. (edit: re-tested, and it doesn't preserve for zip after all)

Would you like to try working on this? I'm happy to fix it, but it should be an isolated change, so it would be a good one if you're interested.

Relevant functions in layout.rs:

I'd like to see a new StrictPath::copy_to function (path.rs) that takes care of using the filetime crate automatically (e.g., get original file's modified time, copy to new location, use filetime to update the copy). So std::fs::copy(path1, path2) would become path1.copy_to(path2).

sluedecke commented 1 year ago

Thanks for the pointers to the code, I will give it a try.

sluedecke commented 1 year ago

I have a rust related question. I am trying several options - part of learning how to do it best - and my current approach is to include a field mtime in IndividualMappingFile which does not has the traits Default and Serialize:

#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, serde::Serialize, serde::Deserialize)]
pub struct IndividualMappingFile {
    pub hash: String,
    pub size: u64,
    pub mtime: FileTime,
}

cargo complains with:

-*- mode: cargo-process; default-directory: "~/Projekte/contributing/ludusavi/" -*-
Cargo-Process started at Fri Sep 30 10:58:30

/usr/bin/cargo build --manifest-path /home/MYUSER/Projekte/contributing/ludusavi/Cargo.toml  
   Compiling ludusavi v0.13.1 (/home/MYUSER/Projekte/contributing/ludusavi)
error[E0277]: the trait bound `FileTime: std::default::Default` is not satisfied
   --> src/layout.rs:229:5
    |
225 | #[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, serde::Serialize, serde::Deserialize)]
    |                        ------- in this derive macro expansion
...
229 |     pub mtime: FileTime,
    |     ^^^^^^^^^^^^^^^^^^^ the trait `std::default::Default` is not implemented for `FileTime`
    |
    = note: this error originates in the derive macro `Default` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0277]: the trait bound `FileTime: Serialize` is not satisfied
    --> src/layout.rs:229:5
     |
229  |     pub mtime: FileTime,
     |     ^^^ the trait `Serialize` is not implemented for `FileTime`
     |
     = help: the following other types implement trait `Serialize`:
               &'a T
               &'a mut T
               ()
               (T0, T1)
               (T0, T1, T2)
               (T0, T1, T2, T3)
               (T0, T1, T2, T3, T4)
               (T0, T1, T2, T3, T4, T5)
             and 192 others
note: required by a bound in `cli::_::_serde::ser::SerializeStruct::serialize_field`
    --> /home/MYUSER/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.144/src/ser/mod.rs:1899:12
     |
1899 |         T: Serialize;
     |            ^^^^^^^^^ required by this bound in `cli::_::_serde::ser::SerializeStruct::serialize_field`

error[E0277]: the trait bound `FileTime: Deserialize<'_>` is not satisfied
    --> src/layout.rs:229:5
     |
229  |     pub mtime: FileTime,
     |     ^^^ the trait `Deserialize<'_>` is not implemented for `FileTime`
     |
     = help: the following other types implement trait `Deserialize<'de>`:
               &'a [u8]
               &'a std::path::Path
               &'a str
               ()
               (T0, T1)
               (T0, T1, T2)
               (T0, T1, T2, T3)
               (T0, T1, T2, T3, T4)
             and 219 others
note: required by a bound in `next_element`
    --> /home/MYUSER/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.144/src/de/mod.rs:1730:12
     |
1730 |         T: Deserialize<'de>,
     |            ^^^^^^^^^^^^^^^^ required by this bound in `next_element`

error[E0277]: the trait bound `FileTime: Deserialize<'_>` is not satisfied
    --> src/layout.rs:229:5
     |
229  |     pub mtime: FileTime,
     |     ^^^ the trait `Deserialize<'_>` is not implemented for `FileTime`
     |
     = help: the following other types implement trait `Deserialize<'de>`:
               &'a [u8]
               &'a std::path::Path
               &'a str
               ()
               (T0, T1)
               (T0, T1, T2)
               (T0, T1, T2, T3)
               (T0, T1, T2, T3, T4)
             and 219 others
note: required by a bound in `next_value`
    --> /home/MYUSER/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.144/src/de/mod.rs:1869:12
     |
1869 |         V: Deserialize<'de>,
     |            ^^^^^^^^^^^^^^^^ required by this bound in `next_value`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `ludusavi` due to 4 previous errors

Cargo-Process exited abnormally with code 101 at Fri Sep 30 10:58:33

Can I fix this without changing the FileTime package?

mtkennerly commented 1 year ago

For the record, I don't think we should store it in the mapping file since we can attach it to the backed up files themselves, but:

To do that, you would need to create an intermediate representation. For example, you could use FileTime::unix_seconds and store it as the raw i64. You could go further and convert the i64 into a chrono::DateTime (which does derive Serialize) by using NaiveDateTime::from_timestamp (and then going through some hoops to convert it to UTC for the sake of consistency).

sluedecke commented 1 year ago

I agree, just setting the modification times on the files within the simple backup or the zip file will be enough for the files.

Do you think the folder modification times should be backed up, too? My gut feeling tells me that this might be a good and consistent way of doing a backup / restore.

sluedecke commented 1 year ago

For the record:

#[derive(Clone, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash, serde::Serialize, serde::Deserialize)]
pub struct IndividualMappingFile {
    pub hash: String,
    pub size: u64,
    pub mtime: FileTime,
}

fails since FileTime does not implement Default, Serialize and Deserialize. I cannot impl Default for FileTime in ludusavi since it needs to be done in the crate where FileTime is defined. I could instead provide a impl Default for IndividualMappingFile instead of using a derived one.

mtkennerly commented 1 year ago

Do you think the folder modification times should be backed up, too? My gut feeling tells me that this might be a good and consistent way of doing a backup / restore.

Ludusavi is very file-oriented, like Git, so I think that might produce surprising results. For example, if you've reinstalled a game and let it make a new save folder, then Ludusavi restores files into the folder without modifying the folder itself or any other files that happen to be there. If you've modified a folder's permissions, Ludusavi won't change those either, so I think it would be a bit odd to start modifying only one specific aspect of the folder.

sluedecke commented 1 year ago

OK, I will leave out the folders then, as it would add extra entries to the manifest, too.

So far I managed to come up with a solution in https://github.com/sluedecke/ludusavi/tree/fix/132-backup-file-times, can you please have a look and give me some feedback?

I store the modification times in the mapping yaml. Although it can be done without since simple and zipped backups already have files with the proper modification time, it lets me get away without re-requesting metadata from the file system during backup / restore.

mtkennerly commented 1 year ago

Like I mentioned earlier, I don't think we should store it in mapping.yaml unless absolutely necessary. Have you tried storing it directly in the file/zip metadata? I don't think there should be any performance issues with doing that.

Ideally, I'd like to see a much smaller change. For example, when making a simple backup:

diff --git a/src/layout.rs b/src/layout.rs
index d4831a5..51c821a 100644
--- a/src/layout.rs
+++ b/src/layout.rs
@@ -949,7 +949,7 @@ impl GameLayout {
                 backup_info.failed_files.insert(file.clone());
                 continue;
             }
-            if let Err(e) = std::fs::copy(&file.path.interpret(), &target_file.interpret()) {
+            if let Err(e) = file.path.copy_to(&target_file) {
                 log::error!(
                     "[{}] unable to copy: {} -> {} | {e}",
                     self.mapping.name,

And then a new method on StrictPath that looks something like this:

    pub fn copy_to(&self, target: &StrictPath) -> Result<(), SomeError> {
        std::fs::copy(self.interpret(), target.interpret())?;
        let mtime = self.metadata()?.modified();
        filetime::set_file_mtime(target.interpret(), mtime)?;
        Ok(())
    }

Zips are more complicated, but here's an example. I used a lot of unwrap() since it's just a proof of concept, but that definitely should be cleaned up 😅

diff --git a/src/layout.rs b/src/layout.rs
index d4831a5..b9312b0 100644
--- a/src/layout.rs
+++ b/src/layout.rs
@@ -1029,6 +1029,18 @@ impl GameLayout {
             .large_file(true);

         'item: for file in &scan.found_files {
+            let mtime_raw = file.path.metadata().unwrap().modified().unwrap();
+            let mtime = chrono::DateTime::<chrono::Utc>::from(mtime_raw);
+            let mtime_zip = zip::DateTime::from_date_and_time(
+                mtime.year().try_into().unwrap(),
+                mtime.month().try_into().unwrap(),
+                mtime.day().try_into().unwrap(),
+                mtime.hour().try_into().unwrap(),
+                mtime.minute().try_into().unwrap(),
+                mtime.second().try_into().unwrap(),
+            ).unwrap();
+            let options = options.last_modified_time(mtime_zip);
+
             if !backup.includes_file(file.path.render()) {
                 log::debug!("[{}] skipped: {}", self.mapping.name, file.path.raw());
                 continue;
sluedecke commented 1 year ago

I will continue to work on the item - still learning rust. It is doable without saving the file times, but since zip has a precision of 2 seconds only there will be some fuzziness in the modification time. I'd recommend to have it exact, but what do you think about that?

pub struct ZipFileData {
...
    /// Last modified time. This will only have a 2 second precision.
    pub last_modified_time: DateTime,
...
sluedecke commented 1 year ago

I am really uncomfortable with relying on DateTime from the ZipFileData, besides the 2 second precision there is no timezone information at all. There are warnings about this in the source code e.g. from zip-0.6.2/src/types.rs.

What do you think? Map it to UTC and accept the fuzziness?

/// Representation of a moment in time.
///
/// Zip files use an old format from DOS to store timestamps,
/// with its own set of peculiarities.
/// For example, it has a resolution of 2 seconds!
///
/// A [`DateTime`] can be stored directly in a zipfile with [`FileOptions::last_modified_time`],
/// or read from one with [`ZipFile::last_modified`]
///
/// # Warning
///
/// Because there is no timezone associated with the [`DateTime`], they should ideally only
/// be used for user-facing descriptions. This also means [`DateTime::to_time`] returns an
/// [`OffsetDateTime`] (which is the equivalent of chrono's `NaiveDateTime`).
///
/// Modern zip files store more precise timestamps, which are ignored by [`crate::read::ZipArchive`],
/// so keep in mind that these timestamps are unreliable. [We're working on this](https://github.com/zip-rs/zip/issues/156#issuecomment-652981904).
#[derive(Debug, Clone, Copy)]
pub struct DateTime {
    year: u16,
    month: u8,
    day: u8,
    hour: u8,
    minute: u8,
    second: u8,
}
sluedecke commented 1 year ago

I pushed a commit which relies only on the timestamps in the backups themselves. It tests ok for my system, but I still need to remove the mtime field from ScannedFile and IndividualMappingFile and do some more cleanup.

mtkennerly commented 1 year ago

Thanks! That looks pretty good :+1:

What do you think? Map it to UTC and accept the fuzziness?

Yeah, I think that should be okay. Convert the mtime to UTC when we store it in the zip, then convert it back to the local time zone when we extract it from the zip. The fuzziness isn't ideal, but I don't think it will hurt anything, unless a game is spitting out multiple save files per second and sorts by mtime.

// SL: I wonder which circumstances will have a ro target_file... maybe
// a Windows specific issue?

I'm not sure if it's Windows-only, but it did originally come up on Windows.

sluedecke commented 1 year ago

Hm ... I just merged my branch with the latest from ludusavi/master and get this:

$ RUST_BACKTRACE=1 ./ludusavi  backup --preview Factorio
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/cache.rs:44:44
stack backtrace:
   0: rust_begin_unwind
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:142:14
   2: core::result::unwrap_failed
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/result.rs:1805:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/result.rs:1098:23
   4: ludusavi::cache::Cache::save
             at /home/MYUSER/Projekte/contributing/ludusavi/src/cache.rs:44:27
   5: ludusavi::cache::Cache::migrated
             at /home/MYUSER/Projekte/contributing/ludusavi/src/cache.rs:88:13
   6: ludusavi::cli::run_cli
             at /home/MYUSER/Projekte/contributing/ludusavi/src/cli.rs:687:5
   7: ludusavi::main
             at /home/MYUSER/Projekte/contributing/ludusavi/src/main.rs:69:29
   8: core::ops::function::FnOnce::call_once
             at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

config directory looks like this:

$ ls -la  /home/MYUSER/.config/ludusavi
  total used in directory 11436 available 54 GiB
  drwxr-x---   2 MYUSER MYUSER    20480 Oct  5 23:21   .
  drwx------ 347 MYUSER MYUSER    20480 Oct  5 12:13   ..
  -rw-r-----   1 MYUSER MYUSER     6622 Oct  3 12:53   config.yaml
  -rw-r--r--   1 MYUSER MYUSER 11656754 Oct  3 12:53   manifest.yaml
sluedecke commented 1 year ago

This one let's it run and generate a cache.yaml:

diff --git a/src/cache.rs b/src/cache.rs
index c7a2d7f..b1fa343 100644
--- a/src/cache.rs
+++ b/src/cache.rs
@@ -41,7 +41,7 @@ impl Cache {
     pub fn save(&self) {
         let new_content = serde_yaml::to_string(&self).unwrap();

-        let old_content = Self::load_raw().unwrap();
+        let old_content = Self::load_raw().unwrap_or_default();
         if old_content == new_content {
             return;
         }
mtkennerly commented 1 year ago

Good catch! I must have forgotten to re-test the cache migration after changing the save logic.

mtkennerly commented 1 year ago

Fixed in master: https://github.com/mtkennerly/ludusavi/commit/ba7ca848ce493f5167e5bc126ea43ee55eb9bb01

mtkennerly commented 1 year ago

Completed in PR #136.

sluedecke commented 1 year ago

Thanks for accepting my pull request!