Open dlqqq opened 1 year ago
I will need consensus on whether we're OK with imposing the constraint that all File ID manager implementations use SQLite. It's not clear if a general migration strategy would be possible otherwise. Furthermore, if we impose this constraint, we can merge the duplicate __init__() logic in ArbitraryFIDM and LocalFIDM into BaseFIDM, and then perform migrations for all custom FIDMs automatically (as long as custom FIDMs meaningfully implement the import and export methods).
Codecov report: Base: 85.74% // Head: 85.48% // Decreases project coverage by 0.25%. Coverage data is based on head (6fca139) compared to base (50dec2e). Patch coverage: 86.99% of modified lines in pull request are covered.
Interesting edge case when migrating from a local FIDM to an arbitrary FIDM on Windows. Essentially, the local FIDM knows that this is a local Windows path and normalizes its case, mapping it to lowercase, since Windows filesystems do not distinguish between upper and lower case by default. (On Windows this is what os.path.normcase() does; os.path.normpath() on its own leaves case untouched.)
However, the arbitrary FIDM makes no assumptions about the local filesystem and is case sensitive even when running on Windows. It therefore cannot return a relative path, because it sees the lowercased content root stored in the database as different from the content root it was configured with.
_______________________ test_migrate_local_to_arbitrary _______________________
fid_db_path = 'C:\\Users\\runneradmin\\AppData\\Local\\Temp\\pytest-of-runneradmin\\pytest-0\\test_migrate_local_to_arbitrar0\\data\\fileidmanager_test.db'
jp_root_dir = WindowsPath('C:/Users/runneradmin/AppData/Local/Temp/pytest-of-runneradmin/pytest-0/test_migrate_local_to_arbitrar0/root_dir')
test_path = 'test_path', test_path_child = 'test_path/child'
    def test_migrate_local_to_arbitrary(fid_db_path, jp_root_dir, test_path, test_path_child):
        local = LocalFileIdManager(db_path=fid_db_path, root_dir=str(jp_root_dir))
        local.con.execute("PRAGMA journal_mode = off")
        id_1 = local.index(test_path)
        id_2 = local.index(test_path_child)
        del local
        arbitrary = ArbitraryFileIdManager(db_path=fid_db_path, root_dir=str(jp_root_dir))
>       assert arbitrary.get_path(id_1) == test_path
E       AssertionError: assert 'c:/users/runneradmin/appdata/local/temp/pytest-of-runneradmin/pytest-0/test_migrate_local_to_arbitrar0/root_dir/test_path' == 'test_path'
E         - test_path
E         + c:/users/runneradmin/appdata/local/temp/pytest-of-runneradmin/pytest-0/test_migrate_local_to_arbitrar0/root_dir/test_path
I just patched the test to fix the above error. I'm not sure if this edge case is worth handling, given that the motivation is unclear (downgrading on a local filesystem) and that the workaround is as simple as providing a normalized contents root on Windows.
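To make the workaround concrete, here is a minimal sketch of why a case-sensitive prefix check misses a lowercased root and how normalizing the root first avoids that. The helper to_relative() and the sample root are hypothetical and are not taken from either FIDM; ntpath is used so the Windows behaviour can be reproduced on any platform.

```python
import ntpath  # Windows path semantics, importable on any platform


def to_relative(stored_path: str, root_dir: str) -> str:
    """Hypothetical case-sensitive lookup: strip root_dir only on an exact prefix match."""
    prefix = root_dir.rstrip("\\/") + "\\"
    if stored_path.startswith(prefix):
        return stored_path[len(prefix):]
    # Root not recognized, so the absolute path is returned unchanged.
    return stored_path


root_dir = "C:\\Users\\Runner\\root_dir"

# What a case-normalizing manager would store for "test_path".
stored = ntpath.normcase(ntpath.join(root_dir, "test_path"))

# A case-sensitive comparison against the original root fails ...
leaked = to_relative(stored, root_dir)

# ... but succeeds once the root itself has been normalized.
fixed = to_relative(stored, ntpath.normcase(root_dir))
```

With the unnormalized root the absolute lowercased path comes back, which mirrors the assertion failure above. With the normalized root the relative path is recovered.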
Description
- export_rows(): a static method that returns a generator yielding a tuple of [id, path]
- import_rows(): an instance method that writes a tuple yielded by export_rows() to the database

ArbitraryFIDM and LocalFIDM support migrations between each other by default. Upon initialization, both implementations write their module name and class name to a "database manifest", which allows for subsequent reflection by the other FIDM. If a manifest already exists and it has a different module or class name, we perform a migration from the previous FIDM to the current FIDM. The migration strategy is roughly as follows:

1. Back up the existing database file to a timestamped path such as 2022-11-09T01:55:32-file_id_manager.db. Bind this to backup_db_path.
2. Load the previous manager class by reflection: PrevManager = getattr(importlib.import_module(prev_module), prev_classname)
The key principle is that the previous FIDM defines how to export the rows in its database, and the current FIDM defines how to import the rows into its database.
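The export/import split can be sketched roughly as follows. This is not the PR's implementation: the class names, the Files table, the export_rows(con) signature, and the "backup-" file prefix are all assumptions made for illustration. The sketch only shows the previous class exporting rows from the backup while the current instance imports them into a fresh database.

```python
import importlib
import sqlite3
import tempfile
from pathlib import Path


def resolve_class(module_name, class_name):
    """Reflect a class from the module and class names recorded in a manifest."""
    return getattr(importlib.import_module(module_name), class_name)


class PrevManager:
    """Hypothetical previous manager: it knows how to read its own rows."""

    @staticmethod
    def export_rows(con):
        # Generator yielding (id, path) tuples from an open connection.
        yield from con.execute("SELECT id, path FROM Files")


class CurrentManager:
    """Hypothetical current manager: it knows how to write rows its own way."""

    def __init__(self, con):
        self.con = con
        self.con.execute("CREATE TABLE IF NOT EXISTS Files (id TEXT PRIMARY KEY, path TEXT)")

    def import_rows(self, row):
        self.con.execute("INSERT INTO Files (id, path) VALUES (?, ?)", row)


def migrate(db_path, prev_cls, current_cls):
    # Set the old database aside under a backup name (assumed naming scheme).
    backup_db_path = db_path.with_name("backup-" + db_path.name)
    db_path.rename(backup_db_path)
    backup_con = sqlite3.connect(backup_db_path)
    current = current_cls(sqlite3.connect(db_path))
    # The previous class exports, the current instance imports.
    for row in prev_cls.export_rows(backup_con):
        current.import_rows(row)
    current.con.commit()
    backup_con.close()
    return current


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "file_id_manager.db"
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE Files (id TEXT PRIMARY KEY, path TEXT)")
    con.execute("INSERT INTO Files VALUES ('id-1', 'test_path')")
    con.commit()
    con.close()

    # In the flow described above, prev_cls would come from
    # resolve_class(prev_module, prev_classname) using the manifest values.
    current = migrate(db_path, PrevManager, CurrentManager)
    migrated = current.con.execute("SELECT id, path FROM Files").fetchall()
    current.con.close()
```

Because export is a static method on the previous class, the previous manager never has to be instantiated against the new database. Only its knowledge of its own row format is reused.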
Limitations
- Currently, when switching FIDMs, we rely on the fact that the previous FIDM writes its own manifest to the database file. If custom FIDMs forget to do this, the migration will fail. TODO: maybe move the manifest creation logic into BaseFileIdManager.__init__()?
- Requires the previous FIDM implementation to use SQLite and not write some other database file format (e.g. MariaDB, PostgreSQL) to the path. I think this constraint is OK to impose.
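One way the TODO above might look, as a sketch only: a base class whose __init__() reads any existing manifest and then records its own module and class name, so subclasses cannot forget to do so. The Manifest table layout, the BaseManager name, and the needs_migration() helper are assumptions and are not part of the PR.

```python
import os
import sqlite3
import tempfile


class BaseManager:
    """Hypothetical base class that owns the manifest bookkeeping."""

    def __init__(self, db_path):
        self.con = sqlite3.connect(db_path)
        self.previous = self._read_manifest()
        self._write_manifest()

    def _identity(self):
        cls = type(self)
        return (cls.__module__, cls.__qualname__)

    def _read_manifest(self):
        self.con.execute(
            "CREATE TABLE IF NOT EXISTS Manifest (module TEXT NOT NULL, classname TEXT NOT NULL)"
        )
        # Returns None for a fresh database, else a (module, classname) tuple.
        return self.con.execute("SELECT module, classname FROM Manifest").fetchone()

    def _write_manifest(self):
        self.con.execute("DELETE FROM Manifest")
        self.con.execute(
            "INSERT INTO Manifest (module, classname) VALUES (?, ?)", self._identity()
        )
        self.con.commit()

    def needs_migration(self):
        # A migration is due only if a different manager wrote the manifest.
        return self.previous is not None and self.previous != self._identity()


class LocalLike(BaseManager):
    pass


class ArbitraryLike(BaseManager):
    pass


with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "file_id_manager.db")

    first = LocalLike(db)
    fresh = first.needs_migration()
    first.con.close()

    second = ArbitraryLike(db)
    switched = second.needs_migration()
    second.con.close()

    third = ArbitraryLike(db)
    unchanged = third.needs_migration()
    third.con.close()
```

Here only the switch from LocalLike to ArbitraryLike reports that a migration is needed. A fresh database and a reopened database with the same class do not.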
Open questions
Related issues