0x90d / videoduplicatefinder

Video Duplicate Finder - Crossplatform

[Help]: Is there a way to change all the paths in ScannedFiles.db #521

Open ps245 opened 1 month ago

ps245 commented 1 month ago

Environment

Question

If I did a scan on one PC and the mount point was in one location, is there an easy way to update the path in the database (maybe even externally) to a new location?

For example, say I scanned the disk as /media/alice/disk but then wanted to use the same scanned-files database on a different computer with a different username, maybe /media/bob/disk. All the contents under ../disk are the same, so it should be easy to change alice to bob in the database.

0x90d commented 1 month ago

There is no way to edit all paths at once. You can either edit them one by one by clicking on Database -> Edit... or perhaps use a hex editor on ScannedFiles.db. If you're familiar with C# you could also download the source and add some lines that change all paths for you.

Maltragor commented 1 month ago

My spontaneous suggestion would be to export the database as JSON, change the paths in a text editor using search and replace, and then import it again using Import From Json.
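
If a text editor is awkward for a large export, the same search and replace can be scripted. This is only a minimal sketch: it assumes nothing about the layout of the exported JSON and simply walks every value, rewriting any string that equals the old mount point or starts with it followed by "/". The function names and the file names in the usage comment are made up for illustration, and only forward-slash paths are handled.

```python
import json

def replace_prefix(node, old, new):
    """Recursively rewrite every string that is `old` or lies below `old`/."""
    if isinstance(node, str):
        if node == old or node.startswith(old + "/"):
            return new + node[len(old):]
        return node
    if isinstance(node, list):
        return [replace_prefix(item, old, new) for item in node]
    if isinstance(node, dict):
        return {key: replace_prefix(value, old, new) for key, value in node.items()}
    return node  # numbers, booleans and null are left untouched

def rewrite_export(src, dst, old, new):
    """Load a JSON export, rewrite the path prefix, and write a new file."""
    with open(src, encoding="utf-8") as f:
        data = json.load(f)
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(replace_prefix(data, old, new), f, indent=2)

# Hypothetical usage (file names are placeholders):
# rewrite_export("export.json", "export_bob.json",
#                "/media/alice/disk", "/media/bob/disk")
```

Matching on the prefix plus "/" rather than on a plain substring avoids touching unrelated paths such as /media/alice/disk2. Writing to a new file keeps the original export as a fallback.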

Editing the ScannedFiles.db directly via hex editor would probably only work if the length of the file paths is not changed by the replacement.
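
The reason is that protobuf stores each string, and each enclosing message, with a length prefix, so changing the byte count of a path invalidates those prefixes. Note that /media/alice/disk (17 bytes) and /media/bob/disk (15 bytes) differ in length, so the example from the question would not survive a plain hex edit. The sketch below illustrates the point with a hand-encoded field; `encode_string_field` is a simplified helper written for this demonstration (short strings and small field numbers only), not part of any library.

```python
def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one protobuf length-delimited field (wire type 2).

    Simplified: only field numbers < 16 and payloads < 128 bytes,
    so that both the tag and the length fit in a single byte.
    """
    payload = text.encode("utf-8")
    if field_number >= 16 or len(payload) >= 128:
        raise ValueError("outside the range this simplified helper supports")
    tag = (field_number << 3) | 2
    return bytes([tag, len(payload)]) + payload

def replace_same_length(data: bytes, old: bytes, new: bytes) -> bytes:
    """Byte-level replace that refuses to change the total length."""
    if len(old) != len(new):
        raise ValueError("replacement would break the length prefixes")
    return data.replace(old, new)

record = encode_string_field(1, "/media/alice/disk/a.mp4")
patched = replace_same_length(record, b"alice", b"carol")
# The length byte (index 1) is unchanged and still matches the payload.
assert patched[1] == len(patched) - 2
```

What counts is the length in bytes after UTF-8 encoding, not the number of characters, so names with non-ASCII characters need extra care.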

Or, as an alternative to changing ScannedFiles.db, you could simply make the /media/bob/disk directory additionally accessible under /media/alice/disk in the file system, either via a symlink ("ln -s") or via "mount --bind".
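
For the symlink variant, a rough Python equivalent of "ln -s /media/bob/disk /media/alice/disk" could look like the sketch below. The function name is made up for illustration, and creating directories under /media usually requires root, so treat this as an outline rather than something to run blindly.

```python
import os

def link_old_mount(old_path: str, new_path: str) -> None:
    """Make `old_path` resolve to `new_path` through a symbolic link."""
    if os.path.lexists(old_path):
        # Never overwrite an existing file, directory or link.
        raise FileExistsError(old_path)
    parent = os.path.dirname(old_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    os.symlink(new_path, old_path, target_is_directory=True)

# Hypothetical usage, typically as root:
# link_old_mount("/media/alice/disk", "/media/bob/disk")
```

With the link in place, the paths stored in ScannedFiles.db stay valid and the database does not need to be touched at all.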

In principle, ScannedFiles.db, which is stored in Protocol Buffers (protobuf) format, could also be changed with a few lines of Python, for example. I tried it for fun, and at least at first glance the modified file still seemed to be readable by VDF. I am attaching the script, although it is really only a minimal test. (I just wanted to know if and how easily a protobuf file can be changed with Python without using any information about the data structure stored in it.)

#!/usr/bin/env python3

import sys
import blackboxprotobuf        # pip install bbpb

# Configuration
#--------------------------------------------------------
dbPath          = "/home/me/Downloads/VDF"
pathToReplace   = "/media/alice/disk"
pathReplacement = "/media/bob/disk"
#--------------------------------------------------------

# Open Database file
with open(dbPath + "/ScannedFiles.db", mode='rb') as file:
    ScannedFilesDb = file.read()

# Decode Protocol Buffers
message,typedef = blackboxprotobuf.decode_message(ScannedFilesDb)

# Each file entry holds a "Folder" string and a "Path" string (= Folder + file name).
# The field numbers are not known in advance, so look in the first entry for
# two string fields where one (Path) starts with the other (Folder).
idFolder = -1
idPath = -1
firstEntry = message['1'][0]
for idx in firstEntry:
    for idy in firstEntry:
        x = firstEntry[idx]
        y = firstEntry[idy]
        if idx != idy and isinstance(x,str) and isinstance(y,str) and x.startswith(y):
            idPath = idx      # the longer string: the full path
            idFolder = idy    # its prefix: the folder
            break
    else:
        continue
    break
else:
    raise Exception("Unexpected structure")

# Replace in Folder, then rebuild Path from the modified Folder
for entry in message['1']:
    newFolder = entry[idFolder].replace(pathToReplace, pathReplacement)
    entry[idPath] = entry[idPath].replace(entry[idFolder], newFolder)
    entry[idFolder] = newFolder

# Encode Protocol Buffers
ScannedFilesDb = blackboxprotobuf.encode_message(message,typedef)

# Store Database file
with open(dbPath + "/Modified_ScannedFiles.db", "wb") as file:
    file.write(ScannedFilesDb)
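
As a quick sanity check before pointing VDF at the modified file, one could count how often the old prefix still occurs in it. This is only a raw byte search, not a protobuf-aware validation, and the function name is made up for illustration.

```python
def count_occurrences(db_file: str, needle: str) -> int:
    """Count raw occurrences of `needle` (UTF-8 encoded) in a binary file."""
    with open(db_file, "rb") as f:
        return f.read().count(needle.encode("utf-8"))

# Hypothetical usage with the names from the script above:
# count_occurrences(dbPath + "/Modified_ScannedFiles.db", pathToReplace)
```

A result of 0 suggests every path was rewritten. Keeping the original ScannedFiles.db as a backup is still advisable.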