sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0

How can I recover deleted files #587

Closed tambaTech closed 1 year ago

tambaTech commented 1 year ago

I'm using the terminal on a MacBook Pro. I'm a newbie and didn't have the required knowledge of rmlint, which is why I deleted some files I shouldn't have. Is it possible to recover all the files again?

cebtenzzre commented 1 year ago

In general, you can install testdisk from homebrew and run photorec. But rmlint doesn't delete any unique files by default. What did the rmlint command you ran look like? Do you still have the shell script or JSON file from it?

tambaTech commented 1 year ago

Yes, I still have the JSON file. I ran the command line rmlint and then ./rmlint.sh and Yes!

cebtenzzre commented 1 year ago

If you put this script in the same directory as rmlint.json and run it with python3 rmlint_undo.py, it should be able to restore anything that was removed by rmlint.sh. If there are any WARNING lines or it does not finish with "Successfully restored [n] files.", let me know.

tambaTech commented 1 year ago

OK, thank you very much.

tambaTech commented 1 year ago

ls
Applications  Library   Postman              rmlint.sh
Desktop       Movies    Public               rmlint_undo.py
Developer     Music     ThePythonCodingBook
Documents     OneDrive  lamin.itemcolors
Downloads     Pictures  rmlint.json

When I run the command, I get the error message below:

python3 rmlint_undo.py
Traceback (most recent call last):
  File "/Users/lamin/rmlint_undo.py", line 8, in <module>
    doc = json.load(f)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 201242 column 4 (char 6288624)

cebtenzzre commented 1 year ago

It seems like your JSON file is incomplete - were you low on disk space when you ran rmlint? Unfortunately, it currently doesn't even warn when this happens - I'm looking into it.

What is the output of this command?

head -n 201243 rmlint.json | tail -n 3
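
The same check can be sketched in Python: try to parse the file and, if the decoder fails, return the lines leading up to the position it reports. This is only an illustration, and `show_json_error` is a made-up helper name, not something that ships with rmlint.

```python
import json


def show_json_error(path, context=3):
    """Return None if the file parses, else (message, line number, last lines read)."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # err.lineno is 1-based, like the "line 201242" in the traceback above.
        start = max(0, err.lineno - context)
        return err.msg, err.lineno, lines[start:err.lineno]
    return None
```

Calling `show_json_error("rmlint.json")` on a file that was cut off mid-list should report "Expecting value" together with the last few lines that were actually written.
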
tambaTech commented 1 year ago

~ head -n 201243 rmlint.json | tail -n 3
"twins": 2,
"mtime": 1460997703
},


cebtenzzre commented 1 year ago

Yeah, it looks truncated. I think this command will fix the JSON file:

printf '{\n}\n]\n' >>rmlint.json

If that doesn't fix it, give me more lines with tail -n 15 rmlint.json so I can understand what's broken.

To be clear, you cannot fully restore your files because the rmlint.json is missing an unknown number of them. The script should be able to restore about 7,000 files once you've repaired the JSON file.
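
A repair that is safe to run more than once could look roughly like the sketch below. It assumes the file was cut off right after a `},` as in the output above, and it appends the same closing lines as the printf command only when the file does not already parse. `repair_truncated` is a hypothetical helper, not part of rmlint.

```python
import json

# Same closing lines as the printf command; added at most once.
FOOTER = "\n{\n}\n]\n"


def repair_truncated(text):
    """Return text that parses as JSON, appending the closing lines only if needed."""
    try:
        json.loads(text)
        return text  # already valid: do nothing, so a second run is harmless
    except json.JSONDecodeError:
        pass
    repaired = text.rstrip() + FOOTER
    # Raises JSONDecodeError if the damage is something other than a missing tail.
    json.loads(repaired)
    return repaired
```

It is probably wise to write the result to a new file and keep the original rmlint.json untouched until the restore has finished.
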

tambaTech commented 1 year ago

tail -n 15 rmlint.json
"is_original": false,
"twins": 2,
"mtime": 1460997703
}, {
}
]
{
}
]
{
}
]
{
}
]

cebtenzzre commented 1 year ago

That command was only meant to be run once. You're going to have to remove the last few lines from the file, so it ends like this:

"mtime": 1460997703
}, {
}
]

tambaTech commented 1 year ago

All my Xcode projects and my Django files are deleted. I've lost all my work from the past months 😱

tambaTech commented 1 year ago

Please, please help me recover them.

cebtenzzre commented 1 year ago

Please run python3 rmlint_undo.py again once you've fixed the JSON file. That will restore some of your files. But again, rmlint won't ever remove unique files with default options. If you open the JSON file in a text editor, you'll see "is_original": true, lines within a few lines of the paths of files that rmlint did not delete. Those are your files, just in different places. I'm sure there's some kind of pattern, like a backup copy in a different location.

tambaTech commented 1 year ago

python3 rmlint_undo.py
Traceback (most recent call last):
  File "/Users/lamin/rmlint_undo.py", line 8, in <module>
    doc = json.load(f)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 201245 column 1 (char 6288630)

cebtenzzre commented 1 year ago

Have you opened the file in a text editor, and fixed the end of the file, as I recommended here? It needs to look exactly like the example I gave you.

tambaTech commented 1 year ago

python3 rmlint_undo.py
Traceback (most recent call last):
  File "/Users/lamin/rmlint_undo.py", line 16, in <module>
    originals[file['checksum']] = file['path']
KeyError: 'checksum'

tambaTech commented 1 year ago

I'm getting "is_original": false:

{
  "id": 3708842510,
  "type": "emptydir",
  "progress": 0,
  "path": "/Users/lamin/Library/Containers/com.apple.iBooksX/Data/Documents/com.apple.app-analytics.books-bag.upload-dropbox/D3558EB8-202E-4CDD-BED9-723F1936C090",
  "size": 0,
  "depth": 7,
  "inode": 43274950,
  "disk_id": 16777234,
  "is_original": false,
  "mtime": 1662482742.7763021
},

cebtenzzre commented 1 year ago

Yes, there are originals (true) and non-originals (false). Originals are the files it kept; non-originals are the matching duplicates it removed (aside from lint types like empty dirs and empty files, which of course don't have any data).

I've uploaded a new version of the undo script here that should be able to work around the checksum problem.

tambaTech commented 1 year ago

  File "/Users/lamin/rmlint_undo.py", line 43, in <module>
    os.mkdir(path)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/lamin/Library/Group Containers/UBF8T346G9.Office/Outlook/Outlook 15 Profiles/Main Profile/LocalFiles/2957/2/Data'

cebtenzzre commented 1 year ago

That should be fixed in the new script here.

tambaTech commented 1 year ago

The script did run, but the files were not restored:

WARNING: cannot restore badlink:

cebtenzzre commented 1 year ago

The script cannot restore broken symlinks because the path they point to is not recorded. Chances are you don't want them back anyway. They aren't regular files with data, just pointers to other locations that no longer exist. There probably aren't many, so you should be able to manually review matches for badlink in rmlint.json.
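
To make the failure modes in this thread concrete, here is a rough sketch of how a restore plan could be derived from the entries in rmlint.json. It is not the actual rmlint_undo.py. The `plan_restore` name is invented, and the "duplicate_file" type value is an assumption (only "emptydir" and "badlink" appear above). The sketch only builds lists and touches nothing on disk.

```python
def plan_restore(entries):
    """Return (copies, empty_dirs, skipped) for a list of rmlint.json entries."""
    originals = {}
    for e in entries:
        # .get() avoids the KeyError seen earlier: lint such as emptydir has no checksum.
        if e.get("is_original") and e.get("checksum"):
            originals[e["checksum"]] = e["path"]

    copies, empty_dirs, skipped = [], [], []
    for e in entries:
        if e.get("is_original"):
            continue  # kept by rmlint, nothing to restore
        kind = e.get("type")
        if kind == "emptydir":
            # Recreate with os.makedirs(path, exist_ok=True); unlike os.mkdir,
            # it also creates missing parent directories.
            empty_dirs.append(e["path"])
        elif kind == "duplicate_file" and e.get("checksum") in originals:
            # (source, destination): copy the kept original back to the removed path.
            copies.append((originals[e["checksum"]], e["path"]))
        else:
            # For example badlink: the link target was never recorded.
            skipped.append((kind, e.get("path")))
    return copies, empty_dirs, skipped
```

Passing the list without its first and last element (the `[1:-1]` slice used elsewhere in this thread) keeps the header and footer out of the plan.
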

tambaTech commented 1 year ago

What else can I do to restore the files?

cebtenzzre commented 1 year ago

Like I already said, search for "is_original": true, in rmlint.json, and look at the corresponding "path" values a few lines above. If you have jq installed, you can use jq -r '.[1:-1][] | select(.is_original) | .path' rmlint.json to summarize that information. You can also do it with Python:

printf 'import json\nfor e in json.load(open("rmlint.json"))[1:-1]:\n  if e["is_original"]: print(e["path"])' | python3

All of the printed paths are files that rmlint kept (and rmlint_undo.py restored). That should help you figure out where to look for extra copies, which you must have had for files to be removed in the first place.

tambaTech commented 1 year ago

I ran the command: printf 'import json\nfor e in json.load(open("rmlint.json"))[1:-1]:\n if e["is_original"]: print(e["path"])' | python3. It shows a lot of file paths. Which command would restore the files automatically?

I'm sorry if I don't understand, and for being a pain.

cebtenzzre commented 1 year ago

What I'm inferring from the sheer number of files removed by rmlint is that you had copies of entire folders on your disk, not just random copies of individual files. rmlint removes duplicate files only, not files it considers to be original. For instance, you might see something like this in the output (especially if you add | sort to the end of the command to group the results better):

/Users/foo/my-backup/a.txt
/Users/foo/my-backup/b.txt
/Users/foo/my-backup/subfolder/c.txt

That would indicate that you had a backup called my-backup stored somewhere. You're going to have to manually look for this kind of thing in the list. It could be automatically created, it could be a manual copy, who knows. But there is still at least one copy of each of your files.

If there are many files in the results you don't really care about, you could add | grep '\.cpp$' or | grep '\.py$' to look for C++ or Python files, for instance.
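
If the list is too long to scan by eye, counting kept files per top-level folder may make a pattern like my-backup stand out. The following is a small sketch under the same assumptions as the one-liner above ("is_original" and "path" keys, header and footer stripped with `[1:-1]`). `top_directories` and its `depth` parameter are made up for this example.

```python
import os
from collections import Counter


def top_directories(entries, depth=3, limit=20):
    """Count kept (original) files per leading directory, most common first."""
    counts = Counter()
    for e in entries:
        if e.get("is_original") and "path" in e:
            # "/Users/foo/my-backup/subfolder" -> ["", "Users", "foo", "my-backup", "subfolder"]
            parts = os.path.dirname(e["path"]).split("/")
            # Keep the leading empty string plus `depth` components.
            counts["/".join(parts[:depth + 1])] += 1
    return counts.most_common(limit)
```

With the example paths above, `top_directories(entries)` would group all three files under /Users/foo/my-backup. To run it on the real file, pass `json.load(open("rmlint.json"))[1:-1]` as `entries`.
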

tambaTech commented 1 year ago

Yes, you're right, I have copies of the same files in different folders. I'm a bit lost 😞