netheril96 / securefs

Filesystem in userspace (FUSE) with transparent authenticated encryption

How to handle failing hard drive. #111

Open bryanlyon opened 4 years ago

bryanlyon commented 4 years ago

I have a SecureFS-secured folder on a drive that is failing. Nothing super critical is on the drive, but I want to make sure that I get as much off as possible while also identifying any bad data. I have copied as much data off the drive as was immediately feasible, but some files failed to copy with read errors. I also still have the .json file with the encryption keys and the password.

I can mount the backup of the data, but I don't want to use the original drive any more than necessary. HOWEVER, I do have a list of files that failed to copy over to the backup. These filenames are encrypted but I need to know the unencrypted filenames. Is there an easy way to recover the unencrypted filename from the encrypted path+filename without mounting the original folder?

In addition, I might be able to get more data from the drive by using software like ddrescue, but I assume that unless it's a perfect recovery, I'm going to lose chunks of files. Does SecureFS have a way to verify that decryption was correct (i.e. is a given block or file's decryption verifiable)?

I know that this is a local problem and not the fault of SecureFS itself but would still like to know what can be done. Thanks.

netheril96 commented 4 years ago

You can mount with the --trace option and run ls. Then in the log you will find lines like this:

 Translate path /passes into JWMD2AXINPT3Z4K4PS6W8R6GV5MYCP8FK9HS
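
As a rough sketch (not part of securefs itself; the log file name, regex, and usage are just assumptions based on the line format above), the trace log could be turned into a lookup table for the known-bad encrypted names:

# Sketch: parse a securefs --trace log and build a reverse map from the
# encrypted on-disk names back to the plaintext paths, based on the
# "Translate path ... into ..." lines shown above.
import re
import sys

TRANSLATE = re.compile(r"Translate path (.+?) into (\S+)")

def build_reverse_map(log_path):
    mapping = {}
    with open(log_path, errors="replace") as log:
        for line in log:
            match = TRANSLATE.search(line)
            if match:
                plain, encrypted = match.groups()
                mapping[encrypted] = plain
    return mapping

if __name__ == "__main__":
    # usage: python reverse_names.py securefs.log ENCRYPTEDNAME [ENCRYPTEDNAME ...]
    table = build_reverse_map(sys.argv[1])
    for encrypted_name in sys.argv[2:]:
        print(encrypted_name, "->", table.get(encrypted_name, "<not found in log>"))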
bryanlyon commented 4 years ago

Thank you for responding.

Part of the problem is that in the folder I can mount (the backup), the damaged files are (obviously) missing. In other words, if there was a read error, then no file exists; I simply have paths that are broken. I guess I could recreate them as placeholder files, but would those mount and resolve with SecureFS?

Secondly, in addition to the explicit read errors, I worry that some files might have been silently corrupted. This is why I asked whether SecureFS does any verification on the blocks, as I'd like to run a check to verify all the data.

bryanlyon commented 4 years ago

I've at least partly answered the second part of my question. Yes, SecureFS does keep track of chunks by checksum, so a corrupt read will show up as a failure to read the file and also log an error like this:

[Error] [0x7f6c19119700] [2020-08-07 23:19:16.417081084 UTC] [int securefs::lite::read(const char*, char*, size_t, off_t, fuse_file_info*):259] read path=/badfile.rar offset=7499776 length=4096 encounters exception securefs::LiteMessageVerificationException (code=1): File content has invalid checksum

Of note, this also throws a read error to the program that is attempting to read the file. But it only triggers when actually reading the whole file, not just on ls. So I can find any bad files by attempting to md5sum all of the files and catching the errors.

$ md5sum ./badfile.rar
md5sum: ./badfile.rar: Operation not permitted

Because of that, with a basic script I can verify every file. This is not a perfect solution, and it won't help with recovery, but it does at least identify files that "successfully" copied but provided bad data.
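
In case it helps anyone else, a minimal sketch of that idea (the mount point argument is just a placeholder, not anything securefs-specific): walk the mounted directory and read every file in full, so checksum failures surface as read errors.

# Sketch of the verification script: read every file under a mounted
# securefs directory in full, so that any chunk with an invalid checksum
# surfaces as a read error.
import os
import sys

def verify_tree(mount_point, chunk_size=1 << 20):
    bad = []
    for root, _dirs, files in os.walk(mount_point):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as f:
                    while f.read(chunk_size):
                        pass
            except OSError as err:
                bad.append(path)
                print(f"BAD: {path}: {err}", file=sys.stderr)
    return bad

if __name__ == "__main__":
    # usage: python verify_tree.py /path/to/mounted/securefs
    failures = verify_tree(sys.argv[1])
    print(f"{len(failures)} unreadable file(s)")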

I think a built-in way to verify data would be a good addition to SecureFS to help ensure data consistency, though I know it'll be no faster than my horrible script here.

In addition, I went through and loaded my backup of the secure folder with trace on and enumerated every file. This created a mapping in the log for every file in the backup. However, I still have several known-bad filenames that I wasn't able to recover by checking the mapping in the logs, presumably because they failed to copy from the original source. Is there a way to identify these files from that filename without having the original file?

Thanks again.

netheril96 commented 4 years ago

I think a built-in way to verify data would be a good addition to SecureFS to help ensure data consistency, though I know it'll be no faster than my horrible script here.

I don't understand this sentence. securefs already verifies the content of files. In fact, it is more stringent than most checksums, because the checksum in securefs is cryptographically strong.

Is there a way to identify these files from that filename without having the original file?

The simplest way is to create a repo with exactly the same filenames as your original but with all file contents empty, mount it, run ls, and inspect the logs. The encryption of filenames and file contents are independent of each other.
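
For instance, something along these lines could recreate the known-bad encrypted paths as empty placeholders inside a copy of the backup repository; this is only a sketch, and the script name, "repo_copy", and "bad_paths.txt" are example names, not part of securefs.

# Sketch: recreate the known-bad encrypted paths as empty placeholder files
# inside a copy of the backup repository (which already contains the
# securefs .json config). Mounting that copy with --trace and running ls
# should then reveal the plaintext names in the log.
import os
import sys

def recreate_empty(repo_root, encrypted_paths):
    for rel in encrypted_paths:
        dest = os.path.join(repo_root, rel.lstrip("/"))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        # An empty placeholder suffices: filename encryption does not
        # depend on the file contents.
        open(dest, "a").close()

if __name__ == "__main__":
    # usage: python recreate_placeholders.py ./repo_copy bad_paths.txt
    with open(sys.argv[2]) as listing:  # one encrypted path per line
        paths = [line.strip() for line in listing if line.strip()]
    recreate_empty(sys.argv[1], paths)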

bryanlyon commented 4 years ago

I don't understand this sentence. securefs already verifies the content of files. In fact, it is more stringent than most checksums, because the checksum in securefs is cryptographically strong.

My recommendation would be a command-line switch (perhaps --scrub, after filesystems like ZFS) that goes through the files one by one, verifies that all the data is still good, and prints/logs which files it fails on. I know it may be excessive, but I still think it would be a useful tool.

AGenchev commented 3 years ago

@bryanlyon Since you cite ZFS, it has something else that could help in your case: built-in ECC / redundancy. If securefs had this, in theory you could recover some files with bad blocks with the help of the ECC metadata. Modern 4K hard drives already implement ECC internally, though, so if something fails, it should be studied whether entire 4K sectors are lost without any valid bit of information in them before thinking about a possible recovery solution. At least the name "securefs" suggests that such functionality is not out of scope. Also, many people would like data compression as well.

netheril96 commented 2 years ago

@AGenchev Are you suggesting that securefs should provide the ability to add ECC too?

As for compression, I won't do that. It leaks information and breaks security.

AGenchev commented 2 years ago

I am not sure whether I am suggesting it, as it might add much more complexity to the filesystem, for example when a file is updated and some blocks are rewritten. ECC on the media matters if you're unsure about the media but have taken other measures, like ECC RAM, a non-overclocked system, etc.