fsck: option to delete or repair corrupt files

rfjakob commented 6 years ago

Ticket https://github.com/rfjakob/gocryptfs/issues/191 was closed once read-only fsck was implemented. This ticket is for a read-write fsck.

Options for fixing corrupt files:

Delete
Move to lost+found folder
Overwrite corrupt block with zeros

brainchild0 commented 5 years ago

If corruption is block-level, making it possible to keep data in part of the file, then clearing the corrupted block (in the decrypted view) is extremely useful, compared to simply removing the file.

Currently, an application scanning a file with corruption on the gocryptfs layer will exit as soon as it reaches the corrupted part of the file. As such, even if the file format has recovery features, the error produced by the file system prevents any application from invoking them. By contrast, if the application can read the block and discover the missing data, it can attempt a recovery.

The lost and found folder is a nice idea, but if encryption has failed, then I cannot see any benefit to keeping the data that cannot be decrypted. Saving the corrupted parts seems to me no more useful than saving random data. Of course, I am not an expert in cryptography.

rfjakob commented 5 years ago

Well, you can use ddrescue to recover the good blocks

brainchild0 commented 5 years ago

That is helpful to know. I was mainly responding to the three options originally offered by you.

And one more comment on lost and found. While I see little value in saving the corrupt data, I see much value in saving the operation log. I find Microsoft's disk check tool immensely frustrating because, unless you capture the console output, you are left without any record of which files have lost data.

rfjakob commented 5 years ago

Saving a log is a very good idea, thanks!

Maybe save the corrupt (or repaired) file as lost+found/repair0001, and repair0001.log next to it, containing the original file path, which blocks were repaired, etc.
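
As an illustration of the log idea, here is a minimal Python sketch that writes a repairNNNN.log file into a lost+found directory. The function name, the JSON layout, and the field names are assumptions made for this example; gocryptfs does not implement any of this today.

```python
import json
import os


def write_repair_log(lost_found, index, original_path, repaired_blocks):
    """Write repairNNNN.log describing one repaired file (hypothetical layout)."""
    os.makedirs(lost_found, exist_ok=True)
    log_path = os.path.join(lost_found, f"repair{index:04d}.log")
    record = {
        "original_path": original_path,
        "repaired_blocks": sorted(repaired_blocks),
    }
    # Mode "x" refuses to overwrite a log left behind by an earlier run.
    with open(log_path, "x", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
        f.write("\n")
    return log_path
```

Calling write_repair_log("lost+found", 1, "docs/report.odt", [3, 7]) would create lost+found/repair0001.log; the path and block numbers here are made up.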

brainchild0 commented 5 years ago

I am not completely sure how file-system repair tools generally populate the repair directories they create. I know that these directories are sometimes used to link orphaned files, which are not referenced in any directory listing, back into the file system, so that the user can review them and they are no longer in danger of being reallocated.

I would be reluctant to move any file if part of that file has remained preserved. For example, suppose a file is being downloaded by a torrent application. If part of the file is repaired, by damaged blocks being cleared to zero, then the torrent application can detect the local problem from the checksum, and try to fetch the missing pieces. If the file is moved, then the application cannot find any of its data.
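
The torrent scenario can be sketched in Python as follows. BitTorrent v1 does verify pieces with SHA-1, but everything else here is a simplification for illustration: the function name is made up, and the piece size is assumed to equal the 4096-byte block size so that one cleared block maps to exactly one piece.

```python
import hashlib


def damaged_pieces(data, piece_hashes, piece_size):
    """Return indexes of pieces whose SHA-1 no longer matches the expected hash."""
    bad = []
    for i, expected in enumerate(piece_hashes):
        piece = data[i * piece_size:(i + 1) * piece_size]
        if hashlib.sha1(piece).digest() != expected:
            bad.append(i)
    return bad


PIECE = 4096
original = bytes(range(256)) * 48  # 12288 bytes, i.e. 3 pieces
hashes = [hashlib.sha1(original[i:i + PIECE]).digest()
          for i in range(0, len(original), PIECE)]

# Simulate a repair that cleared the second block to zeros.
repaired = original[:PIECE] + bytes(PIECE) + original[2 * PIECE:]
print(damaged_pieces(repaired, hashes, PIECE))  # prints [1]
```

Because the file stays in place and keeps its length, only the cleared piece fails verification and needs to be fetched again.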

The ddrescue tool might be helpful in a pinch, but the convenience of fixing all files in place is vastly preferable to writing temporary copies of each one then moving back to the original location.

Edit: Also, in any case, I suggest avoiding overwriting any recovery data in subsequent operations. If a recovery folder is created, it might be called recovery001, unless that exists, in which case try recovery002, and so on.
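
The naming scheme suggested above could look roughly like this in Python. It is only a sketch: the function name, the three-digit width, and the upper limit are assumptions.

```python
import os


def create_recovery_dir(parent, prefix="recovery", limit=999):
    """Create and return the first unused directory recovery001, recovery002, ..."""
    for n in range(1, limit + 1):
        candidate = os.path.join(parent, f"{prefix}{n:03d}")
        try:
            # os.mkdir fails if the name exists, so earlier results are never reused.
            os.mkdir(candidate)
        except FileExistsError:
            continue
        return candidate
    raise RuntimeError("no free recovery directory name left")
```

Creating the directory and checking for its existence in one step avoids a race between two repair runs picking the same name.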

paralin commented 3 years ago

How would you go about doing this with ddrescue?

lestephane commented 3 years ago

How would you go about doing this with ddrescue?

I had the same question so I looked into it. I'm adding it here in case it helps someone else.

Assume that I ran -fsck and found that a log file (log.txt) is corrupted because my disk was full.

I used the command:

ddrescue --sector-size=4096 log.txt log.txt.rescued
mv log.txt.rescued log.txt

@rfjakob: is that correct or did I miss something?
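
For readers who want to see what such a block-wise rescue does, here is a rough Python equivalent. It is a sketch, not a replacement for ddrescue: the function name and the injectable pread parameter are assumptions for this example, and the 4096-byte block size is taken from the --sector-size value in the command above.

```python
import os


def rescue_copy(src, dst, block_size=4096, pread=os.pread):
    """Copy src to dst block by block; unreadable blocks become zeros.

    Returns the list of block indexes that could not be read.
    """
    bad_blocks = []
    fd = os.open(src, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        with open(dst, "wb") as out:
            for index, offset in enumerate(range(0, size, block_size)):
                want = min(block_size, size - offset)
                try:
                    chunk = pread(fd, want, offset)
                except OSError:
                    # The read failed (for example an I/O error on a corrupt
                    # block): remember the block and fill the hole with zeros.
                    chunk = b""
                    bad_blocks.append(index)
                # Pad so the copy keeps the original file length.
                out.write(chunk.ljust(want, b"\0"))
    finally:
        os.close(fd)
    return bad_blocks
```

For example, rescue_copy("log.txt", "log.txt.rescued") returns the indexes of the zero-filled blocks, which could then be recorded before the rescued copy replaces the original.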

dumblob commented 3 years ago

I read the discussion here and I'm interested both in the delete & repair feature as well as in the questions raised here. Any news here?

krim404 commented 1 year ago

any possible update on this topic?

OdinVex commented 1 year ago

I'd recommend avoiding gocryptfs and using VeraCrypt until this can be addressed.

rfjakob commented 1 year ago

@OdinVex VeraCrypt provides no integrity protection, so you won't even notice any corruption. Not sure that's better ¯\_(ツ)_/¯

OdinVex commented 1 year ago

@rfjakob You can fsck a mounted filesystem from a VC volume. It's something somewhere in the mess of things you can try. Especially considering my gocryptfs mount is useless, I couldn't delete anything, couldn't move anything, or edit anything/copy anything. I ended up just winging it and writing a script to just delete the cipher files that were problematic. Data loss, but gocryptfs is still limited and can only read-only scan with zero recoverability. Copied what I could off of the gocryptfs, now using VC. (You can at least work with a VC-mounted volume and not be faced with 'no read, no copy, only Zull'.)

DrDaveD commented 1 year ago

VeraCrypt also has no FUSE filesystem so it can't be safely used by an untrusted user. So it depends on what the requirements are.

OdinVex commented 1 year ago

VeraCrypt also has no FUSE filesystem so it can't be safely used by an untrusted user. So it depends on what the requirements are.

That doesn't address gocryptfs's inability to delete, copy, edit, or read anything that is corrupt. Same boat as VC, except with VC you get an fsck that can attempt something at SOME layer. E.g., better than absolutely nothing.

rfjakob commented 1 year ago

I couldn't delete anything, couldn't move anything, or edit anything/copy anything.

Oh, that sounds bad, sorry to hear that! Do you remember how this happened?

What probably would have been possible is to mount with -badname=*. This way you can see (and delete) the corrupt files. They will be marked with GOCRYPTFS_BAD_NAME in the file name.
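
Once the filesystem is mounted that way, the marked entries can be listed with a few lines of Python. This is a sketch under the assumption stated above, namely that affected entries contain GOCRYPTFS_BAD_NAME in their file name; the function name is made up, and the code only lists entries without deleting anything.

```python
import os

MARKER = "GOCRYPTFS_BAD_NAME"


def find_bad_names(mountpoint):
    """Return paths under mountpoint whose file name contains the marker."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(mountpoint):
        for name in dirnames + filenames:
            if MARKER in name:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Reviewing the returned list before removing anything keeps the deletion step a deliberate, manual decision.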

OdinVex commented 1 year ago

I couldn't delete anything, couldn't move anything, or edit anything/copy anything.

Oh, that sounds bad, sorry to hear that! Do you remember how this happened?

I rebooted.

What probably would have been possible is to mount with -badname=*. This way you can see (and delete) the corrupt files. They will be marked with GOCRYPTFS_BAD_NAME in the file name.

Already deleted all of the corrupt items. I keep external backups now on a VC-encrypted drive, plus a VC-encrypted SAN volume.

rfjakob commented 1 year ago

A reboot caused corruption!? What gocryptfs version is this?

OdinVex commented 1 year ago

A reboot caused corruption!? What gocryptfs version is this?

I suspect filesystem corruption, though fsck on main filesystem (also an md-raid) was fine, so I don't know. As for version, I don't know when the corruption started outside of a few days ago and nothing has happened other than a reboot (after which, I always immediately re-mount my Plasma Vault). Current version claims gocryptfs v2.4.0; go-fuse [vendored]; 2023-06-15 go1.20.5 linux/amd64. gocryptfs fsck needs work, to provide a lot more debug information about any/all failures, including full cipher entry path uri, what it should have been (to the best able to be described), and of course the nature of the issue. Manjaro x64, latest.

rfjakob commented 1 year ago

So the reboot was a power loss / crash? Are you using xfs as the backing filesystem by any chance?

About the fsck output, I thought it's already pretty comprehensive, but I'll check again. I guess the ciphertext paths are missing.

OdinVex commented 1 year ago

So the reboot was a power loss / crash? Are you using xfs as the backing filesystem by any chance? About the fsck output, I thought it's already pretty comprehensive, but I'll check again. I guess the ciphertext paths are missing.

It was a safe shutdown and reboot. gocryptfs fsck output is actually quite sparse. It doesn't print out full paths to the corrupt cipher entries. I had to build an array of all possible filenames based on the corrupt ones (fortunately exact match) and delete them. Wrote a hard-coded script to automate and judge that with bail-outs everywhere, just in case. After the gocryptfs volume acted funny, I did a main fs fsck, it was fine, checked journal, no damage or repairs. But none of that is the point. The point is that gocryptfs fsck is too limited in its ability to fix stuff, and gocryptfs is a bit broken if you can't edit/copy/delete/move something because of a corrupt entry: you're grid-locked and have to dismount, find the corrupt entries, delete them, then try to go on your merry way without them. It isn't a healing filesystem, of course.

Edit: No clue why it happened, not even concerned with that (assuming it could've been anything on my own end as far as a rogue poorly-escaped shell line, who knows), but the point is entirely about gocryptfs's healing (if it ever aims to) and fsck being too limited to repair anything, currently.
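
The lookup step of the workaround described above (matching known ciphertext file names against the backing directory, with bail-outs) might be sketched in Python like this. It is not the script that was actually used; the function name and the exactly-one-match rule are assumptions, and the code only locates entries without deleting them.

```python
import os


def locate_cipher_entries(cipherdir, names):
    """Map each ciphertext file name to the single path where it was found.

    Raises ValueError (the bail-out) if a name is missing or ambiguous.
    """
    wanted = set(names)
    found = {name: [] for name in wanted}
    for dirpath, _dirnames, filenames in os.walk(cipherdir):
        for name in filenames:
            if name in wanted:
                found[name].append(os.path.join(dirpath, name))
    for name, paths in found.items():
        if len(paths) != 1:
            raise ValueError(f"{name}: expected 1 match, found {len(paths)}")
    return {name: paths[0] for name, paths in found.items()}
```

Refusing to continue on a missing or duplicated name means a typo in the list cannot silently select the wrong file.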

rfjakob / gocryptfs

fsck: option to delete or repair corrupt files #263