gitbutlerapp / gitbutler

The GitButler version control client, backed by Git, powered by Tauri/Rust/Svelte
https://gitbutler.com

Corrupted virtual_branches.toml file #5391

Open krlvi opened 1 week ago

krlvi commented 1 week ago

Under some rare conditions the .git/gitbutler/virtual_branches.toml file becomes corrupted. From Discord: https://discord.com/channels/1060193121130000425/1206670506271707156/1301690618606518273

It appears that in that particular instance, the ownership field contained invalid data (the raw contents of a source file).

The corrupted virtual_branches.toml file looked like this:

[branches.a00842ba-8513-446d-b14b-ef0305b4eae2]
id = "a00842ba-8513-446d-b14b-ef0305b4eae2"
name = "<name redacted>"
notes = ""
source_refname = "<name redacted>"
upstream = "<name redacted>"
upstream_head = "2e6034b1a252a637ef3041e31133d881941c42a6"
created_timestamp_ms = "1728516455078"
updated_timestamp_ms = "1730333193029"
tree = "c18d786c0c743f85e6b0094700f95f2cbd88f28a"
head = "a8238cd4e4f831cd4613faec34c72f211ec76080"
ownership = ""
<contents of a source file redacted>

<file name redacted>:0-0-d41d8cd98f00b204e9800998ecf8427e
<file name redacted>:0-0-7b41e51a67462fa4ff0df32d4114c30d
<file name redacted>:10-42-666cef007ac3069914eccfc872d66e14,1-9-b8235092a073f8464f8824ef9d58c4e4
<file name redacted>:365-382-1ca02041ee665036095293b74e71c018
<file name redacted>:93-100-2dbe531c0a6411d3689161374f891d98
'''
order = 0
selected_for_changes = 1728516455078
allow_rebasing = true
in_workspace = true

[[branches.a00842ba-8513-446d-b14b-ef0305b4eae2.heads]]
name = "<name redacted>"

The part marked <contents of a source file redacted> is the unexpected bit: it was 42 lines of source code from one of the tracked files. Looking at the code here https://github.com/gitbutlerapp/gitbutler/blob/790ecd9261d0b5cbb01828db4ecb01c46efcd24d/crates/gitbutler-stack/src/stack.rs#L66 I just don't see how this can happen.

krlvi commented 1 week ago

@Byron do you have an intuition about what could possibly be happening here?

Byron commented 1 week ago

Do I understand correctly that the correct version of the ownership field of this file would look like this?

ownership = '''
<file name redacted>:0-0-d41d8cd98f00b204e9800998ecf8427e
<file name redacted>:0-0-7b41e51a67462fa4ff0df32d4114c30d
<file name redacted>:10-42-666cef007ac3069914eccfc872d66e14,1-9-b8235092a073f8464f8824ef9d58c4e4
<file name redacted>:365-382-1ca02041ee665036095293b74e71c018
<file name redacted>:93-100-2dbe531c0a6411d3689161374f891d98
'''

If so, I interpret ownership = "" as an empty string, and then… data just gets plastered all over, and it doesn't make sense to me.

I have no intuition either, but would assume that it can't be memory corruption thanks to Rust. But how actual source code could slip into an ownership claim… I'd probably have to check what the code that creates it looks like, and how it is converted. But what I could never explain is how the file format could be so wrong… unless of course there is a bug in the TOML serialization so that it's not properly dealing with input strings.

Maybe this is two bugs… one that somehow allows parts of source code to enter ownership claims, and another one in the TOML serialization that causes invalid files when a particularly 'special' string is given as the value of a field.
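
To narrow down the first of these two possibilities, one option would be a sanity check on the ownership string before it is serialized. The following is only a sketch, not GitButler's actual code: it assumes each claim line has the shape `<path>:<start>-<end>-<32 hex chars>[,<start>-<end>-<32 hex chars>...]`, which is inferred purely from the lines quoted in this issue. All function names are hypothetical.

```rust
/// True if `s` is a non-empty run of ASCII digits.
fn is_number(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// True if `s` looks like a 32-character lowercase hex digest,
/// as in the ownership lines quoted in this issue (assumed format).
fn is_hex_digest(s: &str) -> bool {
    s.len() == 32
        && s.bytes()
            .all(|b| b.is_ascii_digit() || (b'a'..=b'f').contains(&b))
}

/// Checks one hunk of the assumed form `<start>-<end>-<hash>`.
fn is_valid_hunk(hunk: &str) -> bool {
    let mut parts = hunk.splitn(3, '-');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(start), Some(end), Some(hash)) => {
            is_number(start) && is_number(end) && is_hex_digest(hash)
        }
        _ => false,
    }
}

/// Checks one claim line of the assumed form `<path>:<hunk>[,<hunk>...]`.
fn is_valid_claim_line(line: &str) -> bool {
    // Split on the last ':' so that a path containing ':' is still accepted.
    match line.rsplit_once(':') {
        Some((path, hunks)) => {
            !path.is_empty() && hunks.split(',').all(|h| is_valid_hunk(h))
        }
        None => false,
    }
}

/// Returns the 1-based numbers of all non-empty lines that do not
/// look like ownership claims (for example, stray source code).
fn suspicious_lines(ownership: &str) -> Vec<usize> {
    ownership
        .lines()
        .enumerate()
        .filter(|(_, l)| !l.trim().is_empty() && !is_valid_claim_line(l.trim()))
        .map(|(i, _)| i + 1)
        .collect()
}
```

Calling `suspicious_lines` on the value just before it is written, and logging when the result is non-empty, would at least show whether the source text enters through the ownership value itself or only appears during serialization.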

krlvi commented 1 week ago

My hypothesis is that the app was interrupted somehow while writing to disk, since there was an app update/restart involved 🤔

Byron commented 1 week ago

But aren't writes still done using the tempfile mechanism? It prevents partial writes effectively, at least assuming that mv a b is still an atomic operation that can't be interrupted. That's the case on Unixes at least, but maybe this issue occurred on Windows?
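
For reference, the write-to-tempfile-then-rename pattern I have in mind looks roughly like the sketch below. It uses only the Rust standard library and is not the code GitButler actually uses; the function name `write_atomically` and the `.tmp` suffix are made up for illustration. The rename is atomic on POSIX filesystems; on Windows `std::fs::rename` also replaces an existing target, but I would not assume the same atomicity guarantee there.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `target` by writing a sibling temporary file first
/// and then renaming it over the target, so readers see either the old
/// file or the complete new one.
fn write_atomically(target: &Path, contents: &[u8]) -> io::Result<()> {
    // Hypothetical naming scheme: "<file name>.tmp" in the same directory,
    // so the rename never has to cross a filesystem boundary.
    let mut tmp_name = target
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = target.with_file_name(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;
    // Make sure the data has reached the disk before the rename publishes it.
    tmp.sync_all()?;
    drop(tmp);

    // Replace the target in a single step.
    fs::rename(&tmp_path, target)
}
```

If the process is killed before the rename, the target keeps its previous contents and only a leftover `.tmp` file remains, which is why an interrupted write alone would not explain a half-valid virtual_branches.toml.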