microsoft / git

A fork of Git containing Microsoft-specific patches.
http://git-scm.com/

status: Hydration percentage not printed when gvfs status cache is used #670

Open jeffhostetler opened 2 months ago

jeffhostetler commented 2 months ago

In [1] we added code to print the GVFS hydration percentage in git status when core.virtualfilesystem is set. GVFS users may not see this message because of the "GVFS Status Cache" feature of the "GVFS Mount Daemon".

The GVFS mount daemon periodically runs git status --serialize in the background (after certain file system operations) and writes the result to a "status-cache-file". When GVFS users run git status, it silently assumes --deserialize, asks the mount daemon whether there is a valid cache file, and simply prints the cached result on the console. The foreground status command does not scan the worktree and does not even load the index. All it needs to do is decode the cache file and print it.

Since the foreground status command does not load the index, it cannot compute the hydration percentage. (To make matters more confusing, in the foreground status command, in wt_status_get_state() and wt_status_check_sparse_checkout(), the variable r->index is non-null, but the fields within it are zero.) Therefore r->index->cache_nr == 0, and so state->sparse_checkout_percentage is set to SPARSE_CHECKOUT_DISABLED. As a result, in wt_status_print() we DO NOT emit the Trace2 sparse-checkout/percentage. We also DO NOT print any of the messages in show_sparse_checkout_in_use().
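
To illustrate the chain of reasoning above, here is a minimal, simplified model of the check (not the actual code in wt-status.c). The struct, the function name, and the MODEL_PERCENTAGE_DISABLED constant are hypothetical stand-ins for r->index, wt_status_check_sparse_checkout(), and SPARSE_CHECKOUT_DISABLED. The point is only that an index that was never loaded looks the same as an empty one, so the percentage ends up disabled.

```c
#include <stddef.h>

/* Hypothetical stand-in for SPARSE_CHECKOUT_DISABLED; the value is an assumption. */
#define MODEL_PERCENTAGE_DISABLED (-1)

/* Hypothetical, reduced view of the index: only the two counts we need. */
struct model_index {
	size_t cache_nr;      /* number of index entries; 0 if the index was never loaded */
	size_t skip_worktree; /* entries that are not present in the worktree */
};

/*
 * Return the percentage of index entries present in the worktree, or
 * MODEL_PERCENTAGE_DISABLED when it cannot be computed (sparse mode is
 * off, or there are no entries and we would divide by zero).
 */
int model_hydration_percentage(const struct model_index *idx, int sparse_enabled)
{
	if (!sparse_enabled || idx->cache_nr == 0)
		return MODEL_PERCENTAGE_DISABLED;

	return (int)(100 - (100 * idx->skip_worktree) / idx->cache_nr);
}
```

With a zeroed struct (the deserialize case described above), the function returns the disabled value no matter what the real repository looks like, so nothing downstream has a percentage to print.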

(The background git status --serialize command will still log them to Trace2, so we can still get telemetry. Its stdout is closed, though, so no one will see the new printed messages, and interactive users won't see the new feature.)

[1] https://github.com/microsoft/git/compare/78b268cc35d3bca418c38f4c2bb8798268858ea2...7975c98ed878b00ed538baa1bccaaaefc988a5ef

jeffhostetler commented 2 months ago

I don't want to force the foreground status command to read the index just to have index stats -- reading the index on the Windows repo is very expensive because of the size.

It would be relatively easy to add another field to the serialization format and let the background status command add it to the cache file. In the deserialize code, parse the value into a global variable (sigh, I know). Then in wt_status_check_sparse_checkout() check the global variable, set the state->sparse_checkout_percentage, and return -- before the first if statement in the existing function.
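
A rough sketch of what that could look like, under stated assumptions: the "hydration_percentage <n>" header line, the global variable, and both function names are hypothetical and do not exist in the current serialization format or in wt-status.c. The real deserialize code would do the parsing, and the real wt_status_check_sparse_checkout() would do the early return.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel meaning "the cache file carried no percentage". */
#define MODEL_PERCENTAGE_UNSET (-1)

/* Hypothetical global, filled in by the deserialize code. */
int model_cached_hydration_percentage = MODEL_PERCENTAGE_UNSET;

/*
 * Parse one cache-file header line of the assumed form
 * "hydration_percentage <n>". Returns 1 if the line was recognized and
 * the value (0..100) was stored in the global, 0 otherwise.
 */
int model_parse_hydration_line(const char *line)
{
	static const char key[] = "hydration_percentage ";
	const char *start;
	char *end;
	long value;

	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return 0;

	start = line + sizeof(key) - 1;
	errno = 0;
	value = strtol(start, &end, 10);
	if (errno || end == start || (*end != '\0' && *end != '\n'))
		return 0;
	if (value < 0 || value > 100)
		return 0;

	model_cached_hydration_percentage = (int)value;
	return 1;
}

/*
 * Early check for the top of the sparse-checkout function: if the
 * deserializer supplied a value, store it in *out and return 1 so the
 * caller can return before touching the index. Otherwise return 0 and
 * leave *out alone.
 */
int model_check_cached_percentage(int *out)
{
	if (model_cached_hydration_percentage == MODEL_PERCENTAGE_UNSET)
		return 0;

	*out = model_cached_hydration_percentage;
	return 1;
}
```

The caller would pass a pointer to state->sparse_checkout_percentage and return immediately when the check succeeds, so the foreground command never needs the index for this.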

jeffhostetler commented 2 months ago

BTW, I already have some hydration stats in the data stream, so there may be some overlap.

https://github.com/microsoft/git/blob/vfs-2.45.2/virtualfilesystem.c#L374-L381