microsoft / BuildXL

Microsoft Build Accelerator
MIT License
902 stars 141 forks source link

FileContentTable does not work optimally when file belongs to more than one FileContentTable #1297

Closed vladimir-cheverdyuk-altium closed 2 years ago

vladimir-cheverdyuk-altium commented 2 years ago

Imagine that I have some shared file and I have 2 different BuildXL projects but they both include that shared file as input. When I started a first BuildXL project, it will call FileContentTable.RecordContentHash on that shared file and that in turn will create close record and return its USN.

Then I will start second BuildXL project, it will call FileContentTable.RecordContentHash on that shared file and that in turn will create new close record and return new USN. Please note that this USN is different from USN we get for this file when I started first project.

Now if I start first BuildXL project, USN stored in FileContentTable became "stale" and as result, it will be re-hashed again and added by FileContentTable.RecordContentHash and that will add new close record and as result generate new USN that is different from what second project is record.

I think you already see that pattern here. File is not changed but it constantly re-hashing because I run first project and then second.

It is possible to reproduce just by using project and switching cacheDirectory between 2 different directories.

In my case 2 small C++ projects are using boost source code as shared input and it constantly re-hashing and it takes considerable time.

narasamdya commented 2 years ago

BuildXL has a feature (enabled by default) that will update the file content table by scanning the USN journal. This update happens when we initialize the scheduler (before we execute processes). The update should have taken care of Close record as well. So, I wonder why it doesn't work in your case.

Is there anyway I can repro the issue? It would be great if you have a small repro.

vladimir-cheverdyuk-altium commented 2 years ago

You can run HelloWorld project and pass /cacheDirectory:C:\Temp\1. Then run it again but pass cacheDirectory:C:\Temp\2. Then when you run it with cacheDirectory:C:\Temp\1 again, put breakpoint in TryGetKnownContentHash and RecordContentHash and you will see that it rebuilds and recalculates hashes. If you change yet again to cacheDirectory:C:\Temp\2 it will recalculate again.

narasamdya commented 2 years ago

Sorry for the delay.

I did some investigation on this issue. In short the hash recalculation is due to file rewrites in the HelloWorld project. However, there's a bit of non-trivial interplay with the file content table.

Note that, although you change the location of cache directory, BuildXL is running in a server mode, and all builds are using the same file content table (FCT) stored in the engine state of BuildXL server.

In the HelloWorld project, we have the following graph

helloPip -> file.out -> worldPip -> file.out

Note that worldPip is rewritting the output of helloPip.

You can enable more verbose log to see the following information by using /diag:Storage.

Recall that FCT is basically a map: FileID -> (USN, Hash).

Build 1: /cacheDirectory:Temp\1 In this build both pips execute, and each record its output to FCT:

// helloPip executes and records the following entry to FCT:
Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-0011000000250672 on D85EDD1F5EDCF76A) ->@ 0000001515205DB8 => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

// worldPip executes and records the following entry to FCT:
Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-001400000025067C on D85EDD1F5EDCF76A) ->@ 0000001515206F68 => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

The format is <path> (id <fileid> on <volume_serial> -> @ <usn> => content_hash.

After this build, our FCT now has the above two entries.

Build 2: /cacheDirectory:Temp\2

Because you change the cache directory, both pips will have cache misses.

// File content table is used from the engine state stored in BuildXL server.
Load file content table. Source: 'FromEngineState'. Loading time: 4ms

// helloPip has a cache miss because you change the cache directory.
// helloPip executes and produces file.out, with different file ID (but same hash as before), and records it to FCT:
Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-000F0000002506CF on D85EDD1F5EDCF76A) ->@ 0000001515212340 => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

// worldPip has a cache miss because you change the cache directory.
// worldPip executes and produces file.out, with different file ID (but same hash as before), and records it to FCT.
Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-00110000002506D5 on D85EDD1F5EDCF76A) ->@ 0000001515213258 => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

Note that the corresponding file.outs have the same hash as in Build 1, but because the pips execute they generate new files, which result in new file IDs.

At this point, our FCT has 4 entries due to 4 different file IDs.

Build 3: /cacheDirectory:Temp\1

In this build both pips will get cache hit. For both pips, the output file.out needs to be deployed from the cache. helloPip needs to deploy its output from the cache because the one on disk was the result of worldPip from the previous build. Similarly, since worldPip rewrites file.out, the final file.out needs to be deployed from the cache.

// File content table is used from the engine state stored in BuildXL server.
Load file content table. Source: 'FromEngineState'. Loading time: 0ms

// Temp\1 has file change tracker (different from file content table) that tells you file.out produced in build 1 has changed due to build 2.
Hard link 'E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out' has changed because its file id has changed, perhaps due to deletion or its target is modified

// helloPip tries to check if it can reuse file.out on disk, but FCT says that it has a different hash than expected.
Hit (matched known USN): Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-00110000002506D5 on D85EDD1F5EDCF76A) @ 0000001515213258 => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

// helloPip deploys outputs from the cache, but the one from the cache has a different file ID, it's the file ID in build 1.
// That file id has already existed in the FCT, with the same hash, but different USN, so BuildXL simply updates the USN.
Usn number is changed but content is unchanged: Path => E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out OldUSN => 0000001515206958 NewUSN => 0000001515217B38 ContentHash => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-0011000000250672 on D85EDD1F5EDCF76A) ->@ 0000001515217B38 => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

// worldPip tries to check if it can reuse file.out on disk, but FCT says that it has a different hash than expected. 
// Note that now the file ID is the one that helloPip just deployed from the cache.
Hit (matched known USN): Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-0011000000250672 on D85EDD1F5EDCF76A) @ 0000001515217B38 => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

// worldPip deploys outputs from the cache, but the one from the cache has a different file ID, it's the file ID in build 1.
// That file id has already existed in the FCT, with the same hash, but different USN, so BuildXL simply updates the USN.
Usn number is changed but content is unchanged: Path => E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out OldUSN => 0000001515211920 NewUSN => 0000001515217C28 ContentHash => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-001400000025067C on D85EDD1F5EDCF76A) ->@ 0000001515217C28 => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

Note that in the above trace the OldUSN of the file ID is different from the ones recorded in build 1. There are some changes that affect the USN but do not affect the content. The FCT update (by journal scanning using the file change tracker) feature that I mentioned in the above thread allows FCT to be updated with the latest USN if the change doesn't affect the content. This is visible in BuildXL.stats file:

// These two entries are for (id 0000000000000000-0011000000250672 on D85EDD1F5EDCF76A) and
// (id 0000000000000000-001400000025067C on D85EDD1F5EDCF76A)
FileContentTable.NumUpdatedUsnEntriesByJournalScanning=2

At this point, our FCT still have 4 entries because in the build above we simply update the entries instead of adding new ones.

Build 4: /cacheDirectory:Temp\2

This build has the same behavior as build 3. However, instead of dealing with files with file IDs produced in build 1, this build deals with files with file IDs produced in build 2

// File content table is used from the engine state stored in BuildXL server.
Load file content table. Source: 'FromEngineState'. Loading time: 1ms

// Temp\2 has file change tracker (different from file content table) that tells you file.out produced in build 2 has changed due to build 3.
Hard link 'E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out' has changed because its file id has changed, perhaps due to deletion or its target is modified

// helloPip tries to check if it can reuse file.out on disk, but FCT says that it has a different hash than expected.
Hit (matched known USN): Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-001400000025067C on D85EDD1F5EDCF76A) @ 0000001515217C28 => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

// helloPip deploys outputs from the cache, but the one from the cache has a different file ID, it's the file ID in build 2.
// That file id has already existed in the FCT, with the same hash, but different USN, so BuildXL simply updates the USN.
Usn number is changed but content is unchanged: Path => E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out OldUSN => 0000001515212C10 NewUSN => 000000151521C150 ContentHash => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-000F0000002506CF on D85EDD1F5EDCF76A) ->@ 000000151521C150 => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

// worldPip tries to check if it can reuse file.out on disk, but FCT says that it has a different hash than expected. 
// Note that now the file ID is the one that helloPip just deployed from the cache.
Hit (matched known USN): Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-000F0000002506CF on D85EDD1F5EDCF76A) @ 000000151521C150 => 3102042CB36F615A0C18B770CD71553C9F42167AA0F4DFE827E8043C0B86AAB600

// worldPip deploys outputs from the cache, but the one from the cache has a different file ID, it's the file ID in build 2.
// That file id has already existed in the FCT, with the same hash, but different USN, so BuildXL simply updates the USN.
Usn number is changed but content is unchanged: Path => E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out OldUSN => 0000001515217A98 NewUSN => 000000151521C240 ContentHash => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

Record known USN: Path E:\BuildXLInternalRepos\BuildXLInt1\Examples\HelloWorld\Out\Bin\file.out (id 0000000000000000-00110000002506D5 on D85EDD1F5EDCF76A) ->@ 000000151521C240 => A6AC609612A5447CB9C18341CB1CBACA9D62C4A0488BC2AB39D03A91621F8C8000

Also BuildXL.stat has the following entries:

// These two entries are for (id 0000000000000000-000F0000002506CF on D85EDD1F5EDCF76A) and
// (id 0000000000000000-00110000002506D5 on D85EDD1F5EDCF76A)
FileContentTable.NumUpdatedUsnEntriesByJournalScanning=2
narasamdya commented 2 years ago

In my case 2 small C++ projects are using boost source code as shared input and it constantly re-hashing and it takes considerable time.

It would be great if you can share your example so that I can try to repro it on my end.

vladimir-cheverdyuk-altium commented 2 years ago

Sorry I cannot share project, but it is easily reproducible with hello world project. And I would like to state that I was talking about input files, ffles that are not changed by build process.

I executed following command in helloworld directory:

"%BUILDXL_BIN%\bxl.exe" /c:config.dsc /server- /cacheDirectory:C:\Temp\Cache\1 /diag:Storage

then I renamed Out to Out.1, changed cache directory and executed these:

"%BUILDXL_BIN%\bxl.exe" /c:config.dsc /server- /cacheDirectory:C:\Temp\Cache\2 /diag:Storage

then I renamed Out to Out.2, changed cache directory to original and executed these:

"%BUILDXL_BIN%\bxl.exe" /c:config.dsc /server- /cacheDirectory:C:\Temp\Cache\1 /diag:Storage

After that Out.1\Logs\BuildXL.log from first command contains following lines:

[0:01.344] verbose DX0718: Miss (unknown file ID): Path C:\Projects\BuildXL\Test2\config.dsc (id 0000000000000000-000E000000194EEA on 8AC4B6BAC4B6A839) (no USN => content mapping known)
[0:01.362] verbose DX0716: Checkpoint (new USN): 0000000BD87EE568
[0:01.365] verbose DX0717: Record known USN: Path C:\Projects\BuildXL\Test2\config.dsc (id 0000000000000000-000E000000194EEA on 8AC4B6BAC4B6A839) ->@ 0000000BD87EE568 => C632B38FDCEB70179858A873A9489C4A911C98F60D02020E567E0EC8A28EFE4500

Everything fine, BuildXL caches it first time and registers USN 0000000BD87EE568 for config.dsc

Here is Out.2\Logs\BuildXL.log after second command:

[0:01.327] verbose DX0713: Read USN: (id 0000000000000000-000E000000194EEA) @ 0000000BD87EE568
[0:01.329] verbose DX0718: Miss (unknown file ID): Path C:\Projects\BuildXL\Test2\config.dsc (id 0000000000000000-000E000000194EEA on 8AC4B6BAC4B6A839) (no USN => content mapping known)
[0:01.346] verbose DX0716: Checkpoint (new USN): 0000000BD88202D0
[0:01.349] verbose DX0717: Record known USN: Path C:\Projects\BuildXL\Test2\config.dsc (id 0000000000000000-000E000000194EEA on 8AC4B6BAC4B6A839) ->@ 0000000BD88202D0 => C632B38FDCEB70179858A873A9489C4A911C98F60D02020E567E0EC8A28EFE4500

As you can see, it cannot find USN in cache and that is correct because it is empty. Then BuildXL registers new checkpoint and this will change USN of config.dsc to 0000000BD88202D0

And lastly, here is Out\Logs\BuildXL.log from last command:

[0:01.337] verbose DX0715: Miss (unknown USN): Path C:\Projects\BuildXL\Test2\config.dsc (id 0000000000000000-000E000000194EEA on 8AC4B6BAC4B6A839) @ 0000000BD88202D0 (known 0000000BD87EE568 => C632B38FDCEB70179858A873A9489C4A911C98F60D02020E567E0EC8A28EFE4500)
[0:01.355] verbose DX0716: Checkpoint (new USN): 0000000BD884D240
[0:01.357] verbose DX0932: Usn number is changed but content is unchanged: Path => C:\Projects\BuildXL\Test2\config.dsc OldUSN => 0000000BD87EE568 NewUSN => 0000000BD884D240 ContentHash => C632B38FDCEB70179858A873A9489C4A911C98F60D02020E567E0EC8A28EFE4500
[0:01.359] verbose DX0717: Record known USN: Path C:\Projects\BuildXL\Test2\config.dsc (id 0000000000000000-000E000000194EEA on 8AC4B6BAC4B6A839) ->@ 0000000BD884D240 => C632B38FDCEB70179858A873A9489C4A911C98F60D02020E567E0EC8A28EFE4500

As you can see, after I switched to original cache directory, BuildXL tries to find USN 0000000BD88202D0 but it is missing in cache because original cache has USN 0000000BD87EE568 for config.dsc. So BuildXL creates USN 0000000BD884D240 for this file. And if I switch cache directory to second one, it will not be valid again.

And as result, input file config.dsc was not changed but its USN changed 3 times because of this close record.

I cannot use server mode, because I would like to have 2 BuildXL be able to run in parallel.

narasamdya commented 2 years ago

FCT is updated by journal scanning after we build the graph. config.dsc is an input file used for building the graph, and thus we don't update FCT for such inputs.

If the content of such files don't change, I hope that in the last build you still get graph cache hit. I.e., BuildXL doesn't evaluate the .dsc spec to build the graph, but the graph is fetched from the cache. Let me know if that doesn't happen in your case.

vladimir-cheverdyuk-altium commented 2 years ago

Sorry, perhaps I don't understand you correctly, but you said that BuildXL does not update FCT for such inputs, but in my case it clearly updates FCT for config.dsc and logs above showing that BuildXL records and updates FCT.

My point that by recording USN, BuildXL changes USN and if more than one cache is used then file will be constantly re-hashing.

Just to prove it, I ran Process Monitor to check how many read operations happened. When config.dsc is in FCT cache then there is only 2 read operations (one to read 481 bytes and second looks like EOF read). Looks like it is BuildXL is reading and parsing its content (or perhaps something else is reading it). When config.dsc is not in FCT cache then there are 4 read operations because it has to be re-hashed.

So, if I run first command, there is 4 read operations. It is normal. When I run second command, there are still 4 read operations. And it also normal. But third command also does 4 read operations, even file is not changed. So even there is hash for config.dsc in first FCT, due to recording hash to second FCT it effectively invalidates FCT entry in first cache.

But, If I run third command again, without changing cache directory, then there will be only 2 read operations. And it is how it suppose to be.

This is obviously is not a big issue for hello world but when BuildXL has to re-hash whole boost (13K files) every single run, it obviously takes some time.

And I would like to state again, that I didn't change content of any files when I executed these commands . I only rename Out directory so I can see and compare log clearly.

narasamdya commented 2 years ago

In short BuildXL treats config.dsc in a different way than other .dsc files or other build input files.

Before going into that treatment, let me first mention about BuildXL's file change tracker (or tracker). Similar to file content table (FCT), the tracker tracks file IDs and their tracked USNs, but unlike FCT, it doesn't have content hash info. The tracker also has information about tracked directories and the fingerprints of their memberships. Importantly, the tracker has a USN checkpoint that tells BuildXL the starting point for journal scanning; once scanning is done, the checkpoint is moved forward.

BuildXL has two instances of tracker: one for tracking graph input files (the .dsc files), and the other for tracking build input files. The former tracker can be used to check up-to-dateness of the graph input files (without computing their content hashes). When those files are up-to-date, BuildXL can reuse the graph constructed in the previous build, which is stored in the object folder. The latter tracker is mainly used for pruning the build graph if we know that some pips are already up-to-date. For naming purpose, let's name the former tracker spec tracker, and the latter build tracker.

Now, for the 'special' treatment to config.dsc, here's roughly what happens when we run BuildXL

  1. BuildXL reads/parses/interprets config.dsc. When reading config.dsc the file content table (FCT) and the tracker have not been created/loaded, so the content hash or USN is not recorded anywhere. After config.dsc has been loaded, BuildXL can decided whether to create/load FCT. The spec tracker has not been loaded yet (may be we should).

  2. BuildXL tries to reuse previously built graph. Since constructing the build graph can take a lot of time if you have a big build (we have a customer with ~40,000 build specs), BuildXL tries to check if it can reuse the graph constructed in the previous build(s). It has two mechanisms for that. First, it checks if the graph stored in the object folder (the last recently constructed graph) can be reused. If failed, BuildXL then checks if any graph stored in the cache can be reused; this graph may have been constructed by a very old build. The latter check is more expensive than the former, but still cheaper than constructing the whole graph again (see CachedGraphProvider.cs). The spec tracker is loaded during the former check.

  3. BuildXL constructs the build graph if attempt to reuse failed. BuildXL reads/parses/interprets the .dsc files (other than config.dsc - it's been read before) to construct a build graph. On reading any .dsc file, BuildXL records its hashes into the FCT and tracks it in the spec tracker. Note that since config.dsc has been loaded before, it is never tracked by the spec tracker.

  4. BuildXL updates file content table using the build tracker (not spec tracker). Before BuildXL executes the pips in the build graph, it scans the journal from the checkpoint tracked by the build tracker, and updates file content table for files tracked by the build tracker if those files have USN changes, but those changes do not affect the contents. By scanning the journal, we know which files may have changed, and if those files are inputs to some pips, then we mark those pips dirty. We then only prune pips from the build graph if the pips are not dirty (or up-to-date).

I mentioned above that in steps (1) & (3) config.dsc is not recorded in FCT and is not tracked by tracker. But your log shows that config.dsc is recorded in FCT. The question is when does it get recorded in FCT? In step (2) both graph reuse checks require an initial graph fingerprint, and the content hash of config.dsc is included in the computation of that fingerprint. During that computation, because only FCT is available (the spec tracker has not been loaded yet), we use FCT to GetAndRecordContentHash config.dsc. That's why your log shows that config.dsc getting recorded in FCT.

The next question is when do we use the spec tracker? We use the spec tracker during the first graph reuse check, i.e., we check if the last recently constructed graph (stored in object folder) can be reused. At this point we want to check the up-to-dateness of the specs without computing their content hashes. To this end, we scan the journal from the checkpoint tracked by the spec tracker. By journal scanning, BuildXL knows which spec files, tracked by the spec tracker, have changed. For such files, BuildXL consult FCT to make sure if those change affect their contents; if they do, then we cannot use the graph. Also, by journal scanning, BuildXL also updates the FCT for spec files tracked by the spec tracker. Since config.dsc is never tracked by the spec tracker, then its record in FCT never gets updated. That's why your logs keep showing that config.dsc is recorded in FCT. I don't expect that to happen to other spec files, as well as non-spec files. But if it does, then please let us know.

Hope this helps.

vladimir-cheverdyuk-altium commented 2 years ago

Ok, I understand you explanation, and when I was playing with other files to get different logs for you, I found following.

But just to be on the same page, here is my setup. I went to BuildXL GitHub repo, downloaded code with tag Prod-buildxl.0.1.0-20210903.0, recompiled it with bxl.cmd -min After ran the same commands as above. And check last log. I found 666 entries "Usn number is changed but content is unchanged" in log of last command. As far as I see they are files from Out directory of BuildXL.

Here is one example. Log from Out.1:

[0:01.336] verbose DX0718: Miss (unknown file ID): Path C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll (id 0000000000000000-00050000001A2426 on 8AC4B6BAC4B6A839) (no USN => content mapping known)
[0:01.337] verbose DX0717: Record known USN: Path C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll (id 0000000000000000-00050000001A2426 on 8AC4B6BAC4B6A839) ->@ 0000000BE331DB48 => 4BAAF338C4C20381C2EFB322BA5420C51677E68EF7B2774304EB9850C7362E7C00

Log from Out.2:

[0:01.337] verbose DX0718: Miss (unknown file ID): Path C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll (id 0000000000000000-00050000001A2426 on 8AC4B6BAC4B6A839) (no USN => content mapping known)
[0:01.338] verbose DX0717: Record known USN: Path C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll (id 0000000000000000-00050000001A2426 on 8AC4B6BAC4B6A839) ->@ 0000000BE334C600 => 4BAAF338C4C20381C2EFB322BA5420C51677E68EF7B2774304EB9850C7362E7C00

Log from Out:

[0:01.380] verbose DX0715: Miss (unknown USN): Path C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll (id 0000000000000000-00050000001A2426 on 8AC4B6BAC4B6A839) @ 0000000BE334C600 (known 0000000BE331DB48 => 4BAAF338C4C20381C2EFB322BA5420C51677E68EF7B2774304EB9850C7362E7C00)
[0:01.381] verbose DX0932: Usn number is changed but content is unchanged: Path => C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll OldUSN => 0000000BE331DB48 NewUSN => 0000000BE337A318 ContentHash => 4BAAF338C4C20381C2EFB322BA5420C51677E68EF7B2774304EB9850C7362E7C00
narasamdya commented 2 years ago

Just to clarify so that I can repro your tests. So, you ran bxl.cmd -min 3x.

Did you change the cache directory, from cacheDir1 (for Out.1) to cacheDir2 (for Out.2), and back to cacheDir1 (for Out)? Did you use /server-?

vladimir-cheverdyuk-altium commented 2 years ago

Ok, here are steps.

  1. I downloaded source code zip from tag Prod-buildxl.0.1.0-20210903.0 and run bxl.cmd -min once
  2. I copied Hello World example to different directory C:\Temp\Test and make it current.
  3. I created/emptied C:\Temp\Cache
  4. Set BUILDXL_BIN environment variable to point to BuildXL-Prod-buildxl.0.1.0-20210903.0\Out\Bin\Debug\win-x64 directory that compiled on step # 1
  5. I ran "%BUILDXL_BIN%\bxl.exe" /c:config.dsc /server- /cacheDirectory:C:\Temp\Cache\1 /diag:Storage
  6. I renamed C:\Temp\Test\Out directory in to C:\Temp\Test\Out.1
  7. I ran "%BUILDXL_BIN%\bxl.exe" /c:config.dsc /server- /cacheDirectory:C:\Temp\Cache\2 /diag:Storage
  8. I renamed C:\Temp\Test\Out directory in to C:\Temp\Test\Out.2
  9. I ran "%BUILDXL_BIN%\bxl.exe" /c:config.dsc /server- /cacheDirectory:C:\Temp\Cache\1 /diag:Storage

So, C:\Temp\Test\Out.1 contains logs and output from step # 5, C:\Temp\Test\Out.2 logs and output from step # 7 and C:\Temp\Test\Out contains logs and output from step # 9

narasamdya commented 2 years ago

Thanks for the detailed steps.

The file that you showed in your log C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\Polly.Contrib.WaitAndRetry.dll (and possibly other files) is part of BuildXL deployment.

Recall that BuildXL tries to reuse previously constructed build graph. To that end, BuildXL calculates an initial fingerprint for the graph. I mentioned that the content hash of config.dsc is used as part of that fingerprint calculation. Another component used for the fingerprint calculation is build engine's ID. This ID is computed from the content hashes of files included in the BuildXL deployment. You can check the files included in this deployment in C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\[BuildXL.ServerDeployment.manifest | BuildXL.Deployment.manifest]

Also recall that when we calculate the initial fingerprint for the build graph, the file content table is already loaded and can be used to get/record content hashes. However, the file change tracker is not loaded yet. So, those files in the deployment are not tracked and their entries in the file content table cannot be updated in the next build.

For bits deployed from our release pipeline, we create BuildXL.manifest that contains git commit id from which the bits were produced. So, that BuildXL can use that commit id information as the build engine ID, instead of the content hashes of files in the deployment manifest.

If you look at your C:\Projects\BuildXL\Clean\Out\Bin\Debug\win-x64\BuildXL.manifest, you will find something like

BuildXL
Microsoft (R) Build Accelerator
0.1.0-devBuild
Developer Build
bxl.exe
bxlAnalyzer.exe

The text Developer Build is used as a replacement for the git commit id, and its presence indicates that BuildXL needs to use the content hashes of files in the deployment manifest as build engine ID.

BuildXL.manifest is generated by Public\Src\Branding\branding.dsc. Before you run bxl.cmd -min, you may want to set %[BuildXL.Branding]SourceIdentification% to a git commit id of downloaded BuildXL. You will then have BuildXL.manifest containing git commit id instead of Developer Build, and thus that git commit id will be used as build engine ID.

vladimir-cheverdyuk-altium commented 2 years ago

Thank you for detailed explanation. Now I understand it much better!

As for original issue, looks like I cannot reproduce it on clean build of BuildXL from tag Prod-buildxl.0.1.0-20210903.0. Everything works as you explained. Perhaps I messed it up somehow.

I really appreciate your explanation of all these details. Now I have much better understanding of how BuildXL works. I will close this issue.

narasamdya commented 2 years ago

You're welcome! :-)