thought-machine / please

High-performance extensible build system for reproducible multi-language builds.
https://please.build
Apache License 2.0
2.45k stars 203 forks source link

Please does not correctly rebuild when local file change in subrepo #2943

Open izissise opened 10 months ago

izissise commented 10 months ago

Example repo -> https://github.com/Wuageorg/please_issue_2943

command please build && bash plz-out/bin/dir1/this_subrepo/bin.sh

It will display

hello world1

as intended the first time, now modify dir1/script.sh as indicated and redo please build && bash plz-out/bin/dir1/this_subrepo/bin.sh it will still display

hello world1

full output

❯ please build && bash plz-out/bin/dir1/this_subrepo/bin.sh 
Build finished; total time 10ms, incrementality 100.0%. Outputs:
//dir1:dir1:
//dir1:subrepo:
hello world1
❯ # modify dir1/script.sh
❯ please build && bash plz-out/bin/dir1/this_subrepo/bin.sh 
Build finished; total time 10ms, incrementality 100.0%. Outputs:
//dir1:dir1:
//dir1:subrepo:
hello world1

the BUILD file https://github.com/Wuageorg/please_issue_2943/blob/main/dir1/BUILD does something akin to go-rule's go_repo https://github.com/please-build/go-rules/blob/master/build_defs/go.build_defs#L1150

Tatskaari commented 10 months ago

Yeah, I've noticed this. I think Please checks target outputs when building a target: https://github.com/thought-machine/please/blob/master/src/build/incrementality.go#L84

I don't think this picks this case up though, unfortunately.The path is a directory and the directory still exists. We'd have to walk the dir tree and check the hash sum against the expected hash sum for that dir or something. I think this could have a performance impact if the directory tree is large.

I've been saying that modifying the targets in the subrepos directory is a bad idea, and people should just not do that, but I've also had a few engineers internally have those files mysteriously modified. I think this can bite you in subtle and hard to debug ways, so maybe we should try and figure out if there's something we can do here.

izissise commented 10 months ago

It seems to me, that subrepos should check if the rule declared as dep needsBuilding

Tatskaari commented 10 months ago

I think it does, unless I'm mistaken. We queue the subrepo target up to be built just like normal, when we hit Parse() for the subrepo label.

I think I have an idea for what we can do here, that would bring local execution closer to how remote execution works. With remote execution, we're changing the way subrepo outputs are handled, which will actually resolve this issue. I think we might be able to do something similar with local execution.

With remote execution, we have two caches. One is content addressable, and the other is an action cache keyed by the action digest (which essentially equates to the same thing as the rule hash returned by plz query hash). We build up the rule hash which can be determined locally from repo state. From where we can fetch the action result from the action cache, which contains, amongst other things, a merkle tree representing the output directory of the action. We can use this to fetch the outputs from the content addressable storage (CAS).

We're aiming to take advantage of this for remote execution, so that we don't download subrepos to plz-out at all. We instead load things from the CAS (caching to disk for performance) directly, so files can't be modified locally (unless you go edit things under ~/.cache/please).

We could change the way the local directory cache works, so it's content addressable in the same way. We would have a target cache that maps the rule's hash to the output merkle tree. This merkle tree could would then load the BUILD file or .build_defs files from the CAS based on the recorded output hashes, so if the files changed in plz-out this would no longer affect parsing. In theory, we could use the same directory protos locally as we do remotely.

I think this is a fairly chunky bit of work but is probably worth doing. What do you think @peterebden ?

izissise commented 10 months ago

Retaking the example repo https://github.com/Wuageorg/please_issue_2943

the dependency graph is

$ please query deps //dir1
//dir1:dir1
  //dir1:subrepo
    ///dir1/this_subrepo//:build

Subrepos could implies dependencies on their declared dep. In this case this would mean the subrepo ///dir1/this_subrepo would add a dependency on //dir1:subr_srcs triggering a rebuild of the subrepo when script.sh changes since it is in srcs of //dir1:subr_srcs.

Hence the graph for the example would become

$ please query deps //dir1
//dir1:dir1
  //dir1:subrepo
    ///dir1/this_subrepo//:build
      //dir1:subr_srcs
        //dir1:subr_buildfile
izissise commented 10 months ago

Maybe the operation is not clear, I get this bug when modifying the source file dir1/script.sh and not the one inside the subrepo.

It does trigger a rebuild for //dir1:subr_srcs build_rule, but does not rebuild the subrepo tree. That's reason I believe there is a missing dependency