Closed lrvick closed 5 years ago
@lrvick thanks for reporting. Unfortunately I haven't found any free time yet to properly look at the Pixel 3 support (initial support was contributed by other folks). Hopefully I will catch up with the backlog these days.
So based on your output the diff in the framework Jar files is located in the Dex files. They appear to be of identical size, although they differ. Interesting, will investigate further.
For the toolbox symlink issue I'm not sure I understood the anomaly. Do you mean that when you run the scripts sometimes the links are generated against "/vendor/bin/toolbox" and some against "toolbox"? Or that some links are against the former and some against the latter?
Yeah this diff is from two back to back builds with the only difference being they output to different folders. One of them gets the symlinks set one way, and the other gets them set the other.
In addition to breaking determinism, I suspect the symlinks only work half the time too ^_^
Ok that's a new one. Will investigate since I have no idea why this might happen.
Btw are you running on Linux or macOS?
Building inside this container on Debian hosts: https://github.com/hashbang/os/blob/master/Dockerfile
Regarding the Jar files hash mismatch, it's due to the timestamp of the generated files.
--- /dev/fd/63 2018-12-24 10:43:47.712139520 +0200
+++ /dev/fd/62 2018-12-24 10:43:47.712139520 +0200
@@ -1,5 +1,5 @@
-Archive: 1.jar
+Archive: 2.jar
Zip file size: 34772 bytes, number of entries: 2
-rw---- 1.0 fat 45 bx stor 08-Jan-01 00:00 META-INF/MANIFEST.MF
--rw---- 2.0 fat 108208 bl defN 18-Dec-24 10:20 classes.dex
+-rw---- 2.0 fat 108208 bl defN 18-Dec-24 10:25 classes.dex
2 files, 108253 bytes uncompressed, 34512 bytes compressed: 68.1%
When the tool runs the entire workspace is cleaned. Therefore, the Dex files are re-extracted from factory image and thus obtain a new timestamp. Then when appended back to the Jar files the zip entries have a different timestamp, despite the Dex files being identical.
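The effect can be reproduced outside the tool. The snippet below is only an illustration, assuming GNU coreutils and Info-ZIP `zip` are installed; the function name and the payload are made up. It archives the same bytes twice with two different modification times (the 10:20 and 10:25 +0200 values from the listing above) and compares the archive hashes.

```shell
#!/usr/bin/env bash
# Illustration only: same payload, different mtime => different archive.
# Assumes GNU touch/sha256sum and Info-ZIP zip; names are hypothetical.

# Print the SHA-256 of a one-entry archive whose entry mtime is EPOCH.
jar_hash_for_mtime() {
  local epoch="$1" work hash
  work="$(mktemp -d)" || return 1
  printf 'identical payload\n' > "$work/classes.dex"
  touch -d "@$epoch" "$work/classes.dex"
  # -X omits extra fields (uid/gid, extended timestamps); -q is quiet.
  (cd "$work" && zip -q -X demo.jar classes.dex) || { rm -rf "$work"; return 1; }
  hash="$(sha256sum "$work/demo.jar" | cut -d' ' -f1)"
  rm -rf "$work"
  printf '%s\n' "$hash"
}

if command -v zip >/dev/null 2>&1; then
  first="$(jar_hash_for_mtime 1545639600)"   # 2018-12-24 08:20:00 UTC
  second="$(jar_hash_for_mtime 1545639900)"  # five minutes later
  if [ "$first" != "$second" ]; then
    echo "same bytes, different mtime: archives differ"
  fi
else
  echo "zip is not installed; skipping the demonstration"
fi
```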
I can't think of a straightforward way to deal with the enhancement you've requested. However, since I like the idea of reproducible builds, one workaround that might work is to expose an additional argument that lets the user define a reference timestamp. This timestamp is then used for all new files generated by the tool. Unfortunately, depending on usage, this might break the rsync functions when the output directory is set to the AOSP root. Therefore, I cannot enable it by default.
Any further recommendations are more than welcome.
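A rough sketch of what "one reference timestamp for all new files" could look like in plain shell. This is not the tool's code: `stamp_tree` is a hypothetical helper, and GNU find/touch/stat are assumed.

```shell
#!/usr/bin/env bash
# Sketch: force every file under DIR to one reference mtime.
# Assumes GNU find/touch/stat; stamp_tree is a hypothetical helper.

stamp_tree() {
  local dir="$1" epoch="$2"
  # Reject anything that is not plain epoch seconds.
  case "$epoch" in
    ''|*[!0-9]*) echo "timestamp must be epoch seconds" >&2; return 1 ;;
  esac
  # -h changes a symlink itself instead of the file it points to.
  find "$dir" -exec touch -h -d "@$epoch" {} +
}

demo="$(mktemp -d)"
mkdir -p "$demo/vendor/framework"
echo dex > "$demo/vendor/framework/classes.dex"
stamp_tree "$demo" 1543792453
# Show the resulting mtime (epoch seconds) next to the file name.
stat -c '%Y %n' "$demo/vendor/framework/classes.dex"
rm -rf "$demo"
```

Because every mtime is rewritten, anything that decides what to copy based on modification time (rsync being the obvious case) may behave differently afterwards, which matches the concern about not enabling this by default.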
Btw I wasn't able to reproduce the symlink issue yet.
So the AOSP build system will normalize timestamps globally to either right now, or whatever the value of $BUILD_DATETIME (epoch) is, if it is set.
Anyone who is doing reproducible builds will have BUILD_DATETIME set, so I would suggest respecting that and builds should behave as expected.
BUILD_DATETIME controls the timestamps of AOSP-generated files and will not make any difference in the case we're examining here. We're talking about the timestamps of Zip entries inside a Zip archive. No matter how many times you change the timestamp of the Zip file itself, its entries will not get updated by AOSP (even if re-signed).
To update the timestamp of the Zip entries, some manual work is needed. Since I don't want to integrate any Python (or similar) scripts, the only way I know of to accomplish this in bash is to manually set the timestamp of a file and then update the corresponding zip entry. To do that, though, the script requires a reference timestamp. I cannot make it dynamic based on BUILD_DATETIME, since the files will already be generated when AOSP kicks in. Maybe some shell automation inside the makefiles would make it possible, although I don't like this road since it's fragile.
The best I can think of for your request is that you invoke the tool with a timestamp argument which should match the one you set in AOSP. I cannot handle it automatically since I want to maintain a portable solution. But from your side you can plug-in any changes or scripts you might need for your purposes.
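The two-step idea (set the file's timestamp, then update the zip entry) might look roughly like the following. It is a sketch under assumptions, not the tool's implementation: `set_entry_time` is a hypothetical helper, and GNU touch plus Info-ZIP `zip` are assumed. Zip entries store local time without a zone, so the sketch also pins `TZ`.

```shell
#!/usr/bin/env bash
# Sketch of the two-step approach: pin the file's mtime, then (re)add it
# so the zip entry picks the pinned time up.
# Assumes GNU touch/sha256sum and Info-ZIP zip; the helper is hypothetical.

export TZ=UTC   # zip entry times are local time, so fix the zone

# Usage: set_entry_time JAR FILE EPOCH   (FILE is stored without its path)
set_entry_time() {
  local jar="$1" file="$2" epoch="$3"
  touch -d "@$epoch" "$file" || return 1
  # Default zip mode replaces an existing entry of the same name.
  # -j stores just the file name, -X drops extra fields, -q is quiet.
  zip -q -X -j "$jar" "$file"
}

if command -v zip >/dev/null 2>&1; then
  for run in 1 2; do
    work="$(mktemp -d)"
    printf 'dex bytes\n' > "$work/classes.dex"      # fresh "extraction"
    zip -q -X -j "$work/out.jar" "$work/classes.dex" # entry has today's time
    set_entry_time "$work/out.jar" "$work/classes.dex" 1545897471
    sha256sum "$work/out.jar" | cut -d' ' -f1        # compare across runs
    rm -rf "$work"
  done
fi
```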
In my case, my build system does have BUILD_DATETIME available when vendor is built, which happens before the rest of AOSP, so execute-all.sh will see that var.
I am happy to call execute-all.sh with "--timestamp $BUILD_DATETIME" if that is how you prefer to implement it, though. No biggie.
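On the caller's side, forwarding the variable could be as small as the sketch below. `timestamp_args` is a hypothetical helper, and `--timestamp` is the option name proposed in this thread; the validation step is an extra assumption about what a sensible value looks like (epoch seconds).

```shell
#!/usr/bin/env bash
# Sketch: forward BUILD_DATETIME to the tool only when it is set.
# timestamp_args is a hypothetical helper; --timestamp is the option
# proposed in this thread.

timestamp_args() {
  [ -n "${BUILD_DATETIME:-}" ] || return 0      # unset: pass nothing
  case "$BUILD_DATETIME" in
    *[!0-9]*) echo "BUILD_DATETIME must be epoch seconds" >&2; return 1 ;;
  esac
  printf '%s %s\n' --timestamp "$BUILD_DATETIME"
}

BUILD_DATETIME=1543792453
# Word splitting of the substitution is safe here: the value is digits only.
echo "would run: ./execute-all.sh --device crosshatch $(timestamp_args)"
```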
@lrvick sounds good. On it.
@lrvick can you give it a spin? I think it should cover your request.
$ ./execute-all.sh -d crosshatch -b pq1a.181205.006 -o $(pwd) -i crosshatch/pq1a.181205.006/crosshatch-pq1a.181205.006-factory-96b23504.zip --timestamp 1545897471
$ git log HEAD -n1
4cd75bb - U Anestis Bechtsoudis - (HEAD -> master, origin/master, origin/HEAD) Option to timestamp the generated bytecode files (15 hours ago)
$ ./execute-all.sh --debugfs --yes --device crosshatch --buildID pq1a.181205.006 --output out1 --timestamp 1543792453
...
$ ./execute-all.sh --debugfs --yes --device crosshatch --buildID pq1a.181205.006 --output out2 --timestamp 1543792453
...
$ diffoscope --exclude-directory-metadata out1/crosshatch/pq1a.181205.006/vendor/google_devices out2/crosshatch/pq1a.181205.006/vendor/google_devices
No diff!
For reasons I don't at all understand, this seems to have cleared up the symlink issues as well. Closing this :)
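For anyone without diffoscope at hand, a much coarser check along the same lines can be sketched with standard tools. The helper names are hypothetical and GNU find/sort/xargs/sha256sum are assumed. It compares regular file contents only, so it ignores timestamps and symlink destinations.

```shell
#!/usr/bin/env bash
# Sketch: a coarse stand-in for diffoscope that compares two output trees
# by content hash. Assumes GNU find/sort/xargs/sha256sum; names are
# hypothetical.

# Print "HASH  ./relative/path" for every regular file, in a stable order.
tree_manifest() {
  (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum)
}

# Succeed only when both trees hold the same files with the same bytes.
same_tree() {
  [ "$(tree_manifest "$1")" = "$(tree_manifest "$2")" ]
}

a="$(mktemp -d)"; b="$(mktemp -d)"
echo payload > "$a/vendor.img"
echo payload > "$b/vendor.img"
if same_tree "$a" "$b"; then echo "trees match"; else echo "trees differ"; fi
rm -rf "$a" "$b"
```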
Weird. So the outputs in the out directory -are- actually identical once execute-all is done running.
When the vendor directory is ingested and placed into target_files dirs for release, the symlink issue manifests:
diffoscope target1/VENDOR target2/VENDOR
--- target1/VENDOR
+++ target2/VENDOR
├── bin
│ ├── dd
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
│ ├── egrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/grep
│ │ +destination: grep
│ ├── fgrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/grep
│ │ +destination: grep
│ ├── getevent
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
│ ├── getprop
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
│ ├── newfs_msdos
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
This is the same diff I find in final binaries as well.
I don't know Android well but I feel like the multiple references to these paths create some sort of race condition, or some issue that manifests only on the -second- build.
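To watch just the symlink destinations across builds, a small listing helper may be enough. This is a sketch assuming GNU find; `list_symlinks` is a made-up name. The demo recreates the two variants seen in the diff above, one absolute and one relative.

```shell
#!/usr/bin/env bash
# Sketch: list every symlink with its stored destination so two trees
# can be compared with plain diff. Assumes GNU find; names are hypothetical.

# Print "relative/path -> destination" for each symlink under DIR.
list_symlinks() {
  find "$1" -type l -printf '%P -> %l\n' | LC_ALL=C sort
}

t1="$(mktemp -d)"; t2="$(mktemp -d)"
mkdir -p "$t1/bin" "$t2/bin"
ln -s /vendor/bin/toolbox "$t1/bin/dd"   # absolute destination
ln -s toolbox "$t2/bin/dd"               # relative destination
list_symlinks "$t1"
list_symlinks "$t2"
rm -rf "$t1" "$t2"
```

In bash, `diff <(list_symlinks target1/VENDOR) <(list_symlinks target2/VENDOR)` would then show only the links whose destinations changed.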
@lrvick Ok, now I understand what you mean regarding the symbolic links. AOSP is already generating the toybox & toolbox items. Therefore, the ones created by the scripts are redundant and get overwritten by AOSP anyway. I'm not sure if the double definition is the root cause of the different output across builds. In any case, with commit https://github.com/anestisb/android-prepare-vendor/commit/94bde98d9ed1d974310ced411cf56c894195cdfe toybox & toolbox are no longer processed, so it's left to AOSP to handle them. If you still see different outputs, that is an AOSP issue.
Be sure to make clean builds before comparing two outputs.
Okay, three different back-to-back builds, with three different symlink outcomes. This screams race condition to me, but I don't know enough internals to track it down just yet.
$ diffoscope target1/VENDOR target2/VENDOR
--- target1/VENDOR
+++ target2/VENDOR
├── bin
│ ├── dd
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
│ ├── egrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/grep
│ │ +destination: grep
│ ├── fgrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/grep
│ │ +destination: grep
│ ├── getevent
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
│ ├── getprop
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
│ ├── newfs_msdos
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/toolbox
│ │ +destination: toolbox
$ diffoscope target2/VENDOR target3/VENDOR
--- target2/VENDOR
+++ target3/VENDOR
├── bin
│ ├── dd
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: toolbox
│ │ +destination: /vendor/bin/toolbox
│ ├── getevent
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: toolbox
│ │ +destination: /vendor/bin/toolbox
│ ├── getprop
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: toolbox
│ │ +destination: /vendor/bin/toolbox
│ ├── newfs_msdos
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: toolbox
│ │ +destination: /vendor/bin/toolbox
$ diffoscope target1/VENDOR target3/VENDOR
--- target1/VENDOR
+++ target3/VENDOR
├── bin
│ ├── egrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/grep
│ │ +destination: grep
│ ├── fgrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: /vendor/bin/grep
│ │ +destination: grep
Ok let me put a fresh set of builds to check if I can reproduce for crosshatch.
Those builds did -not- include 94bde98. Didn't see that in time. Trying again with it.
Did 2 builds with 94bde98 and now only the grep symlink issue is left:
--- target1/VENDOR
+++ target2/VENDOR
├── bin
│ ├── egrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: grep
│ │ +destination: /vendor/bin/grep
│ ├── fgrep
│ │┄ symlink
│ │ @@ -1 +1 @@
│ │ -destination: grep
│ │ +destination: /vendor/bin/grep
@lrvick I think the second try was not a clean build. As a result, the target module from AOSP is considered already built, while the redundant vendor links generated by this tool are always executed and so override the previous entry. It's not a race condition. I've verified with 2 clean builds and the output is identical.
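A toy model of that explanation, with no AOSP involved: two made-up "rules" write the same link path, one always runs and the other is skipped once a stamp file exists. Which rule writes the relative destination, and the order in which they run, are assumptions chosen purely for illustration.

```shell
#!/usr/bin/env bash
# Toy model only: two rules define the same symlink. Rule names, the
# stamp file, the rule order and the destinations are all assumptions.

# Rule that always runs, standing in for the links emitted by the tool.
vendor_rule() {
  ln -sfn /vendor/bin/toolbox "$1/dd"
}

# Rule that is skipped once a stamp file says its output is up to date,
# standing in for a module the build system already considers built.
platform_rule() {
  if [ ! -e "$1/.platform.stamp" ]; then
    ln -sfn toolbox "$1/dd"
    : > "$1/.platform.stamp"
  fi
}

# Run both rules in a fixed order and print the resulting destination.
build() {
  vendor_rule "$1"
  platform_rule "$1"
  readlink "$1/dd"
}

out="$(mktemp -d)"
build "$out"    # clean build: both rules run, the platform rule wins
build "$out"    # incremental build: only the vendor rule takes effect
rm -rf "$out"
```

In this model the result is fully deterministic for a given starting state, yet it differs between a clean and an incremental build, which would look like flapping when builds are compared back to back.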
In any case, it was a good observation, since it led me to remove some components that are already available in AOSP. https://github.com/anestisb/android-prepare-vendor/commit/e085541ee77fcf35f6ba268bb2b506046b07388a should remove the grep bin (and the egrep/fgrep links) too.
Clean build or not, a rebuild should always produce the same output, not alter symlink paths. Before e085541 I also saw three builds in a row with different grep symlink outcomes.
In any event, as of e085541 I can no longer reproduce, so all good here. Thanks for chasing this!
So comparing 2 output directories for crosshatch with diffoscope we get:
If we look at the generated vendor files we find that the symlink paths for toolbox are not consistent. These are defined in both Android.mk and proprietary/device-vendor.mk. Possible race condition?
The above seems to be the last thing standing in the way of my goal to reach a fully deterministic pie build that will be easy for others to reproduce and verify.
Even if you can only point me in the right direction for solving this it would be appreciated :)