Open eseidel opened 1 year ago
We should, yes. But we'll do this later as part of optimizing our costs.
I looked into this briefly (for iOS) today. I had forgotten that we use ELF as a format for patch distribution, even on iOS. I believe that's the main reason the patches end up larger on iOS is we're diffing an ELF file against a MachO file. We could get around this by instead using our own custom format, which just contained the 4 dart blobs and did not bother to package in elf since we don't need any of the features of elf, just the 4 raw blobs which we load read-only anyways.
I also looked at reducing the LinkTable size and found that moving from 64bit ints to 32bit ints had less savings than I was expecting?
64bit ints: LinkTable size: 196608 bytes
32bit ints: LinkTable size: 131072 bytes
Moving to LEB128: LinkTable size: 131072 bytes
I guess it's padding that's making these so even?
LEB128 code:
diff --git a/pkg/aot_tools/lib/src/linker/linker.dart b/pkg/aot_tools/lib/src/linker/linker.dart
index 279d461bc2c..2307bc0fa3b 100644
--- a/pkg/aot_tools/lib/src/linker/linker.dart
+++ b/pkg/aot_tools/lib/src/linker/linker.dart
@@ -1,5 +1,3 @@
-import 'dart:typed_data';
-
import 'package:aot_tools/src/linker/link_reporter.dart';
import 'package:aot_tools/src/linker/shorebird_linker_overrides.dart';
import 'package:aot_tools/src/snapshot_analysis.dart';
@@ -12,10 +10,51 @@ class ByteWriter {
final bytes = <int>[];
/// Write a 64-bit integer in little endian to the byte stream.
- void addInt64(int value) {
- /// Arm64 is little endian.
- final byteData = ByteData(8)..setInt64(0, value, Endian.little);
- bytes.addAll(byteData.buffer.asUint8List());
+ // TODO(eseidel): Use 32-bit integers or LEB-128 instead to save space.
+ // void addInt64(int value) {
+ // assert(value >= 0 && value <= 0xFFFFFFFFFFFFFFFF, 'value: $value');
+
+ // /// Arm64 is little endian.
+ // final byteData = ByteData(8)..setInt64(0, value, Endian.little);
+ // bytes.addAll(byteData.buffer.asUint8List());
+ // }
+
+ void _writeByte(int value) {
+ assert(value >= 0 && value <= 0xFF, value);
+ bytes.add(value);
+ }
+
+ /// Write one ULEB128 number.
+ void addInt64(int unsignedInt) {
+ // First bit in every byte is a signal if there is more data.
+ // So, we split the number into 7-bit parts and write them smaller to larger,
+ // prefixing with the signal.
+ var value = unsignedInt;
+ assert(value >= 0 && value <= 0xFFFFFFFFFFFFFFFF, 'value: $value');
+ var bytes = 0;
+ for (;;) {
+ var part = 0x7F & value;
+ value >>= 7;
+
+ if (value != 0) {
+ // Signal that there is more data.
+ part |= 0x80;
+ }
+ bytes++;
+ _writeByte(part);
+
+ if (value == 0) {
+ break;
+ }
+
+ if (value == -1) {
+ for (var i = 0; i < 9 - bytes; i++) {
+ _writeByte(0xFF);
+ }
+ _writeByte(1);
+ break;
+ }
+ }
}
}
I think the next thing to test here is what the order of symbols between macho and elf are. If they're not in the same order, that's the easy thing to fix.
We could trivially make our own non-elf replacement format for patches, we would just want to make sure that the "symbol order" in the package matched the order in mach-o for maximum similarity?
macho order:
0000000000004080 T _kDartVmSnapshotInstructions
0000000000011280 T _kDartIsolateSnapshotInstructions
00000000000c4880 S _kDartVmSnapshotData
00000000000cd8c0 S _kDartIsolateSnapshotData
00000000001a4000 b _kDartVmSnapshotBss
00000000001a4018 b _kDartIsolateSnapshotBss
elf order:
00000000000001c8 R _kDartSnapshotBuildId
0000000000000200 R _kDartVmSnapshotData
0000000000009240 R _kDartIsolateSnapshotData
00000000000e0000 T _kDartVmSnapshotInstructions
00000000000ed200 T _kDartIsolateSnapshotInstructions
Yeah, so it looks like mach-o puts instructions before data and elf the opposite way around. That's presumably why the diffs end up extra large. I'm not sure how well zstd handles reordering sections within binary files.
Another test we could do is to write a tiny tool that read in a mach-o snapshot and wrote out an elf snapshot and then ran patch
against the two to see.
Currently patch sizes are often 10s of kb. Which is plenty small for many applications. However we're using a binary diffing algorithm which knows nothing about what it's diffing and could probably be another 10x smaller if it did.