xd009642 / tarpaulin

A code coverage tool for Rust projects
https://crates.io/crates/cargo-tarpaulin
Apache License 2.0
2.5k stars, 180 forks

huge file size in tarpaulin builds #516

Closed xMAC94x closed 2 years ago

xMAC94x commented 4 years ago

Hey devs, we did some tests in our CI, where we run `cargo check`, `build`, `test`, `bench`, `doc`, release builds, and tarpaulin. We stored their caches in different directories:

$ tree -L 2
.
|-- cache-all
|   |-- debug
|   `-- release
|-- cache-release-linux
|   `-- release
|-- cache-release-macos
|   |-- release
|   `-- x86_64-apple-darwin
|-- cache-release-windows
|   |-- release
|   `-- x86_64-pc-windows-gnu
|-- cache-tarpaulin
|   `-- debug

with the following sizes:

$ du -sh *
5.3G    cache-all
1.3G    cache-release-linux
1.2G    cache-release-macos
1.2G    cache-release-windows
11G     cache-tarpaulin

We noticed that tarpaulin alone uses 11 GB, while all the other builds combined (check, build, test, bench, doc) take only about 5 GB. In this issue we wanted to ask whether there is room for improvement. We know that tarpaulin of course needs to add some information to the binaries in order to measure coverage, but let's look closer:

$ cache-tarpaulin/debug# du -sh *
500M    build
5.7G    deps
4.0K    examples
4.1G    incremental

We see about 6 GB of dependencies and 4 GB of incremental builds:

$ cache-tarpaulin/debug/deps# ls -Shl
total 5.7G
-rwxr-xr-x 2 root root  585M Jul  7 12:28 veloren_voxygen-a83d23850890f9ee
-rw-r--r-- 1 root root  219M Jul  7 12:27 libveloren_voxygen-99fa318ed0535ffc.rlib
-rw-r--r-- 1 root root  212M Jul  7 12:27 libveloren_voxygen-c940f6851bc3d210.rlib
-rwxr-xr-x 1 root root  187M Jul  7 12:28 veloren_voxygen-3329630e0989a9f9
-rwxr-xr-x 1 root root  145M Jul  7 12:26 veloren_client-723270e98d6d62c5
-rwxr-xr-x 1 root root  143M Jul  7 12:27 veloren_server-0549e449137df30c
-rw-r--r-- 1 root root  137M Jul  7 12:25 libveloren_server-ab80dc9c64adfe14.rlib
-rw-r--r-- 1 root root  132M Jul  7 12:25 libveloren_server-2a0d58b13bb14580.rlib
-rwxr-xr-x 1 root root  119M Jul  7 12:26 veloren_world-4cf4a3e00d663e42
-rw-r--r-- 1 root root  103M Jul  7 12:23 libgtk-5f86ecb13cac07aa.rlib
-rw-r--r-- 1 root root  102M Jul  7 12:24 libgtk-5f12019f6217edef.rlib
-rwxr-xr-x 1 root root   90M Jul  7 12:24 integration-4439d9a92a137e7b
-rw-r--r-- 1 root root   89M Jul  7 12:24 libveloren_common-b328f57ca788792e.rlib
-rw-r--r-- 1 root root   86M Jul  7 12:25 libveloren_common-6115d7ac45ca4614.rlib
-rw-r--r-- 1 root root   72M Jul  7 12:26 libveloren_world-98f55c44726caace.rlib
-rw-r--r-- 1 root root   71M Jul  7 12:26 libveloren_world-1031254064696422.rlib
-rwxr-xr-x 1 root root   51M Jul  7 12:23 closing-74f274e9c3d629bb
-rwxr-xr-x 1 root root   44M Jul  7 12:25 veloren_common-3b8b497e3122ba3d
-rwxr-xr-x 1 root root   44M Jul  7 12:21 libdiesel_derives-acf9b65d516c16df.so
...

These big files add up to about 2581 MB of the 5.7 GB total.

I am wondering about two things here.

First, let's compare with the normal cargo build sizes:

$ cache-all/debug/deps# ls -Shl
total 1.8G
-rw-r--r-- 1 root root   32M Jul  7 12:02 libveloren_voxygen-99fa318ed0535ffc.rlib
-rw-r--r-- 2 root root   29M Jul  7 12:02 libveloren_voxygen-c940f6851bc3d210.rlib
-rwxr-xr-x 2 root root   27M Jul  7 12:03 veloren_voxygen-a83d23850890f9ee
-rw-r--r-- 1 root root   26M Jul  7 12:01 libgtk-5f12019f6217edef.rlib
-rw-r--r-- 1 root root   25M Jul  7 12:01 libveloren_common-b328f57ca788792e.rlib
-rw-r--r-- 1 root root   25M Jul  7 12:01 libgtk-5f86ecb13cac07aa.rlib
...
-rw-r--r-- 1 root root   18M Jul  7 12:00 libgtk-5f12019f6217edef.rmeta
-rw-r--r-- 1 root root   18M Jul  7 12:00 libgtk-5f86ecb13cac07aa.rmeta
...
-rw-r--r-- 1 root root   14M Jul  7 11:56 libdiesel-ad6c877cc5d19df1.rlib

If we don't measure coverage for those libs, couldn't we use the smaller versions of them?
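The two listings share artifact names (same crate, same hash), so the growth can be tabulated file by file. Below is a minimal sketch of such a comparison. The function name `compare_deps` is made up for illustration, and it assumes both directories contain plain files with identical names.

```shell
# Hypothetical helper: for every file present in both directories, print
# "<bytes in first dir>\t<bytes in second dir>\t<file name>",
# largest second-directory size first.
compare_deps() {
  base_dir=$1
  other_dir=$2
  for f in "$base_dir"/*; do
    [ -f "$f" ] || continue                    # skip directories and empty globs
    name=${f##*/}                              # strip the leading path
    [ -f "$other_dir/$name" ] || continue      # only compare files present in both
    a=$(($(wc -c < "$f")))                     # size in bytes, whitespace-safe
    b=$(($(wc -c < "$other_dir/$name")))
    printf '%d\t%d\t%s\n' "$a" "$b" "$name"
  done | sort -k2,2nr
}
```

With the layout from this issue it could be called as `compare_deps cache-all/debug/deps cache-tarpaulin/debug/deps`.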

Also, when inspecting the cache-tarpaulin/debug/incremental directory, I noticed that it has a lot of duplicate folders, e.g.:

26M     veloren_voxygen-11dxozsyonp5x
301M    veloren_voxygen-1kgnv3waswtoz
40M     veloren_voxygen-1ofw4sg2egpcz
473M    veloren_voxygen-2pndicdynx6t0
464M    veloren_voxygen-3cb6v82afvb1k
338M    veloren_world-35lnid4gbvwhd
302M    veloren_world-3p6wlvl9al6t0
277M    veloren_world-z5wvp4cjry4b
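The eight folders listed here add up to roughly 1.3 GB for veloren_voxygen and 0.9 GB for veloren_world. To get the same per-crate totals for the whole directory, the `du` output can be grouped by crate name. This is a rough sketch: `group_by_crate` is a made-up name, and it assumes every folder is named `<crate>-<hash>` with no spaces in the path.

```shell
# Hypothetical helper: read `du -sk`-style lines ("<KiB> <path>") on stdin and
# print "<total KiB>\t<folder count>\t<crate>" per crate, largest first.
group_by_crate() {
  awk '{
    name = $2
    sub(/.*\//, "", name)       # drop leading directories
    sub(/-[^-]*$/, "", name)    # drop the trailing -<hash> suffix
    total[name] += $1
    count[name]++
  }
  END {
    for (n in total) printf "%d\t%d\t%s\n", total[n], count[n], n
  }' | sort -rn
}
```

For the cache in this issue it could be used as `du -sk cache-tarpaulin/debug/incremental/* | group_by_crate`.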

Some finishing words: of course we are aware that you are probably not the cargo developers and don't know all the internals. However, we are creating 20 GB Docker images in order to provide our runners with caches, and we just wanted to ask you to look for some optimization potential. Reducing it to something like 16 GB would already be a great win :) With that said: have a nice day :)

xd009642 commented 4 years ago

So tarpaulin uses link-dead-code as a linker option because otherwise it doesn't detect unused functions in your own code. I don't think there's a way to specify a linker flag and not have it propagate to all the dependencies so this would mean all the unused parts of your dependencies can be linked in as well. I'd guess this is what creates the bloat.

Alternatively, I could look at removing the flag and using the source-analysis part of tarpaulin to identify every line you can hit but that's a significant amount of work and might not be robust against things like macros. I've been wondering for a while if that was a worthwhile route to take so I'll start to look towards prototyping something for it and let you know if I make some progress :+1:

Also, side note: you don't need `RUSTFLAGS="--cfg procmacro2_semver_exempt"` for tarpaulin since proc macros reached stable, so you can remove that unless it's needed for your own code to build.
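To check how much of the growth comes from link-dead-code on its own, the test binaries can be built twice into separate target directories, once without the flag and once with it. This is only a sketch. The function name and the `target-plain` and `target-ldc` directories are made up, and it isolates the one rustc flag rather than reproducing tarpaulin's full build configuration.

```shell
# Hypothetical helper: build the test binaries with and without
# `-C link-dead-code`, then print the size of each deps directory.
compare_link_dead_code() {
  # Baseline build; RUSTFLAGS is left as it is in the calling environment.
  CARGO_TARGET_DIR=target-plain cargo test --no-run || return 1
  # Same build with dead-code linking; RUSTFLAGS applies to every crate,
  # dependencies included.
  RUSTFLAGS="-C link-dead-code" CARGO_TARGET_DIR=target-ldc cargo test --no-run || return 1
  du -sh target-plain/debug/deps target-ldc/debug/deps
}
```

Run `compare_link_dead_code` from the workspace root. Both builds start from scratch, so it takes a while on a project of this size.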

xMAC94x commented 4 years ago

Hi @xd009642, thank you for your answer, it helps me a lot in understanding what is going on (: I am looking forward to hearing from you how the prototyping works out. And thanks for the tip regarding the RUSTFLAGS ;) Have a nice evening

xd009642 commented 2 years ago

So I'm going to close this as inactive, but I am watching some RFCs on crate-specific RUSTFLAGS, which will dramatically reduce file sizes! Unfortunately, my other experiments in this area didn't work.