Open lap1nou opened 2 months ago
Very good job @lap1nou !
Nice catch! Question: instead of deleting individual files, wouldn't it be better to simply wipe the contents of the /tmp folder in the post-install phase? That way we'd know for sure there aren't any files left. I can't check at the moment.
already the case
@lap1nou any idea why the files you removed on your PR were still here?
Greetings,
@ShutdownRepo yes, I think it's a Docker specificity: you are indeed removing them, but this happens in another layer (each RUN command in the Dockerfile creates a new layer). The files are still present in one of the earlier layers, so they still take up space in the final Docker image.
References:
I'm not a Docker expert so I might be wrong, although the size of the image does indeed shrink with these changes.
Regards.
any idea why the files you removed on your PR were still here?
Indeed, Docker stores every file of the layer at the end of each RUN command, so removing them in the last layer is not enough.
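The layer behaviour described above can be illustrated with a toy model. This is not Docker itself, only a sketch under the assumption that each layer stores the files its RUN step added and that a later deletion is recorded as a small marker file (the `.wh.` prefix follows the whiteout convention of layered image formats). The directory and file names are made up for illustration.

```shell
#!/bin/sh
# Toy model (not Docker itself): each "layer" is a directory holding the files
# that one RUN step added. The image size is the sum of all its layers.
work=$(mktemp -d)

# Layer 1: a RUN step downloads a 1 MiB archive and leaves it in /tmp.
mkdir -p "$work/layer1/tmp"
head -c 1048576 /dev/zero > "$work/layer1/tmp/archive.tar"

# Layer 2: a later RUN step deletes the archive. Only an empty deletion
# marker is recorded here; layer 1 is left untouched.
mkdir -p "$work/layer2/tmp"
: > "$work/layer2/tmp/.wh.archive.tar"

# Sum the bytes stored across all layers.
image_bytes=$(find "$work" -type f -exec cat {} + | wc -c | tr -d ' ')
echo "image size: $image_bytes bytes"

rm -rf "$work"
```

In this model the archive still counts towards the total even though the final filesystem no longer shows it, which is why the cleanup has to happen in the same RUN step that created the files.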
Thanks @lap1nou for the PR, don't hesitate to submit more if you have time to dive into it ;)
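Instead of merging this PR, I think a refactoring of the function post_install should be considered, as this function already takes care of wiping /tmp.
It means that the function post_install should:
* not be called at the latest layer, but at the end of each layer to avoid the oversize
* include the work done in [Reduce image size - remove pip cache dir #341](https://github.com/ThePorgs/Exegol-images/pull/341)

It could also be a new dedicated function like clean_docker_layer.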
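A minimal sketch of what such a dedicated function could look like. Only the name `clean_docker_layer` comes from the suggestion above; the body, the way directories are passed as arguments, and the example paths are assumptions, not the project's actual implementation.

```shell
#!/bin/sh
# Hypothetical sketch: empty the given directories without removing them.
# The real post_install / clean_docker_layer in the project may differ.
clean_docker_layer() {
    for dir in "$@"; do
        # Skip paths that do not exist in this layer
        [ -d "$dir" ] || continue
        # -mindepth 1 keeps the directory itself and deletes only its contents
        find "$dir" -mindepth 1 -delete
    done
}
```

It would be called as the last step of each RUN command, for example `some_install_step && clean_docker_layer /tmp /root/.cache/pip` (both the step name and the paths are illustrative), so that temporary files never get committed to a layer.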
I agree @gbe,
However, I have to say that removing the pip folders after they are written is less efficient than telling pip not to write them in the first place, but a clean function would indeed be clearer, more readable and more maintainable.
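As a sketch of "telling pip not to write them in the first place": pip's documented `--no-cache-dir` option disables the cache for a given command. The wrapper name `pip_install_nocache` is hypothetical and only there to show where the flag goes.

```shell
#!/bin/sh
# Hypothetical wrapper: install packages without populating pip's cache.
pip_install_nocache() {
    # --no-cache-dir is a standard pip option that disables the cache
    python3 -m pip install --no-cache-dir "$@"
}
```

With this approach nothing is written to the cache directory during the RUN step, so there is nothing to clean up afterwards for pip.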
Hey,
Here is some more space we could save:
* [ ] Use `GOFLAGS='-ldflags=-s'` before any `go build / install`, this should save around 4 MB per binary (it would save more combined with the `-w` LDFLAGS but I can't use it as the `go` command is handled by `catch_and_retry()`), this should represent something like 150 MB in total
* [ ] Remove `bundler` cache with `--no-cache` (or remove the cache folder from a dedicated function as suggested)
* [ ] Run `npm prune --production`, this should remove all devDependencies (I think this should save around 200 MB)

I need your feedback to know which of these optimisations I can apply to this PR, I already tested them a bit offline and it seems to work fine.
My `full` image seems to be at 41 GB now but this may not be precise.
I had already suggested in a previous PR to remove the .git folders but @ShutdownRepo wants to keep them so users can pull the latest versions if needed.
Instead of merging this PR, I think a refactoring of the function post_install should be considered, as this function already takes care of wiping /tmp.
It means that the function post_install should:
* not be called at the latest layer, but at the end of each layer to avoid the oversize
* include the work done in [Reduce image size - remove pip cache dir #341](https://github.com/ThePorgs/Exegol-images/pull/341)

It could also be a new dedicated function like clean_docker_layer.
agreed let's do that
Hey,
Here is some more space we could save:
* [ ] Use `GOFLAGS='-ldflags=-s'` before any `go build / install`, this should save around 4 MB per binary (it would save more combined with the `-w` LDFLAGS but I can't use it as the `go` command is handled by `catch_and_retry()`), this should represent something like 150 MB in total
* [ ] Remove `bundler` cache with `--no-cache` (or remove the cache folder from a dedicated function as suggested)
* [ ] Run `npm prune --production`, this should remove all devDependencies (I think this should save around 200 MB)
I need your feedback to know which of these optimisations I can apply to this PR, I already tested them a bit offline and it seems to work fine.
My `full` image seems to be at 41 GB now but this may not be precise.
Let's do 2 (bundler) and 3 (npm prune)
I had already suggested in a previous PR to remove the .git folders but @ShutdownRepo wants to keep them so users can pull the latest versions if needed.
Yes, imo we need to keep .git. But I'd be curious to have a benchmark of how much space we could gain, because if we were to save a lot of space, maybe it would be wiser to reconsider.
Anyways, thank you @gbe and @lap1nou for sharing your insight and investing your time in this thread!! Reducing the size of the images is important for the project and its users.
Good, I'm going to apply all of this. The bundler change could be applied in the clean layer function, but not the npm one.
To answer your question @ShutdownRepo, the .git folders take 1.52 GB on the `full` image. Here is the script used (remove the last result, as we don't count the .git folder of the mounted resources):
#!/bin/bash
# Find all .git folders and sum their sizes (in bytes)
total_size=0
while IFS= read -r -d '' git_folder; do
    # du -b reports the apparent size in bytes (GNU coreutils)
    git_size=$(du -b -s "$git_folder" | awk '{print $1}')
    total_size=$((total_size + git_size))
done < <(find / -type d -name ".git" -print0 2>/dev/null)
# Display the total size in bytes
echo "Total size of all .git folders: $total_size bytes"
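A variant of the script above, sketched as a function that takes the root to scan and an optional path to skip, so the mounted resources folder does not have to be subtracted by hand. The function name `git_dirs_size` and both parameters are assumptions for illustration. It relies on GNU `du -b` like the original and assumes paths contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: print the total apparent size (in bytes) of every .git
# directory under $1, skipping anything located below the optional path $2.
git_dirs_size() {
    root="$1"
    exclude="${2:-}"
    find "$root" -type d -name ".git" -print 2>/dev/null |
    while IFS= read -r git_folder; do
        if [ -n "$exclude" ]; then
            # Ignore .git folders under the excluded path
            case "$git_folder" in
                "$exclude"/*) continue ;;
            esac
        fi
        du -sb "$git_folder"
    done | awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

A call such as `git_dirs_size / /path/to/mounted/resources` (the second path is a placeholder) would then report the total without the excluded folder.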
Maybe the .git folders could be left exclusively in the nightly image for users willing to have cutting edge versions?
Description
Greetings,
This PR aims to reduce the size of the image (by around 775 MB) by removing files that aren't used but still take up space in the layers of the final image.
Image size before: ~25.34 GB, after: ~24.374 GB. I used dive (https://github.com/wagoodman/dive) to find out which files were wasting space.
I also have other optimisations that could reduce the size of the image, I can create another PR if this one is merged.
Regards.
Related issues
N/A
Point of attention
I only ran dive on the `web` image. I will probably use it on the `full` image, if you find this PR interesting, in order to find more optimizations.