Open gasinvein opened 4 years ago
It might be possible to make shallow clones of submodules with --shallow-submodules. (https://www.git-scm.com/docs/git-clone#Documentation/git-clone.txt---no-shallow-submodules)
It could be an upstream enhancement (relevant lines):
Though I am new to flatpak.
flatpak-builder should make shallow clones by default when possible (there is an option to disable it explicitly). Do you think it might not be using it for submodules?
I think it isn't, but I am not familiar with the codebase, and shallow submodules are a separate setting from the --depth shallow fetching of the main repo.
As far as I understand, flatpak-builder doesn't clone git repos with submodules, but instead extracts the submodule list from the main repo and mirrors each submodule individually. So any recursion options should be irrelevant in this case. I could be wrong, though.
@barthalion My local tests suggest that running flatpak-builder with --disable-updates almost completely removes the issue. I'm guessing Flathub's build bot doesn't use this option? If so, maybe it should, given that it downloads sources prior to starting the build?
Not really sure. Sources are pre-downloaded, but build machines also have a local cache. What happens if the requested commit is not available in the local clone?
I'm not sure what the local cache is in this context. Aren't sources downloaded anew on each build? If so, how could the requested commit be unavailable?
I mean, if we run something like flatpak-builder --download-only, and then flatpak-builder --disable-updates, everything should be in place?
I've looked at this again and yes, sources are downloaded as a separate step, but on a mirror node, not on the runners. So passing --disable-updates will just cause f-b to fail due to missing source code on the actual builders.
This is getting worse over time as new components are added to Proton (increasing the number of modules in this flatpak).
Basically we do git fetch m*s times, where m is the number of flatpak-builder modules and s is the number of git submodules in the source repo, so each addition to either increases build times significantly.
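To make the m*s growth concrete, here is a minimal sketch of the arithmetic. The function name and the module and submodule counts are made up for illustration and are not measured from the Proton manifest.

```python
def total_fetches(modules: int, submodules: int) -> int:
    """Number of git fetches if every flatpak-builder module
    re-fetches every git submodule of the source repo (m * s)."""
    return modules * submodules


# Hypothetical counts, chosen only to show how the product grows.
before = total_fetches(modules=20, submodules=10)
after = total_fetches(modules=22, submodules=11)
print(before, after)
```

Adding two modules and one submodule in this made-up case raises the fetch count by about a fifth, which is why each addition to either side is felt in build times.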
@barthalion Can we run f-b --download-only followed by f-b --disable-updates on runners as the build step?
I know we talked about it, but I still fail to understand what exactly --download-only would solve here. We no longer have a sources worker, so the only "build command" that is executed is this:
command = ['flatpak-builder', '-v', '--force-clean', '--sandbox', '--delete-build-dirs',
           '--user', fb_deps_args,
           util.Property('extra_fb_args'),
           '--mirror-screenshots-url=https://dl.flathub.org/repo/screenshots', '--repo', 'repo',
           util.Interpolate('--extra-sources=%(prop:builddir)s/../downloads'),
           '--default-branch', util.Property('flathub_default_branch'),
           '--subject', util.Property('flathub_subject'),
           '--add-tag=upstream-maintained' if builds.is_upstream_maintained(id) else '--remove-tag=upstream-maintained',
           'builddir', util.Interpolate('%(prop:flathub_manifest)s')]
How is --download-only in a separate step going to help?
--download-only by itself isn't going to help; it's --disable-updates that makes the difference here. If the build is run with --disable-updates, flatpak-builder skips fetching git sources from remotes and just copies whatever is already cached.
But it's still going to take a significant amount of time to execute --download-only, isn't it?
Yeah, I just re-checked that and it seems so. So my proposal to run --download-only followed by --disable-updates probably doesn't make sense.
But still, maybe we could run builds with --disable-updates on Flathub? If enabling it for all builds is not an option, maybe it could be gated by some flathub.json option?
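A rough sketch of what such gating could look like on the build side. The "disable-updates" key is hypothetical and is not an existing flathub.json option. The helper name is also invented. Only the --disable-updates flag itself comes from flatpak-builder.

```python
import json


def extra_builder_args(flathub_json_text: str) -> list[str]:
    """Return extra flatpak-builder arguments requested by flathub.json.

    "disable-updates" is a hypothetical key used only for illustration.
    """
    config = json.loads(flathub_json_text) if flathub_json_text.strip() else {}
    args = []
    # Opt in only on an explicit true, so existing apps keep today's behaviour.
    if config.get("disable-updates") is True:
        args.append("--disable-updates")
    return args


print(extra_builder_args('{"disable-updates": true}'))
print(extra_builder_args("{}"))
```

The returned list could then be passed through something like the existing extra_fb_args property, assuming the sources are already present on the builder.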
Proton is a git repository with numerous git submodules, some of which are huge (namely wine and gstreamer). And we have many flatpak-builder modules (two for each Proton component).
flatpak-builder fetches the whole Proton repo with all git submodules for each module, which results in heavy I/O and incredibly long download/checkout times (in fact, on Flathub checkouts take even more time than actual compilation).
We should do something about it. The only solution I see is splitting the single git source into multiple archive sources. Does anyone have other ideas?
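As an illustration of the archive-source idea, here is a minimal sketch that generates per-component archive sources for a manifest. The helper name, the URL, and the checksum are placeholders and do not point at real tarballs. The "type", "url", "sha256" and "dest" keys are the usual flatpak-builder source fields.

```python
import json


def archive_sources(components):
    """Turn (dest, url, sha256) tuples into flatpak-builder archive sources."""
    return [
        {"type": "archive", "url": url, "sha256": sha256, "dest": dest}
        for dest, url, sha256 in components
    ]


# Placeholder values: a module that only builds wine would list only
# the wine tarball instead of the whole Proton repo with all submodules.
wine_only = archive_sources([
    ("wine", "https://example.invalid/wine-<commit>.tar.gz", "<sha256 of the tarball>"),
])
print(json.dumps(wine_only, indent=2))
```

With this layout each flatpak-builder module would download only the components it actually builds, at the cost of keeping a commit and checksum per component in sync with the Proton release.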