rpm-software-management / rpm

The RPM package manager
http://rpm.org

Drop support for external dependency generator #2373

Open pmatilai opened 1 year ago

pmatilai commented 1 year ago

The external dependency generator has been on life support for about 20 years now and we've been warning about it being deprecated for almost seven years now. Yet it's still being used by some, and continues to attract unwanted attention by simply being there.

I think it's time to finally pull the plug on that. The 4.19 plate is overflowing as it is, but 4.20 maybe.

Conan-Kudo commented 1 year ago

Last I checked, the only thing I know of still using this mechanism is the MinGW/Windows binary dependency generator. @rwmjones, did this get fixed sometime in the last few years?

pmatilai commented 1 year ago

IIRC mingw was one of the first ever to adopt the new-style internal generator, but there are a bunch of others, including but not limited to those using the filter macros in Fedora. And then there's the kernel ksyms stuff, only relevant for RHEL. And whatever other distros are doing in this space.

rwmjones commented 1 year ago

Can someone describe (for example) a simple regular expression to use on spec files to determine if they use this "external dependency generator"?

pmatilai commented 1 year ago

Look for _use_internal_dependency_generator (getting set to 0, that's the only thing it's ever used for) references in specs and macros. In Fedora, another certain indicator is %filter_setup which uses that internally.

Conan-Kudo commented 1 year ago

Also another hint is if you override %__find_requires, I think?

rwmjones commented 1 year ago

It seems like we removed it from OCaml packages around Fedora 20, and from libguestfs in Fedora 22. It is not used in any current package that I maintain.

pmatilai commented 1 year ago

Yeah, %__find_requires and %__find_provides overrides are telltale signs worth checking, but neither of those does anything at all by itself unless accompanied by _use_internal_dependency_generator set to 0. I've seen more than one case where these were simply left behind.
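The indicators named above can be turned into a rough scan of spec files. The following Python sketch is only an illustration: the regular expressions and the `classify_spec` helper are assumptions made for this example, not anything shipped with rpm, and they only cover `%define`/`%global` lines in specs (not macro files or `--define` on the command line).

```python
import re

# Indicators taken from the thread; the exact regexes are an assumption.
DISABLE_INTERNAL = re.compile(
    r"^\s*%(?:define|global)\s+_use_internal_dependency_generator\s+0\b",
    re.MULTILINE,
)
FILTER_SETUP = re.compile(r"^\s*%\{?\??filter_setup\}?", re.MULTILINE)
FIND_OVERRIDE = re.compile(
    r"^\s*%(?:define|global)\s+__find_(?:requires|provides)\b",
    re.MULTILINE,
)


def classify_spec(text):
    """Classify spec file contents (hypothetical helper).

    "external": the internal generator is switched off, or %filter_setup
        is used (which, per the thread, switches it off internally).
    "leftover-override": %__find_requires / %__find_provides is overridden
        but nothing disables the internal generator, so the override is inert.
    "internal": none of the indicators were found.
    """
    if DISABLE_INTERNAL.search(text) or FILTER_SETUP.search(text):
        return "external"
    if FIND_OVERRIDE.search(text):
        return "leftover-override"
    return "internal"
```

The "leftover-override" result corresponds to the forgotten overrides mentioned above: they are worth cleaning up but do not by themselves mean the package depends on the external generator.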

prarit commented 1 year ago

This is a significant issue for Fedora and ELN kernel-ark builds. We have ~8000 modules that are examined for requires and provides. On an atypically large system (-j100, lots of memory, disk, etc.) the build time is ~20 minutes using the external dependencies specified in the kernel.spec. Moving to the internal dependency generator results in an ~20 minute INCREASE in build time.

The increase in build time appears to be that the dependencies and provides are evaluated in serial rather than in parallel. Is that something that can be resolved prior to deprecating support for the external dependency generator?

jmflinuxtx commented 1 year ago

The increase in kernel build time is a major concern, particularly as more flavors are being added. This is not some trivial 2-3% build time increase, but 30-100% depending on the machine doing the build.

pmatilai commented 1 year ago

Dependency generation never ran in parallel. The difference is that the "new" dependency generator of the last 20 years collects and records dependencies on a per-file basis, by running any applicable generators once per file rather than all at once on a per-package basis. This tends to, uh, highlight the startup times of various interpreters.

Parallelizing that piece is a tough cookie and won't make the intrinsic slow-and-stupid go away. I've been looking at an opt-in mode to process multiple files at once (while still collecting the info per file), but this requires a (generally simple) change to generators to take advantage of it.
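To make the startup-cost point concrete, here is a back-of-the-envelope model in Python. The ~8000 modules and the ~20 extra minutes are the kernel numbers reported earlier in this thread; the 100 ms startup and 10 ms per-file work figures are made-up placeholders, not measurements, and the function names are hypothetical.

```python
def per_file_cost_ms(n_files, startup_ms, work_ms):
    """One generator process per file: startup cost is paid for every file."""
    return n_files * (startup_ms + work_ms)


def batched_cost_ms(n_files, startup_ms, work_ms):
    """One generator process for all files: startup cost is paid once."""
    return startup_ms + n_files * work_ms


# Numbers reported in the thread for the kernel build.
N_MODULES = 8000
EXTRA_MINUTES = 20

# Average extra time per module implied by those two numbers.
extra_ms_per_module = EXTRA_MINUTES * 60 * 1000 // N_MODULES

# Hypothetical placeholders, chosen only to show the shape of the problem.
STARTUP_MS = 100
WORK_MS = 10

if __name__ == "__main__":
    print("extra ms per module (from thread numbers):", extra_ms_per_module)
    print("per-file total ms:", per_file_cost_ms(N_MODULES, STARTUP_MS, WORK_MS))
    print("batched total ms:", batched_cost_ms(N_MODULES, STARTUP_MS, WORK_MS))
```

Under these assumed figures, nearly all of the per-file total is process startup rather than useful work, which is the overhead a multi-file mode would avoid without any parallelism.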

dvlasenk commented 1 year ago

The startup costs are an issue when you need to start 8000 dependency generators.

"Old" dependency generator starts only one external script per package. Then, the script can be written to be smart and start 8000 subtasks in parallel. It's not possible with the "new" generator.

I cooked up an ugly work-around:

First, kernel.spec %install section generates "provides" for ksyms and modaliases, and saves it in some temporary files.

Then, kernel-rpm-macros are modified to read these files if they exist, otherwise to run the usual dependency generator script as a child. Along the lines of:

--- a/provided_ksyms.attr
+++ b/provided_ksyms.attr
@@ -1,2 +1,45 @@
-%__provided_ksyms_provides   /usr/lib/rpm/redhat/find-provides.ksyms
 %__provided_ksyms_path        .*\.(ko|ko\.gz|ko\.bz2|ko\.xz|ko\.zst)$
+
+# Notes on Lua:
+# The backslash in strings (like "\n" newline) needs to be doubled
+# because we are inside rpm macro. Single backslashes before most chars
+# disappear (removed by rpm's parser), so "\n" turns into just "n".
+# In string.gsub patterns, unlike regexps, backslash has no special meaning.
+# It can't escape . and such. (Use one-character set [.] to represent
+# literal period, or lua's percent escape: %.)
+# Pipe (|) has no special meaning too.
+
+%__provided_ksyms_provides() %{lua:
+    function strip_compress_sfx(fn)
+        local cnt
+        fn, cnt = string.gsub(fn, "%.gz$", "")
+        if cnt == 1 then return fn; end
+        fn, cnt = string.gsub(fn, "%.bz2$", "")
+        if cnt == 1 then return fn; end
+        fn, cnt = string.gsub(fn, "%.xz$", "")
+        if cnt == 1 then return fn; end
+        fn, cnt = string.gsub(fn, "%.zst$", "")
+        return fn
+    end
+    local buildroot = rpm.expand("%buildroot")
+    local modname = rpm.expand("%1")
+    local nosuffix = strip_compress_sfx(modname)
+    local fn = buildroot..".provides"..nosuffix..".ksyms"
+    -- io.stderr:write("TESTING- ",fn,"\\n")
+    local f = io.open(fn)
+    if f then
+        -- io.stderr:write("Found!- ",fn,"\\n")
+        for l in f:lines() do
+            -- io.stderr:write("[KSYM:",l,"]\\n")
+            print(l.."\\n")
+        end
+        f:close()
+    else
+        -- there is no prepared result file: spawn external generator,
+        -- feed it the modname, and use its output
+        if not string.match(modname, "['%%]") then   -- ensuring there is no ' % to disrupt quoting / expansion
+            local r = rpm.expand("%(printf '%%s' '"..modname.."' | /usr/lib/rpm/redhat/find-provides.ksyms)")
+            print(r)
+        end
+    end
+}

This ... works, but are we ready to live with such atrocities? Look at what I have to do to pass module name on stdin of find-provides.ksyms...

prarit commented 1 year ago

@pmatilai @dvlasenk we certainly could do that and maintain it in the kernel. However, my concern is about the interface used to make this work. @pmatilai can you confirm that the interface will remain in place to do this? Otherwise, we'll have to figure something else out.

pmatilai commented 1 year ago

The old interface as it is now WILL go away, it's only a matter of when, not if.

Look, I know about the kernel case. As I said in the previous comment, I'm working on a variant that allows processing multiple files in one go while preserving the per-file relation that is increasingly important.

dvlasenk commented 1 year ago

The fix that would definitely solve this problem is for rpmbuild to run dependency generators on all CPUs. These days, an 8-thread build machine is the bare minimum people use. The hack we are working on basically implements this approach.
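The approach described here, one script that fans the per-module work out across CPUs while still keeping results keyed by file, can be sketched generically in Python. This is not the kernel's actual script: `provides_for` is a hypothetical stand-in for the real per-module work (which would typically shell out to an external tool), and the default worker count of 8 simply echoes the figure above.

```python
from concurrent.futures import ThreadPoolExecutor


def provides_for(path):
    """Hypothetical per-module work; a real script would inspect the module."""
    name = path.rsplit("/", 1)[-1]
    return ["example(" + name + ")"]


def generate_all(paths, max_workers=8):
    """Run provides_for on every path concurrently.

    Returns a dict mapping each path to its list of provides, so the
    per-file association is preserved even though work runs in parallel.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input paths.
        return dict(zip(paths, pool.map(provides_for, paths)))
```

Threads are sufficient in this sketch on the assumption that the real work is dominated by waiting on child processes; CPU-bound work in pure Python would call for a process pool instead.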

pmatilai commented 1 year ago

That parallelization argument is getting really old you know. You've repeated it many, many times and I've repeatedly told you it's not that easy, and it would not make it any less horribly inefficient, even if it managed to consume all 256 cpus at once.

I just submitted the multifile-dependency generation mode I hinted at in an earlier comment here: #2537

rwmacleod commented 3 months ago

FYI, Yocto/OE Linux sets the _use_internal_dependency_generator option to 0 in:

https://git.yoctoproject.org/poky/commit/?id=84f7f70308eed0ac96abeb5a762e9b7765e5db91

Tracked in YP: https://bugzilla.yoctoproject.org/show_bug.cgi?id=15521