freebsd / poudriere

Port/Package build and test system
https://github.com/freebsd/poudriere/wiki
BSD 2-Clause "Simplified" License
379 stars 161 forks source link

Try to fix `poudriere options` when specifying ports with flavor #1087

Closed madpilot78 closed 8 months ago

madpilot78 commented 8 months ago

As reported in bug #1083 poudriere options is giving errors if it is passed ports specifying a flavor.

The options.sh script fails to populate the P_PORTS_FEATURES variable.

I have created this patch that starts a full jail so the port_var_fetch function can work and populate it.

This function is called via the listed_ports function call.

I'm not sure this is the correct approach, but is working fine here.

vanvoljg commented 8 months ago

This works to fix the issue. Thank you!

It does, however, have trouble if you happen to ^c during options dialog config. Doesn't clean up the jails...

madpilot78 commented 8 months ago

It does, however, have trouble if you happen to ^c during options dialog config. Doesn't clean up the jails...

Oh, I see, did not think about this. I will need a little time to figure this one out.

madpilot78 commented 8 months ago

I added a call to jail_stop to the CLEANUP_HOOK and this does fix the ^C issue, although, again, I'm not sure this is the correct way to achieve it. I see no other script doing this.

I'm open to suggestions for improvements though.

vanvoljg commented 8 months ago

Oh, that looks like a good idea to me. I had wondered how other scripts accomplished cleanup tasks when interrupted, so this is great.

It might be a decent idea to comment why we need to start a jail, because, as you say, it's not a pattern that seems to exist elsewhere. It's not obvious that listed_ports ends up calling a function for each port which checks if the ports tree has a feature, nor that, when not run in a jail, the check fails to produce the correct results because of another (also difficult to find) variable that's not correctly populated for reasons I don't understand.

Maybe something along the lines of this?

# The call, "listed_ports show_moved", checks for the existence of necessary
# ports features ("have_ports_feature FLAVORS"), which fails when not run from
# within a jail where global port vars have been fetched.
# Thus, we must set up and run a jail to set up the required environment.

I'm not sure if this is correct or helpful, but this is what sense I've been able to make of what's going on. I'm hoping it will at least be good enough for now.

madpilot78 commented 8 months ago

@vanvoljg I agree adding some comments would be useful, I'll add a commit soon with short comments explaining my changes.

BTW how a shell script can run code to cleanup when interrupted itself is easily explained: the use the trap shell builtin. But, being poudriere a complex script it has its own architecture of trap functions also extensible via the CLEANUP_HOOK variable, which I am not sure I understand the inner workings of.

madpilot78 commented 8 months ago

I'm starting to make sense of it. In theory the exit_handler, bound to run on exit via trap, calls jail_cleanup here. But before doing that, it checks the STATUS variable.

Such variable should be set in jail_start here, depending on the SET_STATUS_ON_START variable (which I don't think I'm setting, so the default of 1 should be used).

It looks like this is not happening, which would explain why the jail is not correctly being cleaned up.

SO correct solution would be to make this already existing mechanism work as expected, and not adding things to options_cleanup. I need to understand why STATUS is not being set properly (if that is really the issue)

vanvoljg commented 8 months ago

This looks like it's a pretty good PR to me. I really appreciate the comment, thank you very much! :slightly_smiling_face:

I wonder if the variable problem might end up being another PR. I'm just wondering about whether that would be needed, unless the effects of not setting STATUS are isolated to this specific issue (#1083). Anecdotally (and without sources at hand, unfortunately, so I'm being kind of bad, here) I've seen reports "around" that jails sometimes don't get cleaned up, but I don't recall seeing any identified root cause at the time I read those reports. Would this variable relate to that, if it's still a thing?

madpilot78 commented 8 months ago

@vanvoljg

My analysis in comment https://github.com/freebsd/poudriere/issues/1087#issuecomment-1807275749 was incomplete. STATUS is correctly set. But, in jail_cleanup, at:

https://github.com/freebsd/poudriere/blob/234f8d15a4ca61b71e2a55364cca695906fe58b0/src/share/poudriere/common.sh#L3488

The code also checks this was_a_jail_run function output (find it at the start of common.sh). So I fixed that and everything now works as expected.

Being poudriere a shell script i don't think it is possible to clear really all cases. A ^C hit at the wrong time could end up interrupting the script in the wrong moment while things are being setup (non atomically) for automatic cleanup.

Major refactoring of this code would be a very complex task that could cause more issues than it resolves, and I'm not an expert of poudriere code structure, so I would not feel confident taking on such a task. I am more in favour of an incremental strategy to improving it, at least for now.

Also, if ^C is hit two times in a row, the celanup code will be interrupted.

vanvoljg commented 8 months ago

Sweet, that's really excellent info, thanks! Now I know more about both poudriere and scripts; cheers!

bdrewery commented 7 months ago

I haven't looked at this in detail but had this commit pending to push. I think having a jail running now is better anyway. Note I did add fetch_global_ports_vars in a few places that still may be needed. There is a comment justifying the jail for this purpose but a jail is not needed for this purpose. It makes sense from a sandbox point of view though.

commit aac72a85ca538bd790b7294424b219c3e5490b8e
Author: Bryan Drewery <bryan@shatow.net>
Date:   Wed Nov 30 20:13:06 2022 -0800

    Fix some fallout from removing DEPENDS_ARGS.

    The have_ports_feature() was broken before and did not properly
    have FLAVORS set where listed_ports() was used.

diff --git src/share/poudriere/distclean.sh src/share/poudriere/distclean.sh
index f8ae54fb1..841df6879 100755
--- src/share/poudriere/distclean.sh
+++ src/share/poudriere/distclean.sh
@@ -154,6 +154,8 @@ for PTNAME in ${PTNAMES}; do
                export PACKAGE_BUILDING=yes
                echo "PACKAGE_BUILDING_FLAVORS=yes"
        fi >> "${__MAKE_CONF}"
+       unset P_PORTS_FEATURES
+       fetch_global_port_vars

        MASTERMNT= MASTERMNTREL= load_moved
        msg "Gathering all expected distfiles for ports tree '${PTNAME}'"
diff --git src/share/poudriere/foreachport.sh src/share/poudriere/foreachport.sh
index 993f3fec6..e1e831d9e 100755
--- src/share/poudriere/foreachport.sh
+++ src/share/poudriere/foreachport.sh
@@ -149,6 +149,7 @@ run_hook foreachport start
 exec >&3

 export PORTSDIR
+fetch_global_port_vars
 clear_dep_fatal_error
 parallel_start
 for originspec in $(listed_ports show_moved); do
diff --git src/share/poudriere/options.sh src/share/poudriere/options.sh
index 745c20d80..9c933937d 100755
--- src/share/poudriere/options.sh
+++ src/share/poudriere/options.sh
@@ -51,6 +51,9 @@ Options:
 EOF
        exit ${EX_USAGE}
 }
+injail() {
+       "$@"
+}

 ARCH=
 PTNAME=default
@@ -186,6 +189,7 @@ options_cleanup() {
        rm -f ${__MAKE_CONF}
 }
 setup_makeconf ${__MAKE_CONF} "${JAILNAME}" "${PTNAME}" "${SETNAME}"
+fetch_global_port_vars

 export TERM=${SAVED_TERM}
 for originspec in $(listed_ports show_moved); do