rhkdump / kdump-utils

Kernel crash dump collection utilities
GNU General Public License v2.0
3 stars 8 forks source link

Remove default debug option #12

Closed prudo1 closed 2 weeks ago

prudo1 commented 1 month ago

This series removes the -d option set per default to kexec followed by some small cleanup patches. Please see the patch description for details

baoquan-he commented 1 month ago

Hi Philipp, If people use "-c" I think a warning about not supported should be enough instead of overwriting with "-s" so that we have a chance to use "-c" for testing purpose in case there are some upstream requests. If we overwriting with "-s" then it is hard to enable it anymore. It can be up to user to remove the "-c". See what do you and Baoquan think

I agree with Dave. And I am not sure if it's OK to hardcode the '-s' in kdumpctl. With this, user may not have chance to try '-c', unless they need and know how to edit the code. Imagine one day, we encounter a bug on kexec_file_load and have no clue at all. We may want to try kexec_load to compare.

prudo1 commented 1 month ago

There's a typo in log:s/kdump_load/kexec_load/,

Thanks... fixed it locally

prudo1 commented 1 month ago

Hi Philipp, If people use "-c" I think a warning about not supported should be enough instead of overwriting with "-s" so that we have a chance to use "-c" for testing purpose in case there are some upstream requests. If we overwriting with "-s" then it is hard to enable it anymore. It can be up to user to remove the "-c". See what do you and Baoquan think

I agree with Dave. And I am not sure if it's OK to hardcode the '-s' in kdumpctl. With this, user may not have chance to try '-c', unless they need and know how to edit the code. Imagine one day, we encounter a bug on kexec_file_load and have no clue at all. We may want to try kexec_load to compare.

Hi Dave, Hi Baoquan,

thanks for taking a look. Please note that I don't intend to backport this commit to RHEL9 where -c is still an option. But for Fedora and RHEL10 I believe this commit makes sense. There the kernel is/will be compiled without CONFIG_KEXEC. So option -c won't work and kexec-tools will return with an error. In that case printing a warning plus automatically falling back to -s is the better option in my opinion.

I'll take a look on how the two commits concerning kexec_file_load can be split up better. So backporting the warning can be done easier without setting -s per default.

daveyoung commented 1 month ago

Hi Philipp, If people use "-c" I think a warning about not supported should be enough instead of overwriting with "-s" so that we have a chance to use "-c" for testing purpose in case there are some upstream requests. If we overwriting with "-s" then it is hard to enable it anymore. It can be up to user to remove the "-c". See what do you and Baoquan think

I agree with Dave. And I am not sure if it's OK to hardcode the '-s' in kdumpctl. With this, user may not have chance to try '-c', unless they need and know how to edit the code. Imagine one day, we encounter a bug on kexec_file_load and have no clue at all. We may want to try kexec_load to compare.

Hi Dave, Hi Baoquan,

thanks for taking a look. Please note that I don't intend to backport this commit to RHEL9 where -c is still an option. But for Fedora and RHEL10 I believe this commit makes sense. There the kernel is/will be compiled without CONFIG_KEXEC. So option -c won't work and kexec-tools will return with an error. In that case printing a warning plus automatically falling back to -s is the better option in my opinion.

I'll take a look on how the two commits concerning kexec_file_load can be split up better. So backporting the warning can be done easier without setting -s per default.

Hi Philipp, hmm. I think the removing of "-d" is an independent issue, can you post it separately?

daveyoung commented 1 month ago

Hi Philipp, If people use "-c" I think a warning about not supported should be enough instead of overwriting with "-s" so that we have a chance to use "-c" for testing purpose in case there are some upstream requests. If we overwriting with "-s" then it is hard to enable it anymore. It can be up to user to remove the "-c". See what do you and Baoquan think

I agree with Dave. And I am not sure if it's OK to hardcode the '-s' in kdumpctl. With this, user may not have chance to try '-c', unless they need and know how to edit the code. Imagine one day, we encounter a bug on kexec_file_load and have no clue at all. We may want to try kexec_load to compare.

Hi Dave, Hi Baoquan, thanks for taking a look. Please note that I don't intend to backport this commit to RHEL9 where -c is still an option. But for Fedora and RHEL10 I believe this commit makes sense. There the kernel is/will be compiled without CONFIG_KEXEC. So option -c won't work and kexec-tools will return with an error. In that case printing a warning plus automatically falling back to -s is the better option in my opinion. I'll take a look on how the two commits concerning kexec_file_load can be split up better. So backporting the warning can be done easier without setting -s per default.

Hi Philipp, hmm. I think the removing of "-d" is an independent issue, can you post it separately?

Hi Philipp, maybe I still do not understand why the "-c/-s" auto fallback is needed. If people use "-c" kexec load will finally fail and we can print error about this is not supported, it should be enough, I can not convince myself to auto fallback to -s as if so we will be hard to test kexec -c with upstream customized kernel, so the code is not harmful at least.. Frankly ditto about the i386 part, even if it is not used, but not harmful, but maybe drop ix86 is not so bad. Anyway it can be another pr if needed.

prudo1 commented 3 weeks ago

Hi Dave, Baoquan and I discussed this in our last team event. My rational was that Baoquan unintentionally disabled kexec_load for Fedora as well when he disabled it for RHEL. So using the -c option on a current Rawhide kernel leads to the following error message

# kdumpctl start
kexec_load failed: Function not implemented
entry       = 0xb2f54760 flags = 0x3e0001
nr_segments = 7
segment[0].buf   = 0x55d8cef2e9c0
segment[0].bufsz = 0x30
segment[0].mem   = 0xa3000000
segment[0].memsz = 0x1000
segment[1].buf   = 0x7f400f600010
segment[1].bufsz = 0x1c98600
segment[1].mem   = 0xac367000
segment[1].memsz = 0x1c99000
segment[2].buf   = 0x7f4011405010
segment[2].bufsz = 0xf3c968
segment[2].mem   = 0xae000000
segment[2].memsz = 0x44b9000
segment[3].buf   = 0x55d8cef4e4b0
segment[3].bufsz = 0x51a9
segment[3].mem   = 0xb2f4e000
segment[3].memsz = 0x6000
segment[4].buf   = 0x55d8cef473c0
segment[4].bufsz = 0x70e0
segment[4].mem   = 0xb2f54000
segment[4].memsz = 0x9000
segment[5].buf   = 0x55d8cef45ab0
segment[5].bufsz = 0x400
segment[5].mem   = 0xb2f5d000
segment[5].memsz = 0x4000
segment[6].buf   = 0x7f401244d010
segment[6].bufsz = 0x9e800
segment[6].mem   = 0xb2f61000
segment[6].memsz = 0x9f000
kdump: kexec: failed to load kdump kernel
kdump: Starting kdump: [FAILED]

which I find rather nasty. That's why I thought it would be better to fallback to -s. So if a user has option -c set on KEXEC_ARGS and upgrades to a new kernel it only prints an error message but still works.

However, Baoquan and I decided it's better to re-enable kexec_load for Fedora so we have a fallback mechanism in case there is a bug in kexec_file_load upstream without the need to recompile the kernel. I'll prepare a MR for kernel-ark for that and will rework this series.

And I'll also split out the commits that are not related to removing the default -d option to my other series that already contains various fixes for bugs found in the last test cycle.

prudo1 commented 3 weeks ago

BTW, I've created an MR to re-enable kexec_load on Fedora

prudo1 commented 3 weeks ago

Dropped all but the first commit and moved the commit that rewrites load_kdump to PR #18

baoquan-he commented 3 weeks ago

This MR looks good to me, ACK.

daveyoung commented 3 weeks ago

Hi Baoquan, ACK if not visible in the pr interface, there is an "approve" action so it can be easier visible to the maintainer. But anyway a reminder, @Coiby Xu @.***> which way do you prefer?

On Mon, 1 Jul 2024 at 11:32, Baoquan He @.***> wrote:

This MR looks good to me, ACK.

— Reply to this email directly, view it on GitHub https://github.com/rhkdump/kdump-utils/pull/12#issuecomment-2199162542, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABOAKTJQCALVN3S4PIRIZRLZKDETDAVCNFSM6AAAAABJKMNEM2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOJZGE3DENJUGI . You are receiving this because you commented.Message ID: @.***>

baoquan-he commented 3 weeks ago

Hi Baoquan, ACK if not visible in the pr interface, there is an "approve" action so it can be easier visible to the maintainer. But anyway a reminder, @coiby Xu @.***> which way do you prefer?

I tried to approve, while I couldn't find button to do.

coiby commented 2 weeks ago

Hi @daveyoung,

Yes, an approve action makes it easier for me to tell which PR has been reviewed. So yes, I prefer an approve action. In the future, I may also experiment with merging PR automatically when for example two people have approved the PR.