dakanji / RefindPlus

A Boot Manager for Mac and PC
GNU General Public License v3.0

RefindPlus Hangs on Manual Linux Stanzas #163

Closed billabongbruno closed 8 months ago

billabongbruno commented 1 year ago

RefindPlus Version

v0.14.0.AA Release

Device Type

UEFI PC

Problem Description

The 0.14.0.AA Release of RefindPlus fails to load any entries if manual stanzas are added to config.conf.

The issue occurs with and without the use of a personal theme.

The previous version did not suffer from this issue, neither did a build I compiled roughly 3 or 4 weeks ago, from scratch.

Attaching log file below.

Problem Point

RefindPlus fails to load

Affected Items

All OS Loaders (First Row Items)

Debug Log

23k25y1306.log

Additional Context

No response

dakanji commented 1 year ago

Seems to be hanging.

Can you share the config file in use as well as a log running the previous version with the same config file?

billabongbruno commented 1 year ago

Greetings.

As per your request, I am attaching both the log and the config files of the previous version (0.13.0.AD).

FYI, procedures I tried with the new version (0.14.0.AA):

1 - Manually editing the config file to fit my preferences - Hangs;
2 - Tried a clean, default config, with no manual stanzas - Works perfectly;
3 - Tried to simply add the manual stanzas to the clean config - Hangs;
4 - Tried using the "old" config file (used in 0.13.0.AD) - Hangs.

Working_0.13.0.AD.zip

dakanji commented 1 year ago

Will take a look later. Do you mind sharing a log from 0.14.0.AA at Log Level 1 (With the stanzas) just in case?

Thanks

billabongbruno commented 1 year ago

Will take a look later. Do you mind sharing a log from 0.14.0.AA at Log Level 1 (With the stanzas) just in case?

Thanks

I am attaching two log files - one with log level set to 1 and one with log level set to default levels, by keeping it commented out.

As per your instructions, I set the log level to 1 and to my surprise, RefindPlus managed to recognize all stanzas (including manual ones) by simply uncommenting the log_level setting, which strikes me as odd.

log_level_1_23k26x0200.log log_level_commented_out_23k26x0407.log

dakanji commented 1 year ago

Do you mean to say the issue is only present when log_level is active and set to 0 but is not present when commented out or set to 1?

billabongbruno commented 1 year ago

Do you mean to say the issue is only present when log_level is active and set to 0 but is not present when commented out or set to 1?

Assuming that log_level is set to 0 when commented (if that is the default value), then yes.

The different settings of log_level I tried were:
1 - The default one, where it's commented (failed to load RefindPlus correctly with manual stanzas)
2 - Uncommented it and just kept it at log_level=1 (As per your request) (RefindPlus worked as expected)

The issue only exists if the log_level is commented out (has the # behind it). If I uncomment it (remove the # and set the log_level to 1), RefindPlus finds all stanzas and loads (at least) an Ubuntu kernel directly.

dakanji commented 1 year ago

The issue only exists if the log_level is commented out (has the # behind it).

Questions:

  1. How about when not commented but is set to 0?
  2. How about the REL file?

billabongbruno commented 1 year ago

1 - I have not yet tried setting the log_level to 0, I can give that a try right now.
2 - The REL file never worked for me, which is why I tried the DBG version. Does the REL work with log_level as well? I assumed it didn't. If it does, I can give it a go too.

I edited my previous comment with additional info, I don't know if you saw it.

dakanji commented 1 year ago

Please try setting to 0 explicitly and also try 0 & 1 in the REL file

billabongbruno commented 1 year ago

Please try setting to 0 explicitly and also try 0 & 1 in the REL file

As per your request, I tried the following variants:

CASE 1: DEBUG version, log_level uncommented and set to 0 - Produced a log file (expected), failed to work

CASE 2: DEBUG version, log_level uncommented and set to 1 - Produced a log file (expected), worked

CASE 3: RELEASE version, log_level uncommented and set to 0 - Did not produce a log file (expected), failed to work

CASE 4: RELEASE version, log_level uncommented and set to 1 - Did not produce a log file (expected), failed to work

I am attaching the only two log files that were produced in the process, with the DEBUG version (to the best of my knowledge only DBG version produces log files).

DBG_Uncommented_log_level_0-NOT-WORKING.log DBG_Uncommented_log_level_1-WORKING.log

dakanji commented 1 year ago

Please add disabled to all the manual stanzas after the menuentry line (as in the examples) then run with log_level 0 just to reconfirm it works.

Afterwards, start removing the disabled from the last stanza upwards one at a time and let me know how far you get before it fails. Obviously exclude any example ones you have in the config.
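
The elimination procedure described above can be sketched in a few lines of Python. This is only an illustration of the logic: `boots_ok` is a hypothetical stand-in for the manual step of rebooting and observing whether RefindPlus loads, the stanza order is assumed, and the simulated result is invented for the example.

```python
from typing import Callable, Optional, Sequence


def first_breaking_stanza(
    stanzas: Sequence[str],
    boots_ok: Callable[[frozenset], bool],
) -> Optional[str]:
    """Re-enable stanzas from the last one upwards, one at a time.

    Returns the first stanza whose activation makes the boot fail,
    or None if every stanza can be enabled without a failure.
    """
    enabled: set = set()
    # Baseline: with every stanza disabled the boot must succeed,
    # otherwise the stanzas are not what triggers the failure.
    if not boots_ok(frozenset(enabled)):
        raise RuntimeError("fails even with every stanza disabled")
    for name in reversed(stanzas):
        enabled.add(name)
        if not boots_ok(frozenset(enabled)):
            return name
    return None


# Hypothetical stand-in for "reboot and see whether the menu loads".
def simulated_boot(enabled: frozenset) -> bool:
    return not (enabled & {"Ubuntu Kernel", "Kali Kernel"})


# Stanza names as used in this thread; their order in config.conf is assumed.
stanzas = ["Grub2Win EFI", "OpenCore EFI", "Windows EFI", "Ubuntu Kernel", "Kali Kernel"]
culprit = first_breaking_stanza(stanzas, simulated_boot)
print(culprit)
```

In practice each call to `boots_ok` corresponds to one edit of config.conf followed by one reboot, so the loop is carried out by hand rather than by a script.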

billabongbruno commented 1 year ago

Please add disabled to all the manual stanzas after the menuentry line (as in the examples) then run with log_level 0 just to reconfirm it works.

Afterwards, start removing the disabled from the last stanza upwards one at a time and let me know how far you get before it fails. Obviously exclude any example ones you have in the config.

As per your request, I tried the following variants:

CASE 1: DEBUG version, log_level uncommented and set to 0, disabled ALL stanzas - Produced a log file (expected), WORKS

CASE 2: DEBUG version, log_level uncommented and set to 0, enabled ONLY Grub2Win EFI - Produced a log file (expected), WORKS

CASE 3: DEBUG version, log_level uncommented and set to 0, enabled Grub2Win EFI && OpenCore EFI - Produced a log file (expected), WORKS

CASE 4: DEBUG version, log_level uncommented and set to 0, enabled Grub2Win EFI && OpenCore EFI && Windows EFI - Produced a log file (expected), WORKS

CASE 5: DEBUG version, log_level uncommented and set to 0, enabled Grub2Win EFI && OpenCore EFI && Windows EFI && Ubuntu Kernel - Produced a log file (expected), NOT WORKING

CASE 6: DEBUG version, log_level uncommented and set to 0, enabled ALL stanzas - Produced a log file (expected), NOT WORKING

CASE 7: DEBUG version, log_level uncommented and set to 0, enabled Grub2Win EFI && OpenCore EFI && Windows EFI && Kali Kernel - Produced a log file (expected), NOT WORKING

CASE 8: Same as case 1, but RELEASE VERSION - No log file produced (expected)

CASE 9: Same as case 2, but RELEASE VERSION - No log file produced (expected)

CASE 10: Same as case 3, but RELEASE VERSION - No log file produced (expected)

CASE 11: Same as case 4, but RELEASE VERSION - No log file produced (expected)

CASE 12: Same as case 5, but RELEASE VERSION - No log file produced (expected)

CASE 13: Same as case 6, but RELEASE VERSION - No log file produced (expected)

CASE 14: Same as case 7, but RELEASE VERSION - No log file produced (expected)

I am attaching the log files that were produced in the process, with the DEBUG version.

01-DBG_Uncommented_log_level_0_All_Stanzas_Disabled-WORKING.log 02-DBG_Uncommented_log_level_0_Enabled_GRUB2WIN_Only-WORKING.log 03-DBG_Uncommented_log_level_0_Enabled_GRUB2WIN_and_OpenCore-WORKING.log 04-DBG_Uncommented_log_level_0_Enabled_GRUB2WIN_and_OpenCore_and_Windows-WORKING.log 05-DBG_Uncommented_log_level_0_Enabled_GRUB2WIN_and_OpenCore_and_Windows_and_Ubuntu-NOT_WORKING.log 06-DBG_Uncommented_log_level_0_All_Stanzas_Enabled-NOT_WORKING.log 07-DBG_Uncommented_log_level_0_Enabled_GRUB2WIN_and_OpenCore_and_Windows_and_Kali-NOT_WORKING.log

dakanji commented 1 year ago

Thanks. Will pick this up tomorrow.

billabongbruno commented 1 year ago

Thanks. Will pick this up tomorrow.

Thank you for your time.

dakanji commented 1 year ago

Please try X306 with the stanzas active: X306-BOOTx64.zip

billabongbruno commented 1 year ago

Please try X306 with the stanzas active: X306-BOOTx64.zip

Greetings.

I was unable to test on the same machine, seeing as I am at work and did not bring the laptop with me. However, I have another UEFI PC (Surface Pro 4) which had the exact same configuration (apart from the UUID and PARTUUID, of course). Results were the same as the previous iteration and are listed below:

CASE 1: X306 version, log_level commented, all stanzas ENABLED - Produced a log file, failed to work.

CASE 2: X306 version, log_level commented, Linux stanzas DISABLED - Produced a log file, worked.

CASE 3: X306 version, log_level uncommented and set to 1, all stanzas ENABLED - Produced a log file, failed to work.

CASE 4: X306 version, log_level uncommented and set to 1, Linux stanzas DISABLED - Produced a log file, worked.

Attaching the log files below:

01-X306_Commented_log_level_All_Stanzas_Enabled-NOT_WORKING.log 02-X306_Commented_log_level_Linux_Stanzas_Disabled-WORKING.log 03-X306_Uncommented_log_level_1_All_Stanzas_Enabled-NOT_WORKING.log 04-X306_Uncommented_log_level_1_Linux_Stanzas_Disabled-WORKING.log

dakanji commented 1 year ago

Thanks.

Try changing the Volume token in the stanzas from GUID to the volume names. That is, Kali and Ubuntu

billabongbruno commented 1 year ago

Thanks.

Try changing the Volume token in the stanzas from GUID to the volume names. That is, Kali and Ubuntu

No change, still not working.

Attaching both the log file and the config.conf as a .txt file (simply remove the .txt extension) so that upload would be possible without compressing it into a zip file.

config.conf.txt 23k27k1310.log

dakanji commented 1 year ago

Please try X307 with the GUIDs restored: X307-BOOTx64.zip

BTW, you only need to test your last CASE 1.

Thanks

billabongbruno commented 1 year ago

Please try X307 with the GUIDs restored: X307-BOOTx64.zip

BTW, you only need to test your last CASE 1.

Thanks

Thank you.

Same outcome, unfortunately.

Attaching log file below.

23k27m2112.log

Regarding the amount of testing, I understand that and I was only trying to provide as much information as possible, with different use cases. I will test with the CASE 1 settings only, from now on, that being: Commented log_level, All Stanzas Enabled, GUID under Volume entry.

dakanji commented 1 year ago

Thanks.

On the testing, just wanted to spare you the time as the baseline on those cases is now clear.

Are you using QEMU or similar?

billabongbruno commented 1 year ago

Thanks.

On the testing, just wanted to spare you the time as the baseline on those cases is now clear.

Are you using QEMU or similar?

Thank you. I understood that. I just wasn't clear if the baseline was set or not. Thankfully it is.

No, I haven't touched QEMU at all.

I just run RefindPlus as is and have clean installs of all OS'es present. Kali and Ubuntu are the official distros from their respective websites.

No type of emulation is being used.

On a side note: The only two OS'es that are being chainloaded are macOS and ChromeOS (macOS with OpenCore and ChromeOS with Grub2Win), with both timeouts set to 0 and pickers hidden, so that I have a smooth, seamless boot-up of all OS, which is why I load the Linux Kernels directly, instead of chainloading their respective GRUB.

dakanji commented 1 year ago

No, I haven't touched QEMU at all ... No type of emulation is being used.

Thanks. Reason I asked is that the only other time I have come across this was with QEMU in use:

                     - Locate Console Control
                       * Seek on ConsoleOut Handle ... Not Found
                       * Seek on GPU Handle Buffer ... Not Found
                     - Assess Console Control ... NOT OK!!

The person reporting essentially had the same issue: https://sf.net/p/refind/discussion/general/thread/4dfcdfdd16/?limit=25#0f24.

Basically, the log_level and manual stanza angles are red herrings and the real issue is a memory conflict somewhere that is hit due to different memory use profiles when one log level is set but not when the other is set with certain tokens active in manual stanzas. In that other person's case, it was with the banner for the memory use profile in the then current versions of rEFInd and RefindPlus.

I added a workaround for the banner at that time but with both banner and log_level, the real issue is still there sitting somewhere deep in the code (all the way from rEFInd) and can be hit at any time based on the current memory use profile since it has not actually been found. Only apparent clue so far is that Console Control is not found.

After a few more commits, the memory use profile might shift and the manifestation might go away, move to some obscure setting, or become worse. Worse is actually better as it improves the chances of finding the conflict but, unfortunately, I don't have a solution at this time.

Will leave open for now and try to think of potential workarounds but real solution is to find and fix the conflict. Could be that fixing Console Control when this is absent is where the focus should be.

You can try the default banner instead of a custom one as a wild test.

billabongbruno commented 1 year ago

I see. Thank you for taking the time to explain all of this.

How would I go about changing the settings so that the banner is the default one, seeing as I have no recollection of having changed it at all? (I am aware that the latest tests were done using the custom theme, is that what you are referring to?) If that's the case, I can try not loading the theme and see what happens, even though it didn't work with the 0.14 RELEASE version (nor the DEBUG, apart from when having log_level set to 1).

dakanji commented 1 year ago

Yes, just disable the custom theme. I don't expect it will make a difference but just to tick that off.

billabongbruno commented 1 year ago

Yes, just disable the custom theme. I don't expect it will make a difference but just to tick that off.

Sadly, it didn't make a difference. Same outcome, didn't work.

Attaching log file below.

23k27n1726.log

I should, however, mention that upon following the link you provided, I took a quick look and saw that the "hideui banner" option WAS a thing in rEFInd (not sure if RefindPlus ever had it or if the 0.14 version supports it).

Nevertheless, I tried adding the argument to the proper section in the config.conf file, only to still be presented with the banner - meaning the option had no effect on hiding the banner. I am unsure if this is of any help, but I figured I would just mention it. (It was another test and I deleted the log, but I can replicate it, should you wish it).

dakanji commented 1 year ago

I figured I would just mention it

Thanks. hideui banner (or any other rEFInd flag) should work in RefindPlus but let's leave that for now and revisit later.

Try X308 for the main issue: X308-BOOTx64.zip

Just has some items on Console Control

billabongbruno commented 1 year ago

I figured I would just mention it

Thanks. hideui banner (or any other rEFInd flag) should work in RefindPlus but let's leave that for now and revisit later.

Try X308 for the main issue: X308-BOOTx64.zip

Just has some items on Console Control

Oh, I wasn't trying to imply there was another issue. Sorry if it seemed that way.

My approach was in the sense that it could be helpful in sorting the main issue.

Will try X308 now and post results in a couple of minutes.

billabongbruno commented 1 year ago

I figured I would just mention it

Thanks. hideui banner (or any other rEFInd flag) should work in RefindPlus but let's leave that for now and revisit later.

Try X308 for the main issue: X308-BOOTx64.zip

Just has some items on Console Control

X308 is also non-functional. Same outcome.

EDIT: I took care not to include my personal theme in config.conf, so the banner being used is the default one.

23k27p3433.log

dakanji commented 1 year ago

Ok. Thanks.

dakanji commented 1 year ago

Please try X309: X309-BOOTx64.zip

Ignore the custom/default banner thing and just use your preferred banner.

Fixed File ... Was X308 Before

billabongbruno commented 1 year ago

Please try X309: X309-BOOTx64.zip

Ignore the custom/default banner thing and just use your preferred banner.

Fixed File ... Was X308 Before

Sorry for the late reply.

X309 is also non-functional.

23k27v5332.log

dakanji commented 1 year ago

Thanks.

Try rerunning that and activate textonly as a last roll of the dice.

billabongbruno commented 1 year ago

Thanks.

Try rerunning that and activate textonly as a last roll of the dice.

Sadly, it did not work either.

23k27v0945.log

dakanji commented 1 year ago

Unfortunately, I'm out of ideas at this point.

Will flag this as a known issue and be on the lookout for potential fixes. You might want to stick to the older working version in the interim.

Thanks again.

dakanji commented 1 year ago

You might want to stick to the older working version in the interim.

Just struck me that as it doesn't seem you are doing anything special with the manual stanzas, you should also be able to just let RefindPlus scan for the volumes as internal and remove them from stanzas. This should allow running the current version.

billabongbruno commented 1 year ago

Unfortunately, I'm out of ideas at this point.

Will flag this as a known issue and be on the lookout for potential fixes. You might want to stick to the older working version in the interim.

Thanks again.

No worries, I'll stick to the old version. Thank you for your time.

billabongbruno commented 1 year ago

You might want to stick to the older working version in the interim.

Just struck me that as it doesn't seem you are doing anything special with the manual stanzas, you should also be able to just let RefindPlus scan for the volumes as internal and remove them from stanzas. This should allow running the current version.

Oh, using the manual stanzas was by design. I want them in that specific order and I want no other stanza to appear (even though I am aware that I can hide stanzas).

I could follow your advice, but if I'm not mistaken, with "internal", the order is defined by the order in which RefindPlus finds the loaders, which is something I'd prefer to have control over, hence the manual stanzas.

I also like to get rid of all Linux Kernel related warnings, such as SGX and NVidia-related stuff, which would not happen if I used "internal" instead (the Surface Pro does not have any NVidia dGPU, but it does have SGX and it's not accessible via any configuration, be it UEFI BIOS or software controlled under any OS. Therefore I can't enable SGX to disable the warning, hence the argument in the appropriate Linux kernels).

It's more of a personal preference, really.

Also, I believe I'd have to adjust the search depth to 2 or more because of Grub2Win, which would result in a lot more stanzas showing up (even though I would be able to hide them, yes).

I have no issue in remaining in the previous version, as it already seemed flawless to me. I just tried to update because, you know, latest and greatest. But no worries.

Again, thank you for your work, time and patience.

dakanji commented 1 year ago

I'd prefer to have control

You could get quite close with the attached config: config.conf.txt

Edits to the stanzas along with dont_scan_volumes and scanfor. You can use the dont_scan_XYZ tokens to exclude other stuff you don't want to appear from the internal scan and only let it handle those two.

EDIT: Should have been dont_scan_files shim.efi,MokManager.efi,PreLoader.efi,boot.efi,BOOTx64.efi,OpenCore.efi,bootmgfw.efi,gnugrub.kernel64.efi
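
As a rough illustration of the intended effect of that exclusion list, the Python sketch below filters a set of discovered loader paths by file name. It is a simplified model and not RefindPlus's actual matching logic: the case-insensitive, name-only comparison is an assumption, and the example paths are hypothetical.

```python
def visible_loaders(found, dont_scan_files):
    """Drop any discovered loader whose file name is on the exclusion list.

    Assumption: names are compared case-insensitively and without
    regard to the directory the file sits in.
    """
    excluded = {name.strip().lower() for name in dont_scan_files.split(",")}
    kept = []
    for path in found:
        # Take the last path component, accepting either separator style.
        file_name = path.replace("\\", "/").rsplit("/", 1)[-1].lower()
        if file_name not in excluded:
            kept.append(path)
    return kept


# The exclusion list quoted above.
DONT_SCAN = (
    "shim.efi,MokManager.efi,PreLoader.efi,boot.efi,BOOTx64.efi,"
    "OpenCore.efi,bootmgfw.efi,gnugrub.kernel64.efi"
)

# Hypothetical paths that an internal scan might turn up.
found = [
    "/EFI/Microsoft/Boot/bootmgfw.efi",
    "/boot/vmlinuz-example",
    "/EFI/OC/OpenCore.efi",
]
remaining = visible_loaders(found, DONT_SCAN)
print(remaining)
```

With the chainloaded EFI binaries filtered out, only the Linux kernels would be left for the internal scan to list, which matches the stated goal of letting the scan "only handle those two".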

billabongbruno commented 1 year ago

I'd prefer to have control

You could get quite close with the attached config: config.conf.txt

Edits to the stanzas along with dont_scan_volumes and scanfor. You can use the dont_scan_XYZ tokens to exclude other stuff you don't want to appear from the internal scan and only let it handle those two.

EDIT: Should have been dont_scan_files shim.efi,MokManager.efi,PreLoader.efi,boot.efi,BOOTx64.efi,OpenCore.efi,bootmgfw.efi,gnugrub.kernel64.efi

Good suggestion.

I'll give it a go once I get home and report back.

Thank you once again and "see" you soon.

dakanji commented 1 year ago

Actually, there are a few things we could look at to try to better understand the issue or find a workaround. So, if you are up for it, please try X310: X310-BOOTx64.zip

There are no fixes included. Just trying to find the "breaking" point. So, use your original setup with your theme etc and do CASE 1 from before.

Thanks

billabongbruno commented 1 year ago

I'd prefer to have control

You could get quite close with the attached config: config.conf.txt Edits to the stanzas along with dont_scan_volumes and scanfor You can use the dont_scan_XYZ tokens to exclude other stuff you don't want to appear from the internal scan and only let it handle those two. EDIT: Should have been dont_scan_files shim.efi,MokManager.efi,PreLoader.efi,boot.efi,BOOTx64.efi,OpenCore.efi,bootmgfw.efi,gnugrub.kernel64.efi

Good suggestion.

I'll give it a go once I get home and report back.

Thank you once again and "see" you soon.

It works, almost perfectly.

billabongbruno commented 1 year ago

Actually, there are a few things we could look at to try to better understand the issue or find a workaround. So, if you are up for it, please try X310: X310-BOOTx64.zip

There are no fixes included. Just trying to find the "breaking" point. So, use your original setup with your theme etc and do CASE 1 from before.

Thanks

Hey there.

I am absolutely more than willing to try.

So, I went ahead and tried it. And it worked????????????? Even though no fixes were included?

So weird.

I made a mistake and booted with textonly first. Then corrected it and booted with the "normal custom" setup, including theme.

Boot time is a little higher than 0.13, but it did work.

Attaching both log files.

textonly.log normal.log

dakanji commented 1 year ago

Well, I remembered that you mentioned that you did a build a few weeks ago and that this worked. This means it broke in one of the six commits that went in after (I looked at your fork). So we will test builds at each one of those six to identify which was the breaking commit, then drill down to find the breaking file and then the breaking line. This line can then be studied to figure things out.
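
The commit-by-commit search described above amounts to a linear walk from the last known-good build to the first failing one. A minimal Python sketch follows, assuming hypothetical commit labels and a hypothetical `build_works` check that stands in for building a test binary and booting it; the simulated outcome is invented and says nothing about which commit actually broke.

```python
from typing import Callable, Optional, Sequence


def first_breaking_commit(
    commits: Sequence[str],
    build_works: Callable[[str], bool],
) -> Optional[str]:
    """Walk commits from oldest to newest and return the first whose build fails.

    Returns None if every build works.
    """
    for commit in commits:
        if not build_works(commit):
            return commit
    return None


# Hypothetical labels for the six commits after the last known-good build.
commits = ["c1", "c2", "c3", "c4", "c5", "c6"]

# Simulated outcome: pretend the regression first appears in "c2".
breaking = first_breaking_commit(
    commits, lambda c: commits.index(c) < commits.index("c2")
)
print(breaking)
```

The same walk can then be repeated at a finer grain, over the files changed in the breaking commit and then over the changed lines. Note that this relies on the failure reproducing consistently once introduced, which may not hold for a fault that depends on memory layout.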

So, one commit down and five more to test: X311-BOOTx64.zip

billabongbruno commented 1 year ago

Well, I remembered that you mentioned that you did a build a few weeks ago and that this worked. This means it broke in one of the six commits that went in after (I looked at your fork). So we will test builds at each one of those six to identify which was the breaking commit, then drill down to find the breaking file and then the breaking line. This line can then be studied to figure things out.

So, one commit down and five more to test: X311-BOOTx64.zip

I see. Good thinking. Hopefully the 6 commits aren't that extensive.

Will try now and report back.

billabongbruno commented 1 year ago

Well, I remembered that you mentioned that you did a build a few weeks ago and that this worked. This means it broke in one of the six commits that went in after (I looked at your fork). So we will test builds at each one of those six to identify which was the breaking commit, then drill down to find the breaking file and then the breaking line. This line can then be studied to figure things out.

So, one commit down and five more to test: X311-BOOTx64.zip

311 did NOT work. Hopefully it isn't too much to debug and the change introduced was the only one that broke support for manual Linux stanzas.

Attaching log file.

23k28m0506.log

dakanji commented 1 year ago

Please try X311a: X311a-BOOTx64.zip

billabongbruno commented 1 year ago

Please try X311a: X311a-BOOTx64.zip

X311a is WORKING and boot time seems close to RELEASE. That was quick. lol May I ask what the problem was exactly?

23k28m1221.log

dakanji commented 1 year ago

broke support for manual Linux stanzas

BTW, the issue is not to do with Linux or with stanzas, as explained earlier. It is a memory conflict issue that manifests, or not, at random points such as, in your case, when certain lines in stanzas are being processed (regardless of whether they are Linux or something else). When another commit goes in and memory usage patterns change, it can shift to something else altogether.

BTW, I cannot reproduce the issue on my machine (with your config file) because the memory usage profile is different and the conflict does not show up with stanzas at least. It may show up with something else though.

When the other fellow was reporting an issue in rEFInd, it was manifesting with the banners. It is the same root memory conflict issue that has been present for a long time. It just shows up in different ways as the patterns change.

dakanji commented 1 year ago

X311a is WORKING and boot time seems close to RELEASE.

Just means the breaking item was not in the files tested. The process is not complete.