Closed Kensan closed 1 year ago
Thanks for your report and pull request. I agree this is should be done (and adapted to hvt/spt at least). @palainp any remarks on this?
I'm sorry for that issue and I also agree that it should be fixed.
With Ocaml 5+ a PT_TLS section is now mandatory in the linker script with lld and I suppose that setting only .bss
to PT_LOAD will report the same issue on the .tbss
section, but setting .tbss
as PT_LOAD too will remove the PT_TLS section that lld needs. @Kensan I don't use muen, could it be possible to try an unikernel that have a __thread int variable
or run tests/tls.c
with a clang/lld suite?
With Xen the zeroization of .tbss
and .bss
is done by giving _edata
(end address of loaded code+data) and _ebss
(end address of zeroed area): https://github.com/Solo5/solo5/blob/0eb8cb8f57943e7872a94e19695911a1e0b8aef6/bindings/xen/boot.S#L42.
Re-reading the elf loader, it seems to add the memsize of every PT_LOAD
section (and wipe bss section https://github.com/Solo5/solo5/blob/7d5b70f5937abf1e18c2353ffb93d0581ec31a0f/tenders/common/elf.c#L282 and below) but it doesn't seem to be any trouble here (memory is probably already zeroed).
I added phdr[ph_i].p_filesz == 0 ||
because if there is no __thread
variable the .tdata
section is empty (now it's at least 1 page do it's not needed anymore) and the following lines cause a failure.
With Ocaml 5+ a PT_TLS section is now mandatory in the linker script with lld and I suppose that setting only
.bss
to PT_LOAD will report the same issue on the.tbss
section, but setting.tbss
as PT_LOAD too will remove the PT_TLS section that lld needs. @Kensan I don't use muen, could it be possible to try an unikernel that have a__thread int variable
or runtests/tls.c
with a clang/lld suite?
I tried running test_tls
but it failed with a pagefault in set_data
, when trying to write to the _data
thread variable. I think the environment (on Muen) is not setup correctly for this use case. What is the expected behavior for the .tdata
and .tbss
sections, i.e. how should they be set up prior to execution of the unikernel?
.tdata
is .data
but for __thread
variables (_data_not_bss
in tests/test_tls.c
), and the same for .tbss
and .bss
(_data
in the test .c file).
To me .tdata
should be loaded as it's a PT_LOAD section, but .tbss
probably suffers from the same issue as .bss
before this PR (not loaded as it's PT_NULL). You probably can validate the correct loading of .tdata
by copying lines 97-98 just before set_data(1);
Could it be possible to load PT_LOAD and PT_TLS in muen?
I see, thanks for the information.
Could it be possible to load PT_LOAD and PT_TLS in muen?
Yes. I will look into it. The TLS test can probably be made to run without changes to Solo5.
Since there seems to be agreement, that the fix for the .bss loading is also necessary for the other bindings, I will update this PR with the necessary changes.
Thanks!
So it's normal that bss
does not have PT_LOAD
for virto
/xen
? (/cc @palainp, @Kensan). Otherwise, I can do a quick patch to extend this PR with PT_LOAD
on these targets.
So it's normal that
bss
does not havePT_LOAD
forvirto
/xen
? (/cc @palainp, @Kensan). Otherwise, I can do a quick patch to extend this PR withPT_LOAD
on these targets.
I did not change those linker scripts (and also stub
) because from the discussion I got the impression that it is not necessary, due to the way they are loaded (Multiboot). From a consistency perspective it would probably be nicer, if the ELF files look the same. Plus, in my opinion it seems appropriate that the .bss
segment would have type LOAD, but my knowledge of ELF is not all that deep.
Could it be possible to load PT_LOAD and PT_TLS in muen?
Yes. I will look into it. The TLS test can probably be made to run without changes to Solo5.
I could get the TLS test to pass with only minor changes on the Muen side. Thanks @palainp for the discussion.
I'm also in favor of having coherent scripts when it isn't mandatory to put differences. So apply a similar change for the remaining scripts should be better to my opinion.
Thank you @Kensan for your time. Do you think we also need to update the elf reader in hvt and spt tenders in the same way you changes muen (whatever you've done)? I'm still a bit puzzled now loading bss but not tbss :)
Thank you @Kensan for your time. Do you think we also need to update the elf reader in hvt and spt tenders in the same way you changes muen (whatever you've done)? I'm still a bit puzzled now loading bss but not tbss :)
My understanding is, that tbss (along with tdata and the tcb) is allocated on the heap for each thread, thus there is no need to have the section processed by the ELF loader. For Muen I had to do minor adjustments to correctly process sections that have a filesize of 0 but a virtual size > 0.
Regarding the ELF loader of HVT/SPT tenders: what was the reason that the zero size check was added? Is it still required? https://github.com/Solo5/solo5/blob/b9b2888911c9f5bb63932712c96b875a0a630930/tenders/common/elf.c#L249
Yes you're right, blanking .tbss
should be done done on thread creation. I was wondering this because (at common/elf.c:252
and after) the elf loader is doing some things with addresses, and with a non empty .tbss
section didn't follow this code path we might have bitroted values.
Concerning filesz
check, this was needed (but not anymore, I added an alignment into the linker scripts to force .tdata
to be non empty https://github.com/Solo5/solo5/blob/b9b2888911c9f5bb63932712c96b875a0a630930/bindings/solo5_hvt.lds#L129 <- this also might be something we don't want?) because .tdata
could be empty and still PT_LOADed, failling one assertion later, but I can't remember where :(
With the changes to the linker scripts in #546, the ELF segment containing the bss section is no longer of type PT_LOAD. This means it is not being processed by the Muen script and thus no longer initialized to zero. Explicitly set it to PT_LOAD in the linker script again to restore the original behavior. This was discovered when running Muen test unikernels on hardware, where memory is actually uninitialized and not zeroized as is done on most hypervisors.
Note that the same behavior goes for the common tender ELF loader, see https://github.com/Solo5/solo5/blob/7d5b70f5937abf1e18c2353ffb93d0581ec31a0f/tenders/common/elf.c#L249-L250, thus I believe the same issue is present and the same fix is necessary for the other bindings.