Solo5 / solo5

A sandboxed execution environment for unikernels
ISC License
883 stars 136 forks source link

Fix .bss initialization #551

Closed Kensan closed 1 year ago

Kensan commented 1 year ago

With the changes to the linker scripts in #546, the ELF segment containing the bss section is no longer of type PT_LOAD. This means it is not being processed by the Muen script and thus no longer initialized to zero. Explicitly set it to PT_LOAD in the linker script again to restore the original behavior. This was discovered when running Muen test unikernels on hardware, where memory is actually uninitialized and not zeroized as is done on most hypervisors.

Note that the same behavior goes for the common tender ELF loader, see https://github.com/Solo5/solo5/blob/7d5b70f5937abf1e18c2353ffb93d0581ec31a0f/tenders/common/elf.c#L249-L250, thus I believe the same issue is present and the same fix is necessary for the other bindings.

hannesm commented 1 year ago

Thanks for your report and pull request. I agree this is should be done (and adapted to hvt/spt at least). @palainp any remarks on this?

palainp commented 1 year ago

I'm sorry for that issue and I also agree that it should be fixed.

With Ocaml 5+ a PT_TLS section is now mandatory in the linker script with lld and I suppose that setting only .bss to PT_LOAD will report the same issue on the .tbss section, but setting .tbss as PT_LOAD too will remove the PT_TLS section that lld needs. @Kensan I don't use muen, could it be possible to try an unikernel that have a __thread int variable or run tests/tls.c with a clang/lld suite?

With Xen the zeroization of .tbss and .bss is done by giving _edata (end address of loaded code+data) and _ebss (end address of zeroed area): https://github.com/Solo5/solo5/blob/0eb8cb8f57943e7872a94e19695911a1e0b8aef6/bindings/xen/boot.S#L42.

Re-reading the elf loader, it seems to add the memsize of every PT_LOAD section (and wipe bss section https://github.com/Solo5/solo5/blob/7d5b70f5937abf1e18c2353ffb93d0581ec31a0f/tenders/common/elf.c#L282 and below) but it doesn't seem to be any trouble here (memory is probably already zeroed). I added phdr[ph_i].p_filesz == 0 || because if there is no __thread variable the .tdata section is empty (now it's at least 1 page do it's not needed anymore) and the following lines cause a failure.

Kensan commented 1 year ago

With Ocaml 5+ a PT_TLS section is now mandatory in the linker script with lld and I suppose that setting only .bss to PT_LOAD will report the same issue on the .tbss section, but setting .tbss as PT_LOAD too will remove the PT_TLS section that lld needs. @Kensan I don't use muen, could it be possible to try an unikernel that have a __thread int variable or run tests/tls.c with a clang/lld suite?

I tried running test_tls but it failed with a pagefault in set_data, when trying to write to the _data thread variable. I think the environment (on Muen) is not setup correctly for this use case. What is the expected behavior for the .tdata and .tbss sections, i.e. how should they be set up prior to execution of the unikernel?

palainp commented 1 year ago

.tdata is .data but for __thread variables (_data_not_bss in tests/test_tls.c), and the same for .tbss and .bss (_data in the test .c file). To me .tdata should be loaded as it's a PT_LOAD section, but .tbss probably suffers from the same issue as .bss before this PR (not loaded as it's PT_NULL). You probably can validate the correct loading of .tdata by copying lines 97-98 just before set_data(1);

Could it be possible to load PT_LOAD and PT_TLS in muen?

Kensan commented 1 year ago

I see, thanks for the information.

Could it be possible to load PT_LOAD and PT_TLS in muen?

Yes. I will look into it. The TLS test can probably be made to run without changes to Solo5.

Kensan commented 1 year ago

Since there seems to be agreement, that the fix for the .bss loading is also necessary for the other bindings, I will update this PR with the necessary changes.

dinosaure commented 1 year ago

Thanks!

dinosaure commented 1 year ago

So it's normal that bss does not have PT_LOAD for virto/xen? (/cc @palainp, @Kensan). Otherwise, I can do a quick patch to extend this PR with PT_LOAD on these targets.

Kensan commented 1 year ago

So it's normal that bss does not have PT_LOAD for virto/xen? (/cc @palainp, @Kensan). Otherwise, I can do a quick patch to extend this PR with PT_LOAD on these targets.

I did not change those linker scripts (and also stub) because from the discussion I got the impression that it is not necessary, due to the way they are loaded (Multiboot). From a consistency perspective it would probably be nicer, if the ELF files look the same. Plus, in my opinion it seems appropriate that the .bss segment would have type LOAD, but my knowledge of ELF is not all that deep.

Kensan commented 1 year ago

Could it be possible to load PT_LOAD and PT_TLS in muen?

Yes. I will look into it. The TLS test can probably be made to run without changes to Solo5.

I could get the TLS test to pass with only minor changes on the Muen side. Thanks @palainp for the discussion.

palainp commented 1 year ago

I'm also in favor of having coherent scripts when it isn't mandatory to put differences. So apply a similar change for the remaining scripts should be better to my opinion.

Thank you @Kensan for your time. Do you think we also need to update the elf reader in hvt and spt tenders in the same way you changes muen (whatever you've done)? I'm still a bit puzzled now loading bss but not tbss :)

Kensan commented 1 year ago

Thank you @Kensan for your time. Do you think we also need to update the elf reader in hvt and spt tenders in the same way you changes muen (whatever you've done)? I'm still a bit puzzled now loading bss but not tbss :)

My understanding is, that tbss (along with tdata and the tcb) is allocated on the heap for each thread, thus there is no need to have the section processed by the ELF loader. For Muen I had to do minor adjustments to correctly process sections that have a filesize of 0 but a virtual size > 0.

Regarding the ELF loader of HVT/SPT tenders: what was the reason that the zero size check was added? Is it still required? https://github.com/Solo5/solo5/blob/b9b2888911c9f5bb63932712c96b875a0a630930/tenders/common/elf.c#L249

palainp commented 1 year ago

Yes you're right, blanking .tbss should be done done on thread creation. I was wondering this because (at common/elf.c:252 and after) the elf loader is doing some things with addresses, and with a non empty .tbss section didn't follow this code path we might have bitroted values.

Concerning filesz check, this was needed (but not anymore, I added an alignment into the linker scripts to force .tdata to be non empty https://github.com/Solo5/solo5/blob/b9b2888911c9f5bb63932712c96b875a0a630930/bindings/solo5_hvt.lds#L129 <- this also might be something we don't want?) because .tdata could be empty and still PT_LOADed, failling one assertion later, but I can't remember where :(