ssrg-vt / popcorn-compiler

Popcorn Linux compiler toolchain for heterogeneous-ISA execution
41 stars 22 forks source link

Alignement of TLS symbols #9

Closed mohamed-karaoui closed 6 years ago

mohamed-karaoui commented 6 years ago

While trying to support the GNU-OMP library (openmp branch of the compiler, under the name libopenpop), I found a problem at the compilation time. This type of warning keep popping out: <>

Suspecting that the problem comes from the use of the thread local storage section ".tbss" (which are used by the library), I tried to compile this simple program that does not use libopenpop:

__thread volatile int x; int main() { return x+123; }

Using "readelf -S" we find that the problem comes the .tbss section and the next section (.data.rel.ro) having the same base address: ... [ 5] .eh_frame PROGBITS 000000000042b388 02b388 000040 00 A 0 0 8 [ 6] .tbss NOBITS 000000000043ffc0 02ffc0 000004 00 WAT 0 0 4 [ 7] .data.rel.ro PROGBITS 000000000043ffc0 02ffc0 000030 00 WA 0 0 8 ...

Any idea on how to solve the problem ? How to give each section its own base address? (I could use an ldscript for the compilation of the vanilla version and try to give both sections different addresses?)

olivierpierre commented 6 years ago

A few questions:

  1. Do you have the same issue with the program not using openmp?
  2. I would not be surprised if two sections have the same base address if one of them has a size of 0. However in your example .tbss has a size of 0x4 so I suspect something is wrong in the alignment tool, you should not have to manually align anything. I guess this is the output of readelf called on the final, aligned binary right?
  3. Can you paste here the actual error you get from pyalign? As well as the sources of the program you use to get this error (i.e. a way for me to reproduce the issue)
mohamed-karaoui commented 6 years ago
  1. Yes
  2. I have only WARNING at the alignement stage: WARNING: symbol does not have the same name as the containing section WARNING: Symbol: .data.rel.ro @0x43ffc0, size: 0x10 WARNING: Section: .tbss @0x43ffc0, size: 0x4

It actually fails later at the linking stage: ld.gold

  1. A simple program to reproduce the warning is: __thread volatile int x; int main() { return x+123; }
mohamed-karaoui commented 6 years ago

Also, the error produced by the ld script is ( when compiling a program with the openmp library):

/usr/local/popcorn/bin/ld.gold: internal error in relocate_tls, at ../../gold/aarch64.cc:7433 ../common.mk:172: recipe for target 'BT_aarch64' failed

olivierpierre commented 6 years ago

While trying to reproduce the issue, I see the warnings, however the linking process succeeds and the binaries seem to execute fine (I only tested the x86 one on my machine). Can you try with the latest commit on the master branch?

mohamed-karaoui commented 6 years ago

Pierre, the test program produces only the warning without the "LD failing". You need to use the Openmp library ("libopenpop") to see the LD failing.

mohamed-karaoui commented 6 years ago

This one line patch solves the problem. Still require some testing. So please test it, if possible, and let me know.

diff --git a/tool/alignment/pyalign/AbstractArchitecture.py b/tool/alignment/pyalign/AbstractArchitecture.py
index f3f3fbb..3baca2c 100644
--- a/tool/alignment/pyalign/AbstractArchitecture.py
+++ b/tool/alignment/pyalign/AbstractArchitecture.py
@@ -105,7 +105,7 @@ class AbstractArchitecture():
                        sectionAddr = section.getAddress()
                        sectionSize = section.getSize()
                        sectionName = section.getName()
-                       if (addr >= sectionAddr) and (addr < (sectionAddr + sectionSize)):
+                       if symbol.getName().startswith(sectionName):
                                res = sectionName
                                if not symbol.getName().startswith(sectionName):
                                # Sanity check if the names fit. is it possible to have a symbol
olivierpierre commented 6 years ago

I can see how this silences the warning but this might skip some symbols to align. In addition to solving the warming, does this also solves the linking error you get with the omp library?

mohamed-karaoui commented 6 years ago

Yes, the patch also solve the ld error.

olivierpierre commented 6 years ago

Ok cool, in that case a good way for you to test it is to run make check on a maximum of applications, it will list the unaligned symbols

mohamed-karaoui commented 6 years ago

Thanks for the make check tip.

Should we close the issue?

olivierpierre commented 6 years ago

Let's make sure that symbols are well aligned first, what is the output of make check on multi-threaded (as well as single threaded) applications?

mohamed-karaoui commented 6 years ago

I have tested it on blackscholes, NPB-SER and most NPB-OMP (I have another bug in some benchmarks: relocation overflow)

olivierpierre commented 6 years ago

nice! no unaligned symbol reported by make check for any benchmark?

mohamed-karaoui commented 6 years ago

Nop. everything went fine. But before merging I would like for someone else to test, if possible.

olivierpierre commented 6 years ago

Great! I do not have complex programs to test right now (I have a few test cases but the are still in the ppc branch so I'll have to merge that later), so I think you are the person with the most advanced test cases!

Go on and commit your modification, then you can close the issue!

mohamed-karaoui commented 6 years ago

Any simple program that you have will be good. The goal is to do some regression testing: hopping that I did not break any previously working program. Maybe Anthony can help with this, since he has his hands in this currently, If he can hear me? (I will ping him by email if not)

olivierpierre commented 6 years ago

All right, I think having a few test programs is easily doable and it will help for the alignment tool. Let me take care of that when I have more time during the break (I already have a test infrastructure somewhere on the ppc branch). I'm closing the issue and creating a new one for the alignment tool test cases.