leaningtech / cheerp-meta

Cheerp - a C/C++ compiler for Web applications - compiles to WebAssembly and JavaScript
https://labs.leaningtech.com/cheerp

Overwrite unexpected struct field #118

Closed zyz9740 closed 8 months ago

zyz9740 commented 2 years ago

issue.zip

When I use a pointer to overwrite the value of one field in a struct, another field is unexpectedly modified.

carlopi commented 2 years ago

Hi! I believe this issue has been solved after you raised a similar problem: https://github.com/leaningtech/cheerp-meta/issues/114.

Could you possibly try with the nightly packages of Cheerp?

zyz9740 commented 2 years ago

I saw issue https://github.com/leaningtech/cheerp-meta/issues/116 and I am running into nearly the same problem, which prevents me from getting the latest Cheerp.

How can I get the nightly packages of Cheerp? Or could you please provide the latest build instructions? Thanks a lot!

carlopi commented 2 years ago

I will add instructions to the documentation, but if you are on Ubuntu it should be a matter of doing:

sudo add-apt-repository ppa:leaningtech-dev/cheerp-nightly-ppa
sudo apt update

followed by

sudo apt-get install cheerp-core

More info on this page: https://launchpad.net/~leaningtech-dev/+archive/ubuntu/cheerp-nightly-ppa

Would this allow you to move forward?

Otherwise I can put together the build from source instructions.

zyz9740 commented 2 years ago

When I added the nightly PPA and installed the latest cheerp-core, an error occurred. The log is below:

dpkg: error processing archive /tmp/apt-dpkg-install-jUgf4D/3-cheerp-musl_1657105974-1~focal_amd64.deb (--unpack):
 trying to overwrite '/opt/cheerp/include/alloca.h', which is also in package cheerp-newlib 2.7-1~focal
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Selecting previously unselected package cheerp-libcxx-libcxxabi.
Preparing to unpack .../4-cheerp-libcxx-libcxxabi_1657105974-1~focal_amd64.deb ...
Unpacking cheerp-libcxx-libcxxabi (1657105974-1~focal) ...
dpkg: error processing archive /tmp/apt-dpkg-install-jUgf4D/4-cheerp-libcxx-libcxxabi_1657105974-1~focal_amd64.deb (--unpack):
 trying to overwrite '/opt/cheerp/include/c++/v1/__algorithm/adjacent_find.h', which is also in package cheerp-libcxx 2.7-1~focal
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Preparing to unpack .../5-cheerp-libs_1657105974-1~focal_amd64.deb ...
Unpacking cheerp-libs (1657105974-1~focal) over (2.7-1~focal) ...
Errors were encountered while processing:
 /tmp/apt-dpkg-install-jUgf4D/3-cheerp-musl_1657105974-1~focal_amd64.deb
 /tmp/apt-dpkg-install-jUgf4D/4-cheerp-libcxx-libcxxabi_1657105974-1~focal_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

It is worth mentioning that I already have Cheerp 2.7 installed from ppa:leaningtech-dev/cheerp-ppa. Does this mean the problem occurs when the nightly packages try to overwrite the old version of Cheerp?

However, I think this is because I have messed up my environment. I will try installing in a cleaner environment. Thanks again for your fast reply.

zyz9740 commented 2 years ago

I used a clean environment (a newly installed OS) and the installation succeeded. But with the latest Cheerp, I can still reproduce this bug. Can you take another look?

carlopi commented 2 years ago

Currently I'd say cheerp-stable and cheerp-nightly are not compatible.

It's probably safest to remove cheerp-stable and then install cheerp-nightly; ideally the packages themselves should check for this. I have to discuss this with my colleagues.

carlopi commented 2 years ago

And I just tested and I obtain:

first, g_552: 65531
second: g_552: 65531
g_552: 12129
...checksum after hashing g_552 : 40963639

What results are you seeing?

Compiling with latest master (https://github.com/leaningtech/cheerp-compiler/tree/1a58e554cb6eeb8c28ab8f0e4f4ca1b96df89893).

zyz9740 commented 2 years ago

I don't understand why this issue is similar to https://github.com/leaningtech/cheerp-meta/issues/114. I tested issue 114 again using cheerp-nightly and got the right output:

first, g_552: 65531
second: g_552: 65531
g_552: 12129
...checksum after hashing g_552 : 40963639

I'm sure you fixed that bug, but I am still confused about this one. In case I uploaded the wrong code, the source code for this case is below:

#include <stdio.h>
struct a {
  short b;
  unsigned int c;
  int d
};
struct {
  struct a b
} e = {9, 8};
int *f = &e.b.d;
unsigned long g() { *f = 234; }
int main() {
  g();
  printf("%d\n", e.b.c);
}

reproduce.sh

CHEERP=/opt/cheerp/bin/clang
CLANG=gcc
V8=v8

$CLANG random.c -O3 -o random.o
$CHEERP -target cheerp-wasm random.c -O3 -o random_cheerp.js 
$V8 random_cheerp.js > cheerp.out
./random.o > clang.out
diff cheerp.out clang.out

hungryzzz commented 2 years ago

Hi, I just investigated further why this case produces the bug. The original test case is below:

#include <stdio.h>
struct a {
  short b;
  unsigned int c;
  int d
};
struct {
  struct a b
} e = {9, 8};
int *f = &e.b.d;
unsigned long g() { *f = 234; }
int main() {
  g();
  printf("%d\n", e.b.c);
}

First, I ran Cheerp's components step by step and got the following LLVM IR before llc (only a small relevant part is shown):


%0 = type asmjs directbase %struct._Z1a { i16, i32, i32 }
%struct._Z1a = type asmjs { i16, i32, i32 }
%"struct._Z3$_0" = type asmjs directbase %struct._Z1a { i16, [2 x i8], i32, i32 }

; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn writeonly
define internal fastcc void @g() unnamed_addr #0 section "asmjs" {
entry:
  store i32 234, i32* getelementptr (%"struct._Z3$_0", %"struct._Z3$_0"* bitcast (%0* @e to %"struct._Z3$_0"*), i32 0, i32 2), align 4
  ret void
}

; Function Attrs: noinline nounwind
define dso_local i32 @main() local_unnamed_addr #1 section "asmjs" {
entry:
  tail call fastcc void @g()
  %0 = load i32, i32* getelementptr (%struct._Z1a, %struct._Z1a* bitcast (%0* @e to %struct._Z1a*), i32 0, i32 1), align 4
  tail call void (i8*, ...) @printf(i8* noundef nonnull undef, i32 %0)
  ret i32 0
}

I found that even though the functions g and main access the same struct instance, the LLVM IR addresses it through two different types, %"struct._Z3$_0" and %struct._Z1a.

Then I traced the following command with GDB: llc -march=cheerp -o cheerp_test.js -cheerp-linear-output=wasm -cheerp-secondary-output-file=cheerp_test.wasm -filetype obj test3.bc, and got the following LLVM IR before the CheerpWritePass:

; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn writeonly
define internal fastcc void @g() unnamed_addr #0 section "asmjs" {
entry:
  %0 = inttoptr i32 1048588 to i32*
  store i32 234, i32* %0, align 4
  ret void
}

; Function Attrs: noinline nounwind
define dso_local i32 @main() local_unnamed_addr #1 section "asmjs" {
entry:
  tail call fastcc void @g()
  %0 = inttoptr i32 1048588 to i32*  ;
  %1 = load i32, i32* %0, align 4  ; %1 = 234
  ; some unrelated IRs are deleted
  tail call void (i8*, ...) @printf(i8* noundef nonnull undef, i32 %1)
  call void @llvm.stackrestore(i8* %2)
  ret i32 0
}

According to the LLVM IR above, g and main both access the same memory address (1048588), but according to the source code they should access different addresses (e.b.d vs. e.b.c).

I then found that in createConstantExprLoweringPass -> visitConstantExpr -> partialOffset, g and main compute the same partial offset into the struct, 4, so they end up accessing the same memory.

I don't know whether the root cause lies in this pass, and I'm confused about why there is a problem when the LLVM DataLayout data structure is used. Hoping for your answer! Thanks!