Closed mertcandav closed 4 months ago
Here is more information about that.
We tried again to create a workflow for Windows. At the time, Jule's current commit was `512978d224` and IR's current commit was `9ab7a750fc`. The attempt failed again with a non-zero exit code; this time the exit code was `1`.
Using Clang is not possible due to MSVC. Using the `windows-2019` runner image did not fix the problem and caused compilation errors. `windows-latest`, which is currently `windows-2022`, can compile the JuleC IR. But even a simple `julec version` command exits with code `1`, so the `JuleC Version` step fails.
We used `-Wa,-mbig-obj` to avoid the `File too big` problem with GCC.
Our workflow file, `build_windows.yml`:
    name: Build (Windows)
    on: [push, pull_request]
    jobs:
      build:
        runs-on: windows-latest
        steps:
          - uses: actions/checkout@v3
          - name: Get latest IR
            run: |
              curl -o .\ir.cpp https://raw.githubusercontent.com/julelang/julec-ir/main/src/windows-amd64.cpp
          - name: Compile Latest JuleC IR
            shell: cmd
            run: |
              mkdir .\bin
              g++ -Wa,-mbig-obj -O0 --std=c++17 -w -o .\bin\julec.exe .\ir.cpp
              git update-index --add --chmod=-x .\bin\julec.exe
          - name: JuleC Version
            run: |
              .\bin\julec.exe version
          - name: Build JuleC
            shell: cmd
            run: |
              .\bin\julec.exe -t .\src\julec
              g++ -Wa,-mbig-obj -O0 --std=c++17 -w -o .\bin\julec.exe .\dist\ir.cpp
              git update-index --add --chmod=-x .\bin\julec.exe
We tried this with the same JuleC code and IR on our `amd64` machine running Windows 10, and the compiler built from the IR worked as expected. We still don't have clear data on why the GitHub Actions build is problematic.
Additionally, on our own machine GCC does not hit the `File too big` problem, so we were able to compile without `-Wa,-mbig-obj`.
Maybe the version command doesn't work because JuleC has not been built yet? 🤔
We're pretty sure it was built. If there is any compile problem, the workflow does not proceed to the next step. But still, just to be sure, I executed a `dir` command and verified that `julec.exe` is where it should be.
We have done more research and have some findings on the subject. We're almost certain that the GCC used for our build on a native Windows machine is actually an alias executable for Clang, hence the misleading results. To fix this we installed a MinGW GCC and tried again. The `File too big` problem then also occurred on our machine; this seems normal and is fixed with the `-Wa,-mbig-obj` flag.
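Since the `g++` on a machine can turn out to be a wrapper around Clang, it may help to have a test binary report which compiler actually built it. The following is a minimal sketch that relies only on predefined compiler macros; the function name `compiler_identity` is hypothetical and not part of the Jule API.

```cpp
#include <string>

// Hypothetical helper: reports the compiler that built this translation unit.
// __clang__ is checked first because Clang also defines __GNUC__.
inline std::string compiler_identity() {
#if defined(__clang__)
    return std::string("clang ") + __clang_version__;
#elif defined(__GNUC__)
    return "gcc " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__) + "." +
           std::to_string(__GNUC_PATCHLEVEL__);
#elif defined(_MSC_VER)
    return "msvc " + std::to_string(_MSC_VER);
#else
    return "unknown";
#endif
}
```

Printing the result from a tiny program compiled with the same `g++` command line as the IR shows whether GCC or Clang is really behind that command.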
The interesting part is that the compiler built from the IR code still works correctly. Unlike on the GitHub Actions machines, it does not terminate immediately with exit code `1` when executed, and the `julec version` command runs successfully. But when we tried to transpile the compiler's own source code, we got an exit code of `11`, i.e. a segmentation fault occurred. We don't have enough evidence to fully understand the issues with GCC. With MinGW LLVM Clang, we were able to achieve a seamless Jule development experience on Windows; the compiler didn't have any problems. We don't know exactly why GCC is having trouble; it may even be a compiler bug.
So it looks like we have to downgrade GCC support to a partial support state. There is detailed information about this on the relevant manual page.
GitHub Actions is still a puzzle. We compiled the IR code with MinGW LLVM Clang, which gave us a smooth experience on our local machine, and we expected the same on GitHub Actions. Instead, we may have found strong evidence that the problem is not related to the compilers: even the executable built with Clang, which works smoothly for us locally, has the same problem as with GCC. The program terminated immediately with exit code `1` when executed.
This issue we're having on GitHub Actions machines may not have anything to do with us.
We can also use `gdb` or OnlineGDB to get more debugging options and information about what caused the segmentation fault. Let me know if you need any further help. Thanks. 🙂
GDB did not provide any meaningful help. I suspect this has more to do with the compiler than with the Jule API, since there are no problems when compiled with Clang. When I test and debug it on my local machine, I see that the crash comes from a `delete` executed while copying the `jule::Any` type. Releasing memory there shouldn't be a problem, but the memory address being freed looks weird. It is exactly `0xabababababababab`; if this address has a special meaning, please let me know.
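On the question of whether `0xabababababababab` has a special meaning: a pointer whose bytes are all `0xAB` is usually not a real address. It is commonly documented that the Windows heap marks guard bytes after an allocation with `0xAB` when its debug features are active (for example when the process is started under a debugger), so a `delete` receiving this value may mean the pointer was read from just past a heap block or from an object that was never initialized. Whether that is what happens in `jule::Any` is an assumption, not something verified here. The sketch below only illustrates how such values can be spotted while debugging; the helper names are hypothetical, and the table covers a few well-known fill bytes, not an exhaustive set.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when every byte of a 64-bit value equals `fill`.
inline bool is_fill_pattern(std::uint64_t value, unsigned char fill) {
    for (std::size_t i = 0; i < sizeof(value); ++i) {
        if (((value >> (8 * i)) & 0xFFu) != fill)
            return false;
    }
    return true;
}

// Hypothetical helper: maps a few well-known Windows debug fill bytes to a
// short description. Returns nullptr when the value matches none of them.
inline const char *describe_fill(std::uint64_t value) {
    if (is_fill_pattern(value, 0xAB))
        return "heap guard bytes written after an allocation (Windows heap)";
    if (is_fill_pattern(value, 0xCD))
        return "uninitialized heap memory (MSVC debug runtime)";
    if (is_fill_pattern(value, 0xDD))
        return "freed heap memory (MSVC debug runtime)";
    if (is_fill_pattern(value, 0xCC))
        return "uninitialized stack memory (MSVC runtime checks)";
    return nullptr;
}
```

For example, `describe_fill(0xababababababababULL)` returns the guard-byte description, while an ordinary-looking address returns `nullptr`.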
When I checked the stack trace I couldn't see anything that could cause this. Additionally, I must say that my Windows machine is weak, so compile times are really long and debugging is difficult. Therefore, the process cannot progress quickly on my machine.
If you do any analysis, debugging, research or anything similar on this issue, please share your findings with us.
Thanks.
Here is the latest news.
This isn't just a Windows problem. Therefore, I will update the title of this issue accordingly. This problem seems to occur on Linux and macOS as well. As far as tested on macOS with the latest updates, GCC compilation was successful and the produced executable worked as expected. This is how I observed that GCC compatibility has increased.
It seems help is needed to understand whether compatibility is fully achieved. I tried to compile on GitHub Actions, but the `julec version` command exits with code `1`. It looks like this needs testing locally on a Windows machine. We don't have enough information yet, but the progress towards GCC compatibility looks significant.
I tested GCC support on a Fedora Linux 38 Workstation Edition VM. Everything seems OK: Clang and GCC work as expected, with no problems.
Clang version: clang version 16.0.6 (Fedora 16.0.6-3.fc38)
GCC version: g++ (GCC) 13.0.1 20230401 (Red Hat 13.0.1-0)
I used the latest IR (version 54a6661525) and the master source tree (hash 54a6661525) of Jule.
That's great news! 🎉 Should we try again with Windows?
I created a Windows build CI on my own Jule fork to see if the issues were resolved. Very strange, but the problem seems to occur when `std::stringstream` is used. When I delete the relevant statement, the program compiles and runs successfully; otherwise it keeps exiting with code `1`. I don't know if `std::stringstream` is directly part of the problem, but something related to it is obviously triggering the failure. This needs a closer look.
That's very strange. Are there any alternatives to `std::stringstream` that should be used to prevent this problem? We should still take a look, though; there's most likely another problem. I'll take a look and see if there's anything obvious.
I'm not sure about that; the issue is probably not `std::stringstream` itself. But Clang's executables work fine while GCC builds do not, which increases the complexity of the problem. I'm just wondering whether this issue is a GCC bug.
Please share with us anything you find about this issue after your investigation.
Latest situation:
The `windows-ci` branch has Windows [GCC] CI on GitHub Actions, and it works well except for one known issue. I don't know why it works now; the original problem is still not understood. The developer CI has compilation steps for Windows with GCC; LLVM Clang is not preferred because it has additional compilation errors.
I tested on another local machine using GCC on Windows. Unlike on GitHub Actions, nothing changed on the local machine. The same problem still exists, with no progress. I have no idea what makes GitHub Actions different, but it seems to work there.
As I said, GitHub Actions has one known problem: the console write call does not work. Jule uses the `WriteConsoleW` function of the Windows API, which is provided by the `windows.h` header. As far as I tested, this function works well on all tested systems. As far as I know, GitHub Actions uses the UTF-8 code page by default (Windows uses UTF-16 by default), and I confirmed this by printing Unicode characters with a simple `printf` call. I changed the code page to UTF-8 on my local machine and tested again, but the `WriteConsoleW` function still worked well. So I don't have any clear idea about the GitHub Actions problem.
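One possible explanation, not confirmed in this thread: `WriteConsoleW` only works on real console handles, and Microsoft's documentation notes that it fails when the standard handle is redirected to a file or pipe. CI runners typically capture step output through a pipe, which would explain why the same call succeeds in a local terminal regardless of code page. Below is a minimal sketch of the usual workaround, assuming UTF-8 input; the function name `write_stdout_utf8` is hypothetical and this is not Jule's actual implementation.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes UTF-8 text to standard output.
// On Windows it uses WriteConsoleW only when stdout is a real console and
// falls back to WriteFile when stdout is redirected (pipe or file).
inline bool write_stdout_utf8(const std::string &utf8) {
    if (utf8.empty())
        return true;
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == nullptr || out == INVALID_HANDLE_VALUE)
        return false;
    DWORD mode = 0;
    if (GetConsoleMode(out, &mode)) {
        // Real console: convert UTF-8 to UTF-16 and write wide characters.
        int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                      static_cast<int>(utf8.size()), nullptr, 0);
        if (len <= 0)
            return false;
        std::wstring wide(static_cast<std::size_t>(len), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                            static_cast<int>(utf8.size()), &wide[0], len);
        DWORD written = 0;
        return WriteConsoleW(out, wide.data(), static_cast<DWORD>(len),
                             &written, nullptr) != 0;
    }
    // Redirected output: GetConsoleMode failed, so write the raw bytes.
    DWORD written = 0;
    return WriteFile(out, utf8.data(), static_cast<DWORD>(utf8.size()),
                     &written, nullptr) != 0 &&
           written == utf8.size();
#else
    // Non-Windows platforms: plain byte output is enough.
    return std::fwrite(utf8.data(), 1, utf8.size(), stdout) == utf8.size();
#endif
}
```

The design choice here is to treat a failing `GetConsoleMode` as the signal that output is redirected, instead of calling `WriteConsoleW` and reacting to its failure.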
I investigated the original problem a bit more, with no progress. I wrote a simple program like this:
    #include <stdio.h>
    #include "api/utf8.hpp"

    int main() {
        printf("hello world\n");
        return 0;
    }
The example program above prints nothing when the `api/utf8.hpp` header is included. I modified the relevant function declarations and definitions in this file and finally narrowed it down to the body of the `utf8_push_rune_bytes` function. The problem occurs when I call the `dest.push_back` method, even when called like `dest.push_back(0)`. And these functions are not called anywhere, really. This seems absurd to me. I really suspect there is a bug in GCC causing this problem. If the problem is related to Jule, I really wonder what it is.
Update for latest situation:
Alright, I discovered the problem. My GCC installation was probably corrupted; the same tasks work after a clean installation. It now compiles any program successfully.
After fixing the compiler problem, I tested again and here is the good news: GCC support looks good! Everything works as expected, even when using a bootstrapped compiler that was itself compiled with GCC. I observed no behavioral differences between GCC and Clang.
But we have a new problem. On Windows, GitHub Actions fails when calling the `WriteConsoleW` function of the Windows API. The function returns `false`, which means it failed. But this problem is not related to this issue. Therefore, I will open a new issue, share it here, and then close this issue.
The relevant issue is: #107
Description
We tried to create a CI for Windows. It was a simple build CI. It would build JuleC from the IR code and rebuild and compile the latest JuleC from source. But we had some problems.
This CI gets the latest `windows-amd64` IR with curl from the JuleC-IR repository and compiles it via GCC with `O3`, `-w`, `-Wa,-mbig-obj` and C++17. After compiling the IR code, a simple `julec version` command is executed for testing. However, the execution of the program results in failure.
We left only the header files and an empty entry point, as we thought this could be caused by some algorithms in the IR code. Then we removed all header files. It was the inclusion of the API that was causing the problem. However, we had a program that did nothing. There was an empty entry point. From what we analyzed, the API did not have code that would lead to the execution of an algorithm that could cause problems when just included. So the program wasn't really doing anything as far as we know.
This issue could be a minor overlooked bug, a simple programming error in the API, or something that has nothing to do with us directly, such as a GCC compilation issue. We tried to do various things to understand this problem, but we couldn't find a rational point to start fixing it. Needs more research.
Expected behavior
Program should execute as expected.
Current behavior
Each execution's result is `Process completed with exit code -1073741511` or something like that.
Additional information
The current situation when we do this:
79c36b2e9e
9ab7a750fc
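A note on the exit code reported under Current behavior: Windows surfaces NTSTATUS values as signed 32-bit exit codes, so `-1073741511` is `0xC0000139` when reinterpreted as unsigned, which is `STATUS_ENTRYPOINT_NOT_FOUND`. That status generally means the loader could not resolve a function imported from a DLL before `main` ran, which would be consistent with a program that has an empty entry point still failing. Whether a mismatched runtime DLL is the actual cause here is an assumption, not something verified in this issue. Below is a small sketch for decoding such codes; the helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Reinterprets a signed process exit code as an unsigned NTSTATUS value.
inline std::uint32_t as_ntstatus(std::int32_t exit_code) {
    return static_cast<std::uint32_t>(exit_code);
}

// Formats the exit code as an 8-digit hexadecimal NTSTATUS string.
inline std::string ntstatus_hex(std::int32_t exit_code) {
    char buf[11]; // "0x" + 8 hex digits + terminating NUL
    std::snprintf(buf, sizeof buf, "0x%08X",
                  static_cast<unsigned>(as_ntstatus(exit_code)));
    return buf;
}

// Names a few NTSTATUS values that commonly show up as exit codes.
inline const char *ntstatus_name(std::int32_t exit_code) {
    switch (as_ntstatus(exit_code)) {
    case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
    case 0xC0000135u: return "STATUS_DLL_NOT_FOUND";
    case 0xC0000139u: return "STATUS_ENTRYPOINT_NOT_FOUND";
    default:          return "unknown";
    }
}
```

For the value in this report, `ntstatus_hex(-1073741511)` yields `0xC0000139` and `ntstatus_name(-1073741511)` yields `STATUS_ENTRYPOINT_NOT_FOUND`.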