open-watcom / open-watcom-v2

Open Watcom V2.0 - Source code repository, Wiki, Latest Binary build, Archived builds including all installers for download.

Regression Test: Full Build Without Doc on Windows 7 32-bit #26

Closed ideafarm closed 10 years ago

ideafarm commented 10 years ago

I am opening this issue as a single conversation thread to report all problems that I experience as a newbie contributor while becoming familiar with building OW on my 32 bit Windows 7 ASUS netbook. I will post here on the assumption that any problems that I encounter are due to my lack of familiarity, and that I will solve these problems on my own in due course. If it appears that I have encountered a true defect in the download, I will open a separate issue.

I have closed the two issues that I opened because I have gotten beyond both of them. At this moment, I am working on a problem building

\open-watcom-v2-master\bld\cpplib\contain

All subdirectories (mc ml mh mm ms) exhibit exactly the same failure. Running wmake fails when trying to build:

Error(E42): Last command making (wcskip.obj) returned a bad status

I have set verbose=1. I then copy the exact command to the command prompt, which executes aok. I execute wmake again, which now works and chugs along until:

Error(E42): Last command making (xobjs\wcskip.obj) returned a bad status

I do the same workaround again, which executes aok. I execute wmake again, which now succeeds all of the way to the end.

This problem is not just for "contain". It also occurs for the next project in the build sequence. My next step will be to try to figure out how to see the environment information that is being passed into the compiler by wmake and try to identify how that environment differs from what is passed into the compiler when the same command line is executed by cmd.exe.
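One way to approach the comparison described above is sketched here in Python. It assumes that a `set`-style dump (one `NAME=value` per line) can be captured from each context; how to capture it from inside wmake is left open, and the sample values are hypothetical.

```python
def parse_env_dump(text):
    """Parse a `set`-style dump (NAME=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:
            # Windows variable names are case-insensitive, so normalise them.
            env[name.upper()] = value
    return env


def diff_env(a, b):
    """Return {name: (value_in_a, value_in_b)} for every variable that differs."""
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Hypothetical dumps; real ones would come from cmd.exe and from wmake.
    from_cmd = parse_env_dump("PATH=c:\\ow19\\binnt\nINCLUDE=c:\\ow19\\h\n")
    from_wmake = parse_env_dump("Path=c:\\ow19\\binnt\nINCLUDE=c:\\ow19\\h\\nt\n")
    for name, (cmd_value, wmake_value) in diff_env(from_cmd, from_wmake).items():
        print(f"{name}: cmd={cmd_value!r} wmake={wmake_value!r}")
```

Any variable that appears in the output is a candidate explanation for a command that works at the prompt but fails under wmake.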

I don't expect anyone to help me with this unless you think that I am encountering a real defect in the download.

jmalak commented 10 years ago

Please attach build.log here so that we can analyse the potential source of the problem.

ideafarm commented 10 years ago

Wilco. I just started a build after deleting the master, downloading a fresh copy, and reinstalling. This time, I avoided making any unnecessary changes to setvars.bat. DOSBOX is also now, for the first time, installed in a path that contains no spaces. Doc is suppressed. I expect to post the results within an hour or two.

ideafarm commented 10 years ago

CNR ("could not reproduce"). The build is building iostream, which is beyond the failure point. The failure involved running the 16 bit compiler, so perhaps was caused by the use of \system32\dosx.exe, which is included with Windows 7, rather than DOSBox.exe, which I installed.

BUG: setvars.bat, as it is currently coded, will not work on Windows 7, even if DOSBox.exe is installed into a path that contains no spaces and OWDOSBOX is set correctly. Since dosx.exe exists, a goto skips right over the set OWDOSBOX line. Conjecture: the build is working for me now because I removed that goto line.
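The selection logic reported above can be modelled roughly as follows. This is only an illustration of the described behaviour, not the actual content of setvars.bat; both function names are hypothetical, and the second function is just one possible alternative ordering.

```python
def choose_dos_emulator(dosx_exists, owdosbox):
    """Behaviour as described in the comment: dosx.exe wins whenever it exists."""
    if dosx_exists:
        return "dosx"
    return "dosbox" if owdosbox else None


def choose_dos_emulator_explicit_first(dosx_exists, owdosbox):
    """Hypothetical alternative: an explicitly configured DOSBox takes priority."""
    if owdosbox:
        return "dosbox"
    return "dosx" if dosx_exists else None


if __name__ == "__main__":
    # On 32-bit Windows 7, dosx.exe exists, so the first rule never picks DOSBox.
    print(choose_dos_emulator(True, r"C:\pgmfiles\DOSBox-0.74\DOSBox.exe"))
    print(choose_dos_emulator_explicit_first(True, r"C:\pgmfiles\DOSBox-0.74\DOSBox.exe"))
```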

ideafarm commented 10 years ago

When I installed DOSBox, I edited its configuration file to mount my system drive and my content drive. This caused the build to fail at:

\bld\browser\nt386

apparently because the configured mounts conflicted with the mounts specified in the makefile. Commenting out the configured mounts, so that DOSBox starts without any mounts, fixed the problem.

ideafarm commented 10 years ago

The build, without building documentation, appears to have ended successfully. It does not give any indication of success; it just ends. The last two lines displayed in the console are:

========================= 18:29:07 i:\ow\ow\bld\wgml ==========================
============================ 18:29:07 i:\ow\ow\bld ============================

I will now run build.bat again after turning documentation and debugging on.

ideafarm commented 10 years ago

Build failed. Failure was at same point as before. (See E42 wcskip.obj above.) I have just deleted the development directory ow\ow, recreated it by unzipping the download, recreated x-setvars.bat, and restarted the build. This time, debugging is OFF and documentation is OFF. (Those were the settings for the successful build.)

jmalak commented 10 years ago

Sorry, debug build is not supported. Don't use OWDEBUG_BUILD for standard build.

ideafarm commented 10 years ago

Ok. Good to know that the defect is known. That will save time. The build with debug OFF and doc OFF succeeded. I will now rename (to archive) the resulting development directory, recreate it by unzipping the download, and repeat, this time using debug OFF and doc ON.

jmalak commented 10 years ago

You wrote that you modify setvars.bat. That is not good practice. You should create your own copy of setvars.bat under a different name, modify what you need, and use this "setvars.bat" file to set up the build environment.

ideafarm commented 10 years ago

I TOTALLY appreciate that you are watching so closely. Both you and the quality of the download are AWESOME. My stress level has fallen to nearly zero as a result of having your help. I did notice that warning and have been heeding it. I make a copy called x-setvars.bat and modify that, not the original.

At some point, I would like to become acquainted with you and with anyone else who has a serious interest in OW. I would like to introduce you to the part that OW plays in my vision of the future of the software craft, and explore how that vision might relate to the vision that you and others have for OW.

ideafarm commented 10 years ago

[Screenshot: 20140116 1710 screen build stopped in clib_dos wgmlopts tmp]

ideafarm commented 10 years ago

[Screenshot: 20140116 1710 build log stopped in clib_dos wgmlopts tmp]

This is an image of the last lines of build.log. The build is stopped while using DOSBox to make a document. The CPU is idling at 10%. Nothing has happened for at least 2 hours. I will take a crack at figuring this out tomorrow, but comments to point the way are welcome. I will upload build.log if I can figure out how that's done in GitHub.

ideafarm commented 10 years ago

[Screenshot: 20140116 2044 screenshot docs building aok]

Ok. Docs appear to be building AOK (see above). I got it to work by restoring the line that I had removed to get the build to use DOSBox.exe (I installed) rather than dosx.exe (included with Windows 7). dosx.exe runs so much faster than DOSBox.exe that it ain't funny. I will leave this running overnight.

ideafarm commented 10 years ago

Am over 12 hours into building docs and still haven't finished the dos docs. (wipfc is being built.)

ideafarm commented 10 years ago

Am over 15 hours and no further progress is apparent. Does anyone have a build log for a full build or for a documentation build? It would be very helpful to have a log of a successful build for each development host (Windows 7, Linux, etc.) included with the download so that a new builder can know how long each step is supposed to take.

ideafarm commented 10 years ago

[Screenshot: 20140118 0756 screenshot doc build stalled]

The documentation build failed to show any progress overnight. I just killed it after taking the above screenshot. Since no one has responded to my request for a Windows 7 32-bit build log, I will proceed on the assumption that the build really doesn't work, for reasons that have nothing to do with my particular setup.

IMHO, the code and build environment should be "locked down" until they are fully and systematically regression tested. Changing code introduces bugs. That is a sure way to kill any open source project. If the current download won't even build without blood being spilled on Windows 7 32-bit, then this fork is not ready for prime time. The beautiful Watcom C/C++ and Fortran compilers do not deserve to die that way.

ideafarm commented 10 years ago

Silly me. I just discovered that there is a separate log for documentation. At the end, I found this:

===== dos i:\ow\ow\docs\dos-clib/clib_dos =====
C:\pgmfiles\DOSBox-0.74\DOSBox.exe -c "mount c i:\ow\ow\docs" -c "mount d ." -c "mount e i:\ow\ow\bld" -c "d:wgml.bat" -noconsole
Error(E14): Cannot execute (wmake): No error
Error(E42): Last command making (clib) returned a bad status
Error(E02): Make execution terminated

I am running builder docs again and it is beyond that point. Don't know why E14 occurred, but it's CNR.

I am totally new both to building OW and to using GitHub. If I am violating any etiquette or other rules or uploading too many posts or images, please do not hesitate to give me a clue.

jmalak commented 10 years ago

Below is a link to a full documentation build log from Windows 2003 Server (32-bit); it uses the Windows DOS emulator (DOSX.EXE). http://www.malakovi.cz/jiri/ow/download/doc_build.log

I suggest using DOSX.EXE to identify the problem. You must take into account that wgml.exe is available as a binary only, for DOS and OS/2. Sources for wgml.exe are not available. The only solution is to fix the DOSX.EXE environment on Windows or to use DOSBOX with a proper configuration (it is very slow due to CPU emulation). I didn't test DOSBOX on 32-bit Windows; I use it on 64-bit Windows and it works.

Generally, you can download the documentation from SourceForge (full snapshot of the OW build): http://sourceforge.net/projects/openwatcom/files/current-build/ow-snapshot.7z/download

Jiri

jmalak commented 10 years ago

You can test a single book build if any problems occur. It is best to start with the PS documentation and to set the OWVERBOSE environment variable to 1. Switch to the docs/ps sub-directory and run this command (example for the clib documentation):

wmake hbook=clib

If you use DOSBOX, the build creates a batch file in the working directory to run wgml; you can check whether it is correct.

ideafarm commented 10 years ago

Thanks for the log. It will tell me where I am as I work on each issue that arises. It will tell me how long each step should take (relatively). This exercise will familiarize me with the make files and the build environment.

I don't have an immediate need for the doc. But I do want to verify that I can do a full build of the entire product. It excites me to think that within a few days I will have built the tools that I have been using since 1996. OW isn't just exciting for personal reasons, however. I am an economist and am thoroughly disgusted with how the computer industry and the software craft have evolved over my lifetime. The Microsoft Monopoly has ruined everything. I have an idea about what can be done about that, and OW plays an important role in that plan.

ideafarm commented 10 years ago

Still encountering failures, so I am going to focus on systematically testing and debugging the build, without documentation, on 32-bit Windows. As of today, I have three test machines that will be running tests on OW build 24/7. One is Windows 7. The other two are Windows XP. The initial finding is that on both XP units, a compiler error (recursive include) causes the build to terminate. On the Windows 7 unit, the build stalls repeatedly, but if I kill the dosvdm.exe process each time, the build continues to apparently successful conclusion. The next step is to reinstall Watcom binaries on all three units to ensure that exactly the same binaries are being used to build.

jmalak commented 10 years ago

It looks like you have something wrong in your environment. Please send me a copy of your "setvars" script and the build log so that I can identify the root cause. I regularly run the build on Windows 2003 32-bit, Windows 7 64-bit, OpenSuse 12.3 Linux 64-bit, and Open BSD 64-bit without similar problems. Note: You need to use OpenWatcom 1.9 for building, not OpenWatcom 2.0.

ideafarm commented 10 years ago

OW 2.0 is installed on all machines. Thanks for the tip. I will install OW 1.9 and try. This is very interesting behavior, so I might want to debug to see if I can get OW 2.0 to work once I get OW 1.9 to work.

jmalak commented 10 years ago

You must always use the previous OW version for building the new one. OW 2.0 can have some bugs, especially the 64-bit version.

ideafarm commented 10 years ago

The earlier that bugs are discovered, the better. Everyone who is building OW should be using the latest and bloodiest build. Buggy code should never be used as the baseline for new development work. If there are bugs, you fix them, period. Now. Not later. You (and me). Not someone else. Each coder cleans up his or her own mess. Today's dinner dishes get washed today. Not tomorrow.

pchapin commented 10 years ago

In principle the compiler should be buildable with the previously released version to create a proper chain of buildability from the past into the future. If the compiler makes use of features that only it can compile that means the released binaries must necessarily have a dependency on a never released version of the compiler. That's undesirable.

On the other hand it's a good workout for the new compiler to ask it to build itself and thus good for testing. Certainly if faults are manifest during the build of itself they should be fixed. However, it does take time to fix faults and thus, for a while, the compiler may not be able to build itself cleanly.

One issue for Open Watcom is that there hasn't been a new release in a long time. Requiring the current compiler to be buildable with the previous (now becoming ancient) release is asking more and more.

ideafarm commented 10 years ago

Mr. Chapin (Peter?), thanks for engaging me. Respectfully, your "con" argument contains a logical flaw. Consider a sequence of revisions {1,2,3,...,R-1,R,R+1,...} where Rev. 1 is the last publicly released binaries. We agree that Rev. R cannot be built using Rev. R. The current practice apparently is to continue to use Rev. 1 indefinitely to build R, R+1, R+2, ... . I advocate that everyone immediately adopt the new policy of always building R with R-1. To reduce the burden of continually installing the latest weekly binary release, people should update their build environment randomly, say every 4 weeks. That way, each revision R gets tested on R-1, R-2, R-3, and R-4. The important point is that at least one person should be test building each and every revision R with its immediate predecessor R-1.
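The coverage claim in this proposal can be checked with a small model. The sketch below assumes weekly revisions numbered by week, and builders who refresh their installed compiler every 4 weeks at staggered offsets, each time installing the binary from the week before the refresh. These modelling choices and the function names are assumptions made for illustration.

```python
def installed_compiler(week, offset, period=4):
    """Revision of the compiler a builder has installed when building `week`.

    The builder refreshes whenever week % period == offset and installs the
    newest binary available at that moment, i.e. the previous week's revision.
    """
    weeks_since_refresh = (week - offset) % period
    return week - weeks_since_refresh - 1


def compilers_used(week, offsets, period=4):
    """Sorted set of compiler revisions used by all builders to build `week`."""
    return sorted({installed_compiler(week, o, period) for o in offsets})


if __name__ == "__main__":
    # Four builders with refresh offsets 0..3 building revision 20.
    print(compilers_used(20, offsets=[0, 1, 2, 3]))  # [16, 17, 18, 19]
```

Under these assumptions, revision R is built with R-1, R-2, R-3, and R-4, and exactly one builder (the one who refreshed that week) exercises the immediate predecessor.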

There are, AFAIK, no reasons to not be doing this, to the extent that it fits with each person's style and personality.

jmalak commented 10 years ago

Take into account that OW 2.0 is in beta phase. There are lots of code changes due to porting to 64-bit. OW is not yet fully ported to 64-bit (for example, the C++ compiler on 64-bit Linux is buggy). In this phase I am using only standard automated tests on the platforms available to me; they don't test everything, only the main components and features on my platforms. Any bug report is welcome. Anyway, OW 2.0 should be buildable by OW 2.0 as long as the OW 2.0 tools and run-time libraries are compatible with OW 1.9. If there is a problem, it must be resolved.

ideafarm commented 10 years ago

[Screenshot: 20140122 0656 screenshot build stopped browser nt386]

Stylized facts: OW 1.9 is installed. The build of browser/nt386 stops immediately AFTER executing "bwrc -q -p wbrwpm.res wbrw.exe". bwrc is completing aok but whoever launches the bwrc process stops doing anything. Workaround is to kill dosvdm, which causes the next build (os2386) to stop in the same place. Killing dosvdm again causes the build to proceed. The build eventually proceeds to completion. The behavior is the same when OW 2.0 is installed. This behavior is observed during every trial; it happens every time. The platform is Windows 7 32-bit. Documentation is suppressed. When the same x-setvars.bat is run on the same virgin master on Windows XP, the build aborts rather than hangs during the build of browser/nt386. On that platform, the compiler complains about an infinitely recursive #include. The error is not spurious; the generated file really does contain an include of itself.

ideafarm commented 10 years ago

[Screenshot: 20140122 0756 screenshot build failed idedemo threed nt]

Disregard "eventually proceeds to completion". Also, "dosvdm" should be corrected to "ntvdm". Repeatedly killing ntvdm moves things along, but eventually the build fails due to file project.mk not found.

jmalak commented 10 years ago

The behavior you describe is strange, because NTVDM is used only by the wgml utility, not by any other tools. The others must be OS-native builds produced by the bootstrap phase. It looks like bwrc is buggy in some way in your build. Please check the dates and times of the files in the build/bin directory to see whether it is really a new build or something old. Also check the format of the executables: they should be PE, not DOS EXE with a stub for DOS4GW. Generally, you should run the clean script before running the build script, to delete all generated objects and build tools.
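The executable-format check suggested here can be done by hand with a few lines of Python. The sketch below relies on the standard layout of the MZ header, where the 4-byte little-endian value at offset 0x3C points to the newer header, and a PE file carries the signature `PE\0\0` there. The function name is hypothetical, and anything that is MZ but not PE is simply reported as such rather than classified further.

```python
import struct
import sys


def exe_format(path):
    """Classify a file as 'PE', 'MZ without PE header', or 'not an MZ executable'."""
    with open(path, "rb") as f:
        header = f.read(64)
        if len(header) < 64 or header[:2] != b"MZ":
            return "not an MZ executable"
        # Offset 0x3C of the MZ header holds the file offset of the new header.
        (new_header_offset,) = struct.unpack_from("<I", header, 0x3C)
        f.seek(new_header_offset)
        signature = f.read(4)
    if signature == b"PE\0\0":
        return "PE"
    return "MZ without PE header"


if __name__ == "__main__":
    # Usage: pass the executables to inspect as command line arguments.
    for name in sys.argv[1:]:
        print(name, "->", exe_format(name))
```

Run against the tools in build/bin, any result other than "PE" on a Windows host would point to a stale or wrongly built tool.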

jmalak commented 10 years ago

Please check the PATH environment variable and the variables used by OW (they start with OW..) after running your "setvars" script. Below is a copy of the variables from my 32-bit Windows build system:

OWBINDIR=c:\dev\open-watcom-v2\build\bin
OWDEBUGBUILD=0
OWDEFAULT_WINDOWING=0
OWDEFINCLUDE=c:\dev\ow19\h;c:\dev\ow19\h\nt
OWDEFPATH=c:\dev\ow19\binnt;c:\dev\ow19\binw;.......
OWDEFWATCOM=c:\dev\ow19
OWDOCBUILD=0
OWDOCQUIET=1
OWGHOSTSCRIPTPATH=C:\Program Files\gs\gs9.07\bin;C:\Program Files\gs\gs9.07\lib
OWHHC=c:\dev\winhc\hhc.exe
OWOBJDIR=binbuild
OWROOT=c:\dev\open-watcom-v2
OWSRCDIR=c:\dev\open-watcom-v2\bld
OWUSENATIVETOOLS=0
OWVERBOSE=1
OWWIN95HC=c:\dev\winhc\hcrtf.exe
Path=c:\dev\open-watcom-v2\build\bin;c:\dev\open-watcom-v2\build;c:\dev\ow19\binnt;c:\dev\ow19\binw;.....
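A quick way to produce a comparable listing is sketched below. It assumes Python is available on the build machine and is run from the same command prompt in which the "setvars" script was executed. The reference names are taken from the listing above; the function names are hypothetical.

```python
import os

# Variable names copied from the reference listing in this comment.
REFERENCE_NAMES = {
    "OWBINDIR", "OWDEBUGBUILD", "OWDEFAULT_WINDOWING", "OWDEFINCLUDE",
    "OWDEFPATH", "OWDEFWATCOM", "OWDOCBUILD", "OWDOCQUIET",
    "OWGHOSTSCRIPTPATH", "OWHHC", "OWOBJDIR", "OWROOT", "OWSRCDIR",
    "OWUSENATIVETOOLS", "OWVERBOSE", "OWWIN95HC",
}


def ow_variables(environ=None):
    """Return the variables whose names start with OW (case-insensitive)."""
    env = os.environ if environ is None else environ
    return {k: v for k, v in sorted(env.items()) if k.upper().startswith("OW")}


def missing_names(found, reference=REFERENCE_NAMES):
    """Reference names that are absent from the found variables."""
    return sorted(reference - {name.upper() for name in found})


if __name__ == "__main__":
    found = ow_variables()
    for name, value in found.items():
        print(f"{name}={value}")
    print("PATH=" + os.environ.get("PATH", ""))
    print("Not set here:", ", ".join(missing_names(found)) or "(none)")
```

A name reported as not set is not necessarily an error, since some of the reference variables may be optional, but it is a difference worth looking at.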

ideafarm commented 10 years ago

The problem appears to be whoever launches bwrc. While still hung, I manually open a command prompt, change to the directory where the problem is occurring, and manually enter the same bwrc command, and it works aok, changing the timestamp of the target exe file as expected. Who launches bwrc? What does the launcher do immediately after bwrc exits?

jmalak commented 10 years ago

As we discussed by e-mail, the problem is not with bwrc but with the wgml command (a DOS application), which fails on your installation. Because you kill ntvdm along with the crashing wgml program, the build stops at the next call of wgml, which is the first command for the next browser build target, and the last successful command you see in the log is the call of bwrc.

Anyway, you should read the readme.txt file in the root of the OW source tree about the OW build system, to better understand how the build process works. The main tool, builder, traverses the build tree and uses various rules to decide what should be built and how (command line parameters for the builder command, builder.ctl per project, makefiles per target, etc.).

ideafarm commented 10 years ago

I am definitely beyond the DOS / BWRC issue. Build fails, and this looks like a true bug rather than a consequence of my build environment:

=================== 21:24:52 i:\ow\ow\bld\idedemo\threed\nt ===================
bide2mak -r nt_3d.tgt -c i:\ow\ow\bld/ide/cfg/nt386/ide.cfg
ide2make: Error in 'idex.cfg', line 28, at 'HtmlHelp'
sed -f "../../convtool.sed" -f "../../convnt.sed" nt_3d.mk1 >temp.mk1
sed: can't open nt_3d.mk1
cp temp.mk1 nt_3d.mk1
reading file temp.mk1 (0 bytes)writing file nt_3d.mk1         0 bytes, 1 files written in 0.00 seconds (dump 1)
wmake -i -h -f project.mk
Error(E32): Opening file (project.mk): No such file or directory
Error(E02): Make execution terminated
Error(E42): Last command making (nt_3d.dll) returned a bad status
Error(E02): Make execution terminated
<pmake -d build         -h> => non-zero return: 2
Build failed

i:\ow\ow>

The complaint is that file "project.mk" cannot be found. No file of that name exists anywhere in ow\ow. I also searched the virgin unpack of master; no such file there either. File "master.mif" in idedemo uses "project.mk" conditionally if variable targ_file is not defined. Within idedemo, targ_file is defined by all of the files "makefile" for src, but is NOT defined in ANY of the files "makefile" for threed. I propose that the bug here is that targ_file should be, but is not, defined in the files "makefile" for threed, and that this bug has escaped your notice because it occurs only conditionally.

pchapin commented 10 years ago

The file 'project.mk' is generated during the build process. The problem appears to be that the generation of it fails. It looks like ide2make is having a problem with idex.cfg in bld\ide\cfg\nt386. Looking at that file on line 28 (from the error message) I see

HelpFile ide.hlp

I'm guessing this is with the suppression of the IDE help. If so, then ide.hlp didn't get built and now, as a consequence, the idedemo project also won't build. Maybe you also need to suppress idedemo. Is it really true that DOSBox doesn't work on your system?

ideafarm commented 10 years ago

As Jiri suggested, DOSBox works fine on 32-bit Windows 7, and is needed since DOSX exhibits or triggers a hang. The only problem with DOSBox is that it is too slow, so builds that enable documentation are impractical.

No help files .HLP exist other than the 10 .HLP files which exist in the master zip.

I have modified bld\builder.ctl to build idedemo if and only if OWDOCBUILD is 1. Thanks to your discussions, which I have followed with interest, and the stunning quality of the OW master, I am quickly becoming familiar with OW and expect to soon begin regression testing and debugging. My initial focus will be to ensure that the next Windows 7 prospective contributor will shed zero blood with an idiot proof master that will give him a clean build on the first try. Then I will focus on WDW, which is maddeningly unstable and dysfunctional, even though it is much improved over 1.9.

For the foreseeable future, my focus will be on debugging, since IdeaFarm (tm) Operations urgently needs OW to become rock solid and completely and correctly functional on Windows 7. This will complement Jiri's enthusiasm for enhancing OW, the prospect of which excites me greatly. My own urges to developing new code will continue to be directed toward IPDOS (tm).

jmalak commented 10 years ago

It is a bug in the idex.cfg file, which is generated during the build process. It is strange that it is properly diagnosed on your Windows 7 32-bit installation but not on my Windows 2003. I must investigate why this happens and correct it. As a first step I will fix the mistake in idex.cfg, but there is another bug which masks the previous bug.

Jiri


ideafarm commented 10 years ago

Mission accomplished: Build completed AOK on 32-bit Windows 7.

I have edited and "prettied up" setvars.bat and have just launched a test of it by deleting the ow build directory and unzipping master. When the test build completes, I will send the proposed new setvars.bat as an email attachment to Jiri and Peter for informal review. If approved, I will use it as a learning exercise as my first experience contributing to an open source project. At this moment, I don't really know what a "pull request" is and will be learning how to do all of that using the GitHub system.