fegennari / 3DWorld

3D Procedural Game Engine Using OpenGL
GNU General Public License v3.0
1.17k stars 92 forks source link

Compilation, Makefile and README.linux #2

Closed Lecrapouille closed 5 years ago

Lecrapouille commented 5 years ago

Edit: I rephrase because I was in urge and I had not time to read myself the 1st time.

Compilation problem

SHA1 2f36272844a0d135308846cc0448c24a52ad02de The compilation is broken with gli. I tried the currently present folder then the git cloned version but got the same problem:

In file included from dependencies/gli/gli/format.hpp:6:0,
                 from dependencies/gli/gli/gli.hpp:39,
                 from src/image_io.cpp:25:
dependencies/gli/gli/type.hpp:42:24: error: ‘qualifier’ has not been declared
  template <typename T, qualifier P>

I have a Debian 9.8, amd 64, and g++-6.0. I dunno yet how to fix this. Note that glib compiles well by itself.

Some confusing points inside the README.linux:
Some confusing points inside the Makefile

all: $(TARGET)

%.o : %.C $(BUILD)/%.d $(CPP) $(CPPFLAGS) -MMD -c $< -o $(abspath $(BUILD)/$@) %.o : %.cc $(BUILD)/%.d $(CPP) $(CPPFLAGS) -MMD -c $< -o $(abspath $(BUILD)/$@) %.o : %.cpp $(BUILD)/%.d $(CPP) $(CPPFLAGS) -MMD -c $< -o $(abspath $(BUILD)/$@)

$(OBJS): | $(BUILD) $(BUILD): @mkdir -p $(BUILD)


Explainations:
Firstly, feel free to rename BUILD variable, it indicates the folder holding your *.o files. You have to refer this folder to VPATH. The `all:` rule now simply depends on your binary file. For compiling your c(pp) files you have to add the `-o`option else *.o are created next to your Makefile instead of inside `$(BUILD)` folder. Note that you also have to depend of `%.d` files, so added them in the dependency rule. I added `abspath` but this is optional but this helps coverage tool like `gcov` to find the source. Finaly `$(OBJS): | $(BUILD)`replace your if then else. `$(BUILD)` rule is called before  building `$(OBJS)`. Also note that you no longer need of `../../` for TARGA, GLUI ... 

* Unfortunaltly there are still problems with `*.d` files: they are created at the root of the directory and makefile will recompiling all c(pp) files everytime. So replace your code based on `touch/cat` commands by this code which does the same thing:

$(BUILD)/%.d: ; .PRECIOUS: $(BUILD)/%.d -include $(patsubst %,$(BUILD)/%.d,$(basename $(OBJS)))


With all these changes you simply can go to the root of the project and type make! No more ln or mv :)

* Make clean can now simply remove the folder $(BUILD).
* I attached my Makefile to this issue. [makefile.txt](https://github.com/fegennari/3DWorld/files/3244687/makefile.txt) Note: because of github I had to add the `.txt` extension else it refuses to attach it.

As said previously I could not finish compiling because of gli so I cannot  check if my Makefile is fully correct. For more information you can see my Makefile [repo](https://github.com/Lecrapouille/MyMakefile).

##### chmod
* There are plenty of ASCII files in executable mode. You should fix their rights with a `chmod 644` git manages the chmod rights.
fegennari commented 5 years ago

Thanks for the feedback. I admit that have more experience with C++ programming and algorithms than makefiles. This makefile was copied from another OpenGL project of min and modified. That makefile was copied from something I did at work, which originally used the makefile from https://github.com/moskewcz/boda. Also, I do most of my 3DWorld development in Windows with Visual Studio.

I just now realized that GLUI isn't actually needed as a dependency, and I removed it. I also removed all of those unused TARGET2/OBJS2/LIB_TARGET/LIB_OBJS variables and targets because they're unused here. That simplifies the makefile quite a bit.

For your GLI error, I've never seen that one. I'm using gcc-7.4 on Ubuntu 18.04. I did try your makefile. I don't get that GLI error, but I do get another error. I'll have to figure that out later.

Maybe the problem is that you're trying to use the GLI version from the dependencies directory. That version was configured for Windows/Visual Studio, and I included the libraries as well. Someone was having trouble with the VS build so I just added all of my files rather than trying to step him through the build and setup process of all the dependencies in VS. For my linux build, I put the dependencies back one level, in the same directory as the 3DWorld/ directory. I've updated the linux build instructions to make this more clear, and also simplified it a bit.

For your specific suggestions:

Lecrapouille commented 5 years ago

Thanks for your answer.

Lecrapouille commented 5 years ago

In my first Makefile I simply forget to go inside $BUILD folder before before linking. Fixed!

I also added -Isrc/texture_tile_blend for knowing 3DWorld/src/texture_tile_blend/tlingandblending.h and VPATH for knowing 3DWorld/src/texture_tile_blend/tlingandblending.cpp. Now you can remove your symbolic file 3DWorld/src/texture_tile_blend.cpp. I think now you can remove ..inside their #include which is a bad practice!

Note: You can play with -I and VPATH for selecting and compiling files from different architecture.

Here is the Makefile (with -UENABLE_DDS and all my fixes) makefile2.txt.

fegennari commented 5 years ago

Your changes fix the texture_tile_blend error I was getting. Now your makefile works for me. I did some cleanup (removed GLUI and the unused variables/targets). I have no errors using GLUI for DDS image/texture reading, so I enabled it. I'm not sure why you get that error. Maybe your version of gcc is too old and doesn't support something? Is that "qualifier" syntax part of C++17? There aren't too many test scenes that use DDS textures so you should be able to keep that disabled in your build.

I need to do further testing a cleanup, then I'll commit these changes.

fegennari commented 5 years ago

Also I can't remove the ".." in front of those includes in texture_tile_blend.cpp because I still need that for Visual Studio. I'm not sure what the equivalent of VPATH is there. I'll have to experiment and try to make that work in VS and then I can remove it when it works in both flows.

fegennari commented 5 years ago

Okay, I updated the makefile and linux README. I'll get back to the symlink and include paths later. Is it working for you now? I have no idea where that "jack server" error message is coming from. I don't think it's generated by 3DWorld.

Lecrapouille commented 5 years ago

Maybe due the previous error?

..Error opening model3d file for read: ../models/dragon.model3d
Error reading model3d file ../models/dragon.model3d
fegennari commented 5 years ago

I have no idea what the Jack Server error is. This isn't something I use in 3DWorld, so it must be pulled in as some sort of dependency. Maybe for OpenAL? Sorry, I can't help with that. That commit/SHA1 has nothing to do with this error.

For that shrink_to_fit() change, I guess it needs to be vector::shrink_to_fit(); I don't get that error so it must be some recent change to the C++ template spec. I'll fix it after work today.

The "invalid operation" error may be that your graphics driver doesn't support some feature I'm using. 3DWorld requires OpenGL 4.5 support. What GPU/driver version/OpenGL version do you have? 3DWorld should print the renderer and version info at the very top of the terminal.

Location ID 0 is very low-level, likely one of the first GL calls. The dragon model import error should be nonfatal. If it can't load the model, it just won't draw it. I don't have that model on one of my test machines and I don't get that error. You can try one of the other configs from defaults.txt. Something like config_river is a very simple scene with no 3D models loaded.

Lecrapouille commented 5 years ago

Yep my graphics card halts to OpenGL 4.3 ... I believed up to 4.5 :(

fegennari commented 5 years ago

What card do you have? If I can figure out what functions are missing, maybe I can have 3DWorld automatically check for and disable them. I had to have it auto detect and disable tessellation shaders on one machine.

Lecrapouille commented 5 years ago

From memory, it is a Nvidia GT720. Removing the assert make run the game but I just see blue background color and a "flat moving mixture of pixels" which is supposed to be the scene :( The camera is "moving" the mixture. Also have this warning (maybe endianess issue since I'm on AMD64):

loading textureslibpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile

log.txt Maybe shaders and 3d models are not loaded because of bad relative paths (../) since my changes in Makefile? I'll dive inside the code tomorrow.

fegennari commented 5 years ago

A GTX720 is OpenGL 4.4, which is pretty close. I wonder what's failing? Have you tried the config_river scene?

I've seen that libpng warning at work before. It's a problem with PNG decompression and loss of precision or wrong color space. It might make the colors a bit off, but won't give you a random blob of pixels. It's also not related to endian-ness. All of the machines where 3DWorld can run are likely little endian (x86, x86_46, AMD64, etc.)

Moving the binary from lib/ to obj/ won't break the paths. 3DWorld will search in ".", "..", and a few other locations relative to the binary to find its files. It won't even start if it can't find the base assets. Also, I'm using your updated makefile and it works for me. It's probably something low-level related to the shaders, uniforms transform matrix, etc. What driver are you using? Are you using an Nvidia driver, nouveau, MESA, etc?

fegennari commented 5 years ago

Okay. I attempted to fix that template compile error for shrink_to_fit(). Can you check to see if it works now?

I made most of your suggested makefile changes, except for the the "pkg-config" change, which I don't know how to do. However, the makefile seems to be broken (both before and after this new set of changes). If I change a header file, it won't recompile anything. It says "up to date". Is the path wrong in the .d files? Maybe it should be "../src/" instead of "src/"? I'm not really sure what the problem is. This definitely needs to be fixed.

Now for the "invalid operation" error you're getting. The display() function has a check_gl_error(0) at the top and check_gl_error(11) at the end. Since you're getting the error at "location ID 0", that's probably on the first call to display(). I've never seen that before. There aren't many GL calls made in main() before dropping into glutMainLoop. There's just the basic GL context creation, viewport setup, glClear(), etc. If one of those low-level calls is failing, that's going to be bad. So it's either failing inside GLUT or GLEW init calls, failing setting up the viewport, or maybe failing in a mouse or keyboard callback. I added some check_gl_error(777*) calls in the setup code in main(). See if you get one of these to fail. If so, you can try adding more check_gl_error() calls with unique IDs to try and isolate it down to the failing call. I wish there was a better way to get error codes from GL functions!

One more thing. I get this error when I quit 3DWorld: The futex facility returned an unexpected error code.Aborted (core dumped) I've never been able to figure out what went wrong on shutdown. Do you see this error?

Juris3D commented 5 years ago

Hello all :) I am glad that question about build and usability has been actualized, and by someone with actual knowledge of matter (not like me who could not state questions properly :) ). I am on Windows, i had my "system" how to get this working, but it was weird-ish (we had email exchange before). I see in news there are infinite cities in works? I have not built and tested World for about month or more. Is it good time to try now? :) Thanks for working on this interesting World! Juris.

fegennari commented 5 years ago

Hi Juris! I've added a lot of features in the past month. Feel free to try it out.

This is the first feedback I've ever gotten on the linux build system. I'm glad someone has decided to try it. Linux support was only added a few months ago so it's still in progress.

Lecrapouille commented 5 years ago

I was curious to test this project because when I was student I made a similar project in Delphi.

No biggy for the Makefile, I forget to add these two lines (%.d now contains full path):

DEPFLAGS = -MT $@ -MMD -MP -MF $(BUILD)/$*.Td
POSTCOMPILE = mv -f $(BUILD)/$*.Td $(BUILD)/$*.d

And compile with:

$(Q)$(CXX) $(DEPFLAGS) $(CXXFLAGS) $(INCLUDES) $(DEFINES) -c $(abspath $<) -o $(abspath $(BUILD)/$@)
@$(POSTCOMPILE)

Here the new Makefile makefile3.1.txt

I also remove CPP and suffix (replaced by implicit CXX). I also fix rules to call %.o: %.cpp because of your file extensions not matching your suffix. I completed the pkg-config list. I also replace ':' by ' ' for VPATH because ' ' is /was initialy understood by Window and Linux but ':' is only for Linux and ';' for Windows :(

I also add more compilation flags. You should all of these warnings (mainly non init values). This may fix your futex error? Also to catch them you can call:

valgrind 3dworld

You'll see it complains about several memory corruptions:

==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D6DB: texture_t::calc_color() (Textures.cpp:523)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D772: texture_t::calc_color() (Textures.cpp:523)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D809: texture_t::calc_color() (Textures.cpp:523)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D37D: texture_t::calc_color() (Textures.cpp:511)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D388: texture_t::calc_color() (Textures.cpp:511)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x137B37: float const& std::min<float>(float const&, float const&) (stl_algobase.h:200)
==7060==    by 0x53D4D5: texture_t::calc_color() (Textures.cpp:514)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x137B64: float const& std::max<float>(float const&, float const&) (stl_algobase.h:224)
==7060==    by 0x53D4F0: texture_t::calc_color() (Textures.cpp:514)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D8A4: texture_t::calc_color() (Textures.cpp:526)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 
==7060== Conditional jump or move depends on uninitialised value(s)
==7060==    at 0x53D8B1: texture_t::calc_color() (Textures.cpp:526)
==7060==    by 0x53CE81: texture_t::init() (Textures.cpp:453)
==7060==    by 0x53C4D1: load_textures() (Textures.cpp:299)
==7060==    by 0x1365C7: main (3DWorld.cpp:2203)
==7060== 

and bigger monsters are waiting for you :) with add_all_coll_objects.

Next for tomorrow I got a awfull day today :(

fegennari commented 5 years ago

I took a look at your Ecstasy project. It's definitely similar to 3DWorld's cities in some ways, but there are many differences. It's partially in French and I've never used Delphi, so I'm not sure how much I'm going to understand. I've never seen a project mix OpenGL and DirectX before.

I'll see what I can do with this when I get home. I don't see anything wrong with texture_t::calc_color(). I'm guessing valgrind thinks the texture data is uninitialized. That's supposed to be written to by the various image reading libraries I'm using in a big OpenMP parallel image loading loop earlier in the control flow. Maybe it's an OpenMP related false positive. Maybe one of those image libraries leaves some pixels uninitialized. I may never figure that out:(

The add_all_coll_objects() function is similar. This one loads 3D models of various formats, and also uses OpenMP.

fegennari commented 5 years ago

I made most of your makefile changes, and it seems to work. I also fixed a lot of the warnings. I don't think any of them were serious errors. I get too many warnings from GLI and Targa and I probably can't leave -Wextra in there.

I installed and ran valgrind. I think most of those uninitialized values errors are from variables written in one OpenMP thread and read in the master thread. I haven't found one that looks like a real error yet - though I'm not really sure how to debug them.

I see that futex error that crashes during shutdown: ==10225== Syscall param futex(futex) points to uninitialised byte(s) ==10225== at 0x4E4ACFA: futex_wait (futex-internal.h:61) ==10225== by 0x4E4ACFA: futex_wait_simple (futex-internal.h:135) ==10225== by 0x4E4ACFA: pthread_barrier_destroy (pthread_barrier_destroy.c:51) ==10225== by 0xC377FA6: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0xBC70EB9: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0xC418847: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0xBDBA92A: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0xBDD9122: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0xBC134BF: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0xBBD6932: ??? (in /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so) ==10225== by 0x9F7908E: ??? (in /usr/lib/x86_64-linux-gnu/libGLX_mesa.so.0.0.0) ==10225== by 0x9F8D6D7: ??? (in /usr/lib/x86_64-linux-gnu/libGLX_mesa.so.0.0.0) ==10225== by 0x9F8D83D: ??? (in /usr/lib/x86_64-linux-gnu/libGLX_mesa.so.0.0.0) ==10225== by 0x86B9E41: XCloseDisplay (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0) ==10225== Address 0x1ffefff790 is on thread 1's stack

I'm not sure what I can do about this. Seems like a driver bug?

Lecrapouille commented 5 years ago

Ok start diving inside the code:

You can try to update it. dpkg -L freeglut3 or dpkg -L freeglut3-dev gives me libglut.so.3.9.0. I guess it comes from it because of : nm /usr/lib/x86_64-linux-gnu/libglut.a | grep XCloseDisplay returns XCloseDisplay but the symbol is not present in the .so file and nm /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0 | grep XCloseDisplay gives no result and linking your software like -lxxx tries to load the .so then the .a file if .so failed or is not present).

Lecrapouille commented 5 years ago

I'm fixing some compilation warnings. I got this error:

==19918== Process terminating with default action of signal 2 (SIGINT)
==19918==    at 0x4E45167: pthread_cond_wait@@GLIBC_2.3.2 (pthread_cond_wait.S:143)
==19918==    by 0x110A1277: pa_threaded_mainloop_wait (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.20.1)
==19918==    by 0x6892F98: ??? (in /usr/lib/x86_64-linux-gnu/libopenal.so.1.17.2)
==19918==    by 0x6898006: ??? (in /usr/lib/x86_64-linux-gnu/libopenal.so.1.17.2)
==19918==    by 0x4E3F4A3: start_thread (pthread_create.c:456)
==19918==    by 0x78E7D0E: clone (clone.S:97)
fegennari commented 5 years ago

I'm glad you were able to fix the 'qualifier' error. I'll update the makefile tonight.

For the futex error, I don't call XCloseDisplay(). I don't shut anything down. I expect glut to do the cleanup after it exits glutMainLoop. I'll look into this later and maybe try to strace the process.

For that signal 2 from OpenAL, I have no idea. I don't create any threads in the OpenAL code. That's all internal to OpenAL.

Lecrapouille commented 5 years ago

I almost caught the OpenGL error:

Allocating 46 by 128 by 128 smoke texture of 3014656 bytes.
check_gl_error 201
shadow_map:setup_texture {
setup_texture {
check_gl_error 600
}setup_texture
} shadow_map:setup_texture
check_gl_error 630
check_gl_error 550
check_gl_error 551
check_gl_error 202
check_gl_error 299
check_gl_error 300
setup_texture {
check_gl_error 600
GL Error 1282 at location id 600: invalid operation.
3dworld: /home/qq/3DWorld/src/3DWorld.cpp:197: bool check_gl_error(unsigned int): Assertion `!had_error' failed.

Here some patch fix1.patch.zip fixing some compilation warnings: some destructor shall be virtual. Literal like 1.0 are double you shall write 1.0f instead. There are many shadow param, strict-overflow errors to fix. Here some compilation flags I use:

-Wall -Wextra       \
-Wstrict-aliasing=2 -Wunreachable-code -Wcast-align -Wcast-qual     \
-Wsign-compare -Wsign-conversion -Wsign-promo -Wconversion      \
-Wdisabled-optimization -Winit-self -Wlogical-op            \
-Wmissing-include-dirs -Wnoexcept -Wold-style-cast          \
-Woverloaded-virtual -Wredundant-decls -Wshadow             \
-Wstrict-null-sentinel -Wstrict-overflow=5 -Wswitch-default -Wundef \
-Wno-unused -Wno-variadic-macros -Wno-parentheses           \
-fdiagnostics-show-option -Winline -fasynchronous-unwind-tables     \
-fexceptions -Werror=implicit-function-declaration -pedantic        \
-pedantic-errors -Wformat=2 -Wformat-nonliteral -Wformat-security   \
-Wformat-y2k -Wimport -Winvalid-pch -Wmissing-field-initializers    \
-Wmissing-format-attribute -Wmissing-noreturn -Wpacked          \
-Wpointer-arith -Wstack-protector -fstack-protector-strong      \
-D_FORTIFY_SOURCE=2 -Wswitch-enum -Wunused -Wunused-parameter       \
-Wvariadic-macros -Wwrite-strings -Werror=return-type           \
-D_GLIBCXX_ASSERTIONS -fexceptions -fasynchronous-unwind-tables     \
-Wctor-dtor-privacy -Wnon-virtual-dtor -Wfloat-equal

For fixing errors detected by valgrind, you should use https://scan.coverity.com/

Lecrapouille commented 5 years ago

Here the screenshot of the scene: In addition the generated bmp is "shifted" when opened by GIMP :( screenshot0

fegennari commented 5 years ago

I'm not too interested in fixing all of those lesser warnings. I'll fix the virtual destructors, but I don't care about 1.0 vs. 1.0f, shadowed variables, and overflow checks. It's going to be too hard to avoid doing things like that in Windows, and I don't want to have to fix it every time when I move back to linux. I think it would be good for me to at least look through it. I probably won't add anything other than -Wall to the final makefile. Maybe I can add -Wextra if I can figure out how to suppress all the warnings generated by GLI and Targa. At some point in the future I might switch to something newer such as DEVIL.

What scene is that screenshot from? It looks like the blue sky background color with a bit of random shadows or alpha blended black geometry. Maybe something is wrong with the transform stack? Or near plane/far plane/depth testing? You can try using config_universe.txt. Universe mode uses a different rendering flow and might work better/differently/in-a-way-that's-easier-to-debug.

fegennari commented 5 years ago

Here are some updates:

Thanks for doing all this work. I'll have to get back to it tomorrow.

Lecrapouille commented 5 years ago
Lecrapouille commented 5 years ago

For the second valgrind bug. It is caused by the 21th texure (I start counting from 0). Other (0 .. 125) seems ok!

    for (unsigned i = 0; i < textures.size(); ++i) {
        if (is_tex_disabled(i)) continue; // skip
        if (i == BLDG_WINDOW_TEX) continue; // not yet generated
cout << "Text " << i << std::endl; 
        textures[i].init();
    }

gives:

Text 21 ==11141== Conditional jump or move depends on uninitialised value(s) ==11141== at 0x5504CD: texture_t::calc_color() (Textures.cpp:530) ==11141== by 0x54FC73: texture_t::init() (Textures.cpp:460) ==11141== by 0x54F296: load_textures() (Textures.cpp:305) ==11141== by 0x1381D8: main (3DWorld.cpp:2206) ==11141== ==11141== Conditional jump or move depends on uninitialised value(s) ==11141== at 0x550564: texture_t::calc_color() (Textures.cpp:530) ==11141== by 0x54FC73: texture_t::init() (Textures.cpp:460) ==11141== by 0x54F296: load_textures() (Textures.cpp:305) ==11141== by 0x1381D8: main (3DWorld.cpp:2206) ==11141== ==11141== Conditional jump or move depends on uninitialised value(s) ==11141== at 0x5505FB: texture_t::calc_color() (Textures.cpp:530) ==11141== by 0x54FC73: texture_t::init() (Textures.cpp:460) ==11141== by 0x54F296: load_textures() (Textures.cpp:305) ==11141== by 0x1381D8: main (3DWorld.cpp:2206) ==11141== Text 22


and iniside the `else` of the `if (ncolors == 4)` of `void texture_t::calc_color()`
fegennari commented 5 years ago
Lecrapouille commented 5 years ago
fegennari commented 5 years ago

I forgot to comment on your stack trace. That's the stack trace for the failing glGetError() call. The actual failing GL command was before that point - between the previous glGetError() call and the failing one. That bad GL call set the error flag, and it's not until we try to read the flag that we know there's an error. That's what makes this so hard to debug. The stack trace is useless. My guess is that the bad call is in the shader setup code before drawing the shadow map. I think this is just before the first actual draw call. I don't know how to track it down without sprinkling more check_gl_error() calls around in the code.

fegennari commented 5 years ago

Okay, that points[1] is definitely garbage for the torus case. But it doesn't really affect anything observable by the user. This makes the collision matrix rasterization treat it as a rotated cylinder, which just means it uses a slower algorithm that's probably negligible runtime difference for a few tori.

The other uninitialized warnings are from drawing where the number of subdivisions of a torus uses the same equation as a cylinder and capsule: a function of min distance from the two "end points" to the player. The garbage points[1] is far outside the scene and is never the closer point, so it ends up returning the distance to points[0], which is the center of the torus, which is what we want in that case anyway. So again the uninitialized values have no observable effect, which is why it went unfixed for so long.

That warning from texture 21 (GEN_TEX) is real, but the value is unused. That texture is just an unused slot. It used to be used for something, but I stopped using it and didn't want to update the texture enums to remove it. I don't actually get a warning for that one. Maybe I have a newer version of valgrind and it realizes the value is never used. In any case, I fixed this one a well.

fegennari commented 5 years ago

I fixed one more uninitialized variable, in a sort function that was comparing a struct field that it didn't need to sort by anyway. Now I only have one error, uninitialized jump in the Radeon driver.

Lecrapouille commented 5 years ago

Nice progress! :) I Still have the valgrind error on the texture uninitialized. Anyway, ok if it's an empty slot never used! I was suspecting it was the cause of the OpenGL crash. This error is still strange because, yesterday before your fix, in void load_textures() if I replaced if (i == BLDG_WINDOW_TEX) continue; by if (i == GEN_TEX) continue; this really fixed the valgrind error. Now, even patch does not fix it.

For Valgrind, do you compile with -O2? Should be -O0. And valgrind --version valgrind-3.12.0.SVN

For the Radeon driver, I guess this is just the lib. I have also leaks with the 'nouveau' driver (even in my project).

I tried to edit the default.txt file:

Retry? [Y|N]



I give up for this OpenGL error :(
fegennari commented 5 years ago

I changed the texture loading so that GEN_TEX returns true for is_tex_disabled(), which should skip the init() (and calc_color()) call. Maybe you have Textures.cpp locally modified and it wasn't updated with a git pull?

I'm leaving in the -O3 when running valgrind. Disabling optimizations may make the stack trace more readable, but I don't think it affects the errors reported. Anyway, my valgrind output is clean now other than the Radeon driver warning.

For that universe shader link error, you have an older driver that doesn't support shader invariants. You need to change the value from 1 to 0 in config_universe.txt: allow_shader_invariants 1 # disable if driver doesn't support this That will disable some minor shader functionality.

fegennari commented 5 years ago

Do you still get that blue screen on the default "mapx" scene? Maybe you can try using a core profile rather than compatibility profile. I found that for some reason the core profile was slower on Windows, so I have it off by default. Just change this line in mapx/config_mapx_base: use_core_context 0 to: use_core_context 1

Lecrapouille commented 5 years ago

Yep, the valgrind error was fixed but the opengl error still get here. ANd the final fix was use_core_context 1 Finally, I have a real scene displayed :) :)

Lecrapouille commented 5 years ago

PS: cool to see something. Good work! By the way, allow_shader_invariants 0 in universe/config_universe.txt makes it works while with other examples, I got my initial gl error back.

In mapx/config_mapx.txt just a suggestion: the mouse seems very sensitive and you should constrain the camera angle. The scene is mostly upside-down just moving a little the mouse and we see the scene from the ground. but this sensitivity is maybe an effect coming from that FPS are around 10. Also, you may activate the backface culling this can help. Thansk again !

fegennari commented 5 years ago

Great! You probably need to add "use_core_context 1" to any config file you use. I wonder why the compat context doesn't work? Maybe I can auto-detect this somehow and select core context in that case. Or always use it on linux? It would really help if I knew which GL call was failing.

The only scene that uses shader invariants is config_universe. Maybe I can auto detect that as well by parsing the returned error message during shader linking.

If you only get 10 FPS it won't be very interactive. There's not much you can do with the mouse. I suppose I can add a mouse sensitivity option, or maybe cap the max turn angle per frame. I wonder why you get 10 FPS? I get over 200 FPS on my Windows machine and it's synced at 60 FPS on linux. Even my old laptop with Intel integrated graphics gets about 20 FPS on the mapx scene. Try disabling grass (press the '2' key), that takes a lot of time. If that's it, you can go into config_mapx.txt and change "grass_density 600" to something smaller, even "grass_density 0".

Lecrapouille commented 5 years ago

I did not have lot of time to test since I change your setting. Effectively I have to copy "use_core_context 1" in all your config files. When I press the F11 it changes the core context and it crashes (as expected).

For the mapx/config_mapx.txt the camera is different: moving the mouse makes it changes its angles while in other scenes, I have to press the mouse button to make the angles changed.

Is the FPS the number displayed on the left corner? I have a very good CPU (Razen 1800X. Only 2% of CPU is used) but my video card just cost me 30€, in less costly scenes it can reach 25 FPS. Grass eats just 2 FPS so it's not the grass.

Nice job anyway! I can close the ticket.

Lecrapouille commented 5 years ago

PS: I tried buildings.txt I have 5 FPS but this time the scene is frozen at 5 FPS while in mapx/config_mapx.txt while displaying 5 FPS, the scene was more fluid (like if it was at 15 FPS). Strange!

Lecrapouille commented 5 years ago

PSS: I can make crash mapx/config_mapx.txt : move the camera away: when leaving the earth the application crashes. I guess it same problem: maybe a missing use_core_context 1in another file.

fegennari commented 5 years ago

I'll see what I can do with crashing when not in core context mode. Core context mode really only changes a few things. One of these must not be supported by your driver. If I can find the failing call, I can probably test for the existence of some extension/value. It may be difficult to debug if I can't reproduce the failure though.

Yes, the number in the lower left corner is the framerate. I'm not sure why your framerate is so low. How much memory does your graphics card have? I get better performance on an older laptop! Maybe your driver/card doesn't support some feature, and it's using software rendering for something?

I have no idea why the mapx scene would crash when you move the camera away. It doesn't do that for me. Does it crash with a GL error, or something else? You should only need to set use_core_context in the config file you're using.

I'm going to add config_pre.txt and config_post.txt that are automatically loaded and can be used to set defaults and overrides. That way you can add "use_core_context 1" to config_post.txt and it will apply to all scenes.

fegennari commented 5 years ago

Okay, I was able to crash the mapx scene when moving far away, but only in use_core_context mode. I'm missing a VBO bind in the terrain mesh drawing. I think what happens is that the VBO is normally bound in the shadow map pass, but when you move far away from the scene the mesh is no longer in the shadow volume and the VBO is never setup. It should be fixed now. I also attempted to fix the mouse sensitivity problem.

fegennari commented 5 years ago

I forgot to comment on this one: "For the mapx/config_mapx.txt the camera is different: moving the mouse makes it changes its angles while in other scenes, I have to press the mouse button to make the angles changed."

I have the following specified in config_mapx_base.txt: "enable_mouse_look 1" This enables mouse look by default, without the user having to use the 'V' key. This is because the mapx scene was designed for gameplay mode where mouse look (rotating the camera without having to hold down the mouse) is useful. We want the mouse button for firing weapons.

fegennari commented 5 years ago

I finally looked up your GTX 720's specs, and they're pretty bad. Here are the cards I have: Windows PC: GTX 1070, G3D mark = 11,373 Linux PC: R9 280, G3D mark = 5,378 Old Windows PC: GTX 770, G3D mark = 6,067 Laptop: Not sure, it's in my office at work Your card: GTX 720: G3D mark = 717 Really? Only 717? That's terrible! I was thinking it was similar to my GTX 770. So, yeah, you need a better GPU to run 3DWorld properly.

Lecrapouille commented 5 years ago

Currently, I cannot test on my desktop. I tried to compile on an older desktop 32-bits Ubuntu. I could not compile dependencies/gli/external/glm/detail/func_common.inl. The template param P is not found. Replace it by Q. Here the correct function:

#   if GLM_ARCH == GLM_ARCH_X86
    template<length_t L, typename T, qualifier Q, bool Aligned>
    struct compute_sign<L, T, Q, false, Aligned>
    {
        GLM_FUNC_QUALIFIER static vec<L, T, Q> call(vec<L, T, Q> const& x)
        {
            T const Shift(static_cast<T>(sizeof(T) * 8 - 1));
            vec<L, T, Q> const y(vec<L, typename make_unsigned<T>::type, Q>(-x) >> typename make_unsigned<T>::type(Shift));

            return (x >> Shift) | y;
        }
    };
#   endif

But when launching I reach new assert but this time I have no GPU so not important.

fegennari commented 5 years ago

GLI seems to keep causing problems, in particular with those qualifiers. Maybe I need to replace it with something else? Does GLI really have a nested GLM library? I don't have any 32-bit OSes to test on.

What assert do you get? One of the usual GL errors? You always have a GPU. Do you mean no graphics card (using integrated graphics)? It should still work. I guess it depends on the motherboard.

fegennari commented 5 years ago

Wow, that's an interesting error! It looks like there was a bug in the X86 GLM code at the time it was copied into GLI. It's been fixed since then in GLM, and the current version of GLI has updated to a newer version where it's fixed. I don't like it how there are two different GLM libraries inside my dependencies directory, GLM itself and an older version of GLM inside of GLI.

I made that one line fix. I can't actually test it because none of my build machines gets into that X86 case. I guess it's uncommon, which is why the bug wasn't found. I'm not sure that's the best fix. Should I update to the newest version of GLI? Copy my newer GLM code into the GLI directory? Change my build includes to use GLM from inside GLI? I'm kind of worried about changing it. Those two libraries have caused a lot of problems, and I've modified them several times. I tried updating to a newer version of GLM a year or two ago and there were compile errors because the apparently made changes that weren't backwards compatible. I don't really want to have to deal with that again.

fegennari commented 5 years ago

I checked any my work laptop has an Intel HD 4600 GPU with a G3D Mark of 712, which is similar to your GPU.

Lecrapouille commented 5 years ago

Does GLI really have a nested GLM library?

Definitively yes because of this issue with my initial compilation error. When I added -I in the Makefile refereing to nested GLM it fixed my error.

I don't have any 32-bit OSes to test on.

In your case you can compile and activate the correct flag (-DGLM_ARCH=GLM_ARCH_X8) this function is rounded by this macro.

What assert do you get? One of the usual GL errors? You always have a GPU. Do you mean no graphics card (using integrated graphics)? It should still work. I guess it depends on the motherboard.

Cube Overlap Removal time = 2
 Add Fixed Cobjs time = 6
bins = 16384, ne = 9803, cobjs = 26071, ent = 119447, per c = 4, per bin = 7
3dworld: /home/qq/3DWorld/src/cobj_bsp_tree.cpp:701: void cobj_bvh_tree::build_tree(unsigned int, unsigned int, unsigned int, cobj_bvh_tree::per_thread_data&): Assertion `vals[0] <= vals[1]' failed.

I'm not a big fan of GLM, since I notices a critical regression (fixed since this time) where matrix ortho changed of sign and breaking API with different behaviors with their constructor. See here. They did not have a basic unit test to check some basic values.