clausagerskov closed this issue 4 years ago
@clausagerskov's output from running clinfo:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 CUDA 9.1.84
Platform Name: NVIDIA CUDA
Platform Vendor: NVIDIA Corporation
Platform Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
Platform Name: NVIDIA CUDA
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4318
Max compute units: 20
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 64
Max work group size: 1024
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Max clock frequency: 1885Mhz
Address bits: 14757395255531667488
Max memory allocation: 2147483648
Image support: Yes
Max number of images read arguments: 256
Max number of images write arguments: 16
Max image 2D width: 16384
Max image 2D height: 32768
Max image 3D width: 16384
Max image 3D height: 16384
Max image 3D depth: 16384
Max samplers within kernel: 32
Max size of kernel argument: 4352
Alignment (bits) of base address: 4096
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 128
Cache size: 327680
Global memory size: 8589934592
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Error correction support: 0
Profiling timer resolution: 1000
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 007A21C0
Name: GeForce GTX 1080
Vendor: NVIDIA Corporation
Driver version: 390.77
Profile: FULL_PROFILE
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer
Seems like I need a different OpenCL version.
Just ran sibernetic again with CUDA 8 and got the same error. clinfo:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.2 CUDA 8.0.0
Platform Name: NVIDIA CUDA
Platform Vendor: NVIDIA Corporation
Platform Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
Platform Name: NVIDIA CUDA
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4318
Max compute units: 20
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 64
Max work group size: 1024
Preferred vector width char: 1
Preferred vector width short: 1
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Max clock frequency: 1885Mhz
Address bits: 14757395255531667488
Max memory allocation: 2147483648
Image support: Yes
Max number of images read arguments: 256
Max number of images write arguments: 16
Max image 2D width: 16384
Max image 2D height: 32768
Max image 3D width: 16384
Max image 3D height: 16384
Max image 3D depth: 16384
Max samplers within kernel: 32
Max size of kernel argument: 4352
Alignment (bits) of base address: 4096
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 128
Cache size: 327680
Global memory size: 8589934592
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Error correction support: 0
Profiling timer resolution: 1000
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 00B882E8
Name: GeForce GTX 1080
Vendor: NVIDIA Corporation
Driver version: 376.51
Profile: FULL_PROFILE
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts
I tried getting the offending section of the owPhysicsSimulator.cpp file to give me more error info, which yielded the message "Could not open file configuration file" just before the crash. Now, if I run sibernetic and specify the path to, e.g., the worm config file, it crashes right after booting up OpenCL (version 8, which can see my GPU).
An interesting note is that if I run the simulator in Visual Studio, I get no crashes. All clues point to problems with paths and with finding the required source files. There should probably be some environment variables defined, which are then loaded within the sibernetic source code.
I think I found the problem. When sibernetic is run without a path specification, the output path for muscles_activity_buffer.txt and worm_motion_log.txt is set to ./buffers, but the program does not check whether this folder exists. After I created the folder in the sibernetic root folder and ran the compiled x64 exe there (with -f worm), it started running with no problems, even detecting and using my GPU. So the solution seems to be to check on startup that all the necessary folders exist. To make future problems easier to debug, also add a std::cout << "ERROR: " << ex.what() << std::endl; wherever a try/catch simply ends with an exit() without reporting the actual error.
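A minimal sketch of the folder check described above, assuming a C++17 compiler with std::filesystem. The helper name ensure_directory is hypothetical and is not part of the sibernetic source; it only illustrates creating a missing output folder such as ./buffers and reporting the reason if that fails.

```cpp
#include <filesystem>
#include <iostream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the sibernetic code base): make sure an
// output directory exists before any log file is opened inside it.
// Returns true if the directory is usable afterwards.
bool ensure_directory(const std::string& path) {
    std::error_code ec;
    if (fs::is_directory(path, ec)) {
        return true;  // the folder is already there
    }
    // create_directories also creates missing parent folders; with the
    // error_code overload it reports failure through ec instead of throwing.
    fs::create_directories(path, ec);
    if (ec) {
        std::cerr << "ERROR: could not create directory '" << path
                  << "': " << ec.message() << std::endl;
        return false;
    }
    // Final check covers the case where the path exists but is a file.
    return fs::is_directory(path, ec);
}
```

Calling something like ensure_directory("./buffers") once at startup, before the log files are opened, would turn the silent exit into either a working run or a readable error message.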
As a more general comment, there are many places in the source code where critical folders are not created or checked, or where a script relies on a certain program being installed correctly. It would be nice to have fewer silent try/catch blocks, and maybe some way to test whether all the requirements for running the sibernetic_c302.py script are installed.
The source code now seems to be using Unix-only functions, so if there is no plan to natively support compilation on Windows, maybe this should be marked as won't fix?
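One way to avoid silent try/catch exits could look like the sketch below. The wrapper name run_reporting_errors is hypothetical, not an existing sibernetic function; it simply prints ex.what() to standard error before returning a failure code, instead of exiting without any message.

```cpp
#include <cstdlib>
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>

// Hypothetical wrapper (not from the sibernetic code base): run one step
// and, if it throws, print the reason before returning a non-zero code.
int run_reporting_errors(const std::function<void()>& step) {
    try {
        step();
        return EXIT_SUCCESS;
    } catch (const std::exception& ex) {
        // Report the actual error instead of exiting silently.
        std::cerr << "ERROR: " << ex.what() << std::endl;
    } catch (...) {
        // Anything not derived from std::exception carries no message.
        std::cerr << "ERROR: unknown exception" << std::endl;
    }
    return EXIT_FAILURE;
}
```

The value returned by the wrapper can be passed straight back from main, so a failed step still ends the program but leaves a message explaining why.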
@clausagerskov could you please clarify which functions you mean? As far as I know, building sibernetic only requires the standard library, which is part of any reasonably popular C++ compiler, so compiling and running the code under Windows should not be a problem.
Just to clarify, the code does not use Unix-only functions. It is just not readily compilable in Visual Studio without some minor modifications, which I will post.
I will submit my edits separately, but I have successfully compiled sibernetic on Windows x64 with CUDA 11.1 and am able to run the demos and worm, using Python 3.8. One roadblock was a bug in the main_sim.py file on line 59, where j = n/2 needs to be j = int(n/2); this was causing an unhandled error when trying to run the main_py from the sibernetic exe. It would be helpful if sibernetic checked for a nullptr return when running pValue = PyObject_CallMethod(pInstance, const_cast<char *>("run"), nullptr); because passing a nullptr pValue to PyList_Check(pValue) causes an access violation and a silent crash (owSignalSimulator.cpp).
Windows 10 x64 with CUDA 8 and an NVIDIA GeForce GTX 1080 installed with appropriate drivers.
Sibernetic crashes when run through Visual Studio in debug mode with the following input: -f worm_crawl_half_resolution -l_from lpath=C:\Users\claus\Desktop\openworm\sibernetic-development\simulations\C2_FW_2018-02-12_14-27-42, which is the path to the files generated by sibernetic_c302.py. Visual Studio reports the following:
The thread 0x4a8c has exited with code 0 (0x0).
The thread 0x26ec has exited with code 0 (0x0).
The thread 0x2514 has exited with code 0 (0x0).
The thread 0x31d8 has exited with code 0 (0x0).
The thread 0x52c0 has exited with code 0 (0x0).
The thread 0x3954 has exited with code 0 (0x0).
The thread 0x1ba0 has exited with code 0 (0x0).
The thread 0x478 has exited with code 0 (0x0).
The thread 0x21c0 has exited with code 0 (0x0).
The program '[17492] Sibernetic.exe' has exited with code 0 (0x0).
Attached is a list of the files in the directory I run sibernetic from. I have compiled sibernetic for x64.
filelist.txt