seeder-research / uMagNUS


low performance #16

Open MathieuMoalic opened 2 years ago

MathieuMoalic commented 2 years ago

Hello again,

Is it supposed to be on par with Mumax3 in terms of performance? All simulations I tested were close to 5 times slower. This is with freshly compiled binaries and the latest NVIDIA drivers, on Arch Linux.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.54       Driver Version: 510.54       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce 3080ti  Off  | 00000000:0A:00.0  On |                  N/A |
| 44%   67C    P2   193W / 350W |   1335MiB / 12288MiB |     74%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
xfong commented 2 years ago

The pre-built binaries and Docker recipe contain a generic uMagNUS. Upon launch, it compiles the kernels for the GPU found. On my systems, the compilation takes about 3 to 5 seconds. If you run simulations in batches, this compilation step is performed for every file.

What are the kinds of simulations you have tried? Are they small or large? Are you running them in batch mode?

If you want to pre-compile the kernels, you need to compile them manually using the included uMagNUS-clCompiler binary on the target machine. If you are familiar with Linux and work on a Linux machine, you just need to:

1) Add the directory containing the downloaded uMagNUS-clCompiler binary to your PATH
2) Download the uMagNUS source code from GitHub to your GOPATH, as you would to compile MuMax3
3) Run "make realclean && make mod libs" in the uMagNUS source directory
4) Copy the libraries in the libumagnus directory to a permanent directory and add that directory to your LD_LIBRARY_PATH variable

If the uMagNUS binary sees the correct library, it launches much faster. If uMagNUS is unable to find a pre-compiled library, a message will be printed onto the screen ("Failed to find binary").
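
The four steps above can be sketched as a shell function. This is only an illustration under assumptions: the variables CLCOMPILER_DIR, SRC_DIR and LIB_DIR, their default values, and the DRY_RUN switch are hypothetical names chosen here, not part of uMagNUS. Only the make targets and the libumagnus directory come from the instructions above, and the source is assumed to be cloned already.

```shell
# Hypothetical locations; adjust them to your system.
CLCOMPILER_DIR="${CLCOMPILER_DIR:-$HOME/bin}"   # where uMagNUS-clCompiler was saved (assumed)
SRC_DIR="${SRC_DIR:-$HOME/go/src/github.com/seeder-research/uMagNUS}"  # source already cloned here
LIB_DIR="${LIB_DIR:-$HOME/.local/lib/umagnus}"  # "permanent" library directory (assumed)

# run CMD...: execute CMD, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

precompile_kernels() {
    # 1) make uMagNUS-clCompiler visible to the build
    PATH="$CLCOMPILER_DIR:$PATH"
    export PATH
    # 2) and 3) rebuild the kernel libraries inside the source tree
    run cd "$SRC_DIR" &&
    run make realclean &&
    run make mod libs &&
    # 4) copy the compiled libraries to the permanent directory
    run mkdir -p "$LIB_DIR" &&
    run cp ./libumagnus/lib* "$LIB_DIR"/
}

# Usage:
#   DRY_RUN=1 precompile_kernels   # print the commands without running them
#   precompile_kernels && export LD_LIBRARY_PATH="$LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Running with DRY_RUN=1 first is a cheap way to check the paths before anything is deleted by "make realclean".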

MathieuMoalic commented 2 years ago

I ran several simulations, both small and medium sized. For example, I run test/relax-stress.mx3, as it takes around 20 seconds with mumax3 on my system. It takes 1min44s with uMagNUS; I tried both the pre-compiled and the self-compiled binaries and libraries.

seeder-research commented 2 years ago

After you have run the "make libs" step, did you add the path to the libumagnus directory to your LD_LIBRARY_PATH? The entire build step has 2 locations for the libumagnus.so library. The one in cl_loader/lib is an empty library. The actual library that is compiled using uMagNUS-clCompiler is saved in the libumagnus directory of the source tree. You may need to copy the files in that directory to /usr/lib on your system. I usually recommend just adding the directory path to the LD_LIBRARY_PATH variable. The order of directories in LD_LIBRARY_PATH matters.
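
Because the order of directories matters, it can help to check which LD_LIBRARY_PATH entry would supply libumagnus.so first. The helper below is a minimal sketch; the name first_lib_dir is hypothetical, and it only inspects LD_LIBRARY_PATH (the dynamic linker also consults other locations such as the linker cache and the default system directories).

```shell
# first_lib_dir NAME: print the first LD_LIBRARY_PATH entry that contains NAME.
# Entries are checked left to right, the same order the dynamic linker uses.
first_lib_dir() {
    local name="$1" dir
    local IFS=':'
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Usage:
#   first_lib_dir libumagnus.so || echo "libumagnus.so not found via LD_LIBRARY_PATH"
```

If this prints the cl_loader/lib directory rather than the libumagnus directory, the empty library described above would presumably be the one picked up.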

You can contact me directly by email (check https://blog.nus.edu.sg/seeder). At this time, I do not expect the performance to be on par with mumax3. On my systems, they were roughly the same speed.

MathieuMoalic commented 2 years ago

Using these commands:

sudo rm /usr/lib/libumagnus*
sudo rm ~/go/bin/uMagNUS*
sudo rm -r ~/go/src/github.com/seeder-research/uMagNUS*
cd ~/go/src/github.com/seeder-research
git clone git@github.com:seeder-research/uMagNUS
cd uMagNUS
make mod libs uMagNUS
LD_LIBRARY_PATH=./libumagnus ~/go/bin/uMagNUS test/relax-stress.mx3

So by using LD_LIBRARY_PATH, the "Unable to get program binary!" message is no longer shown, but the simulation remains as slow as before. Might the issue be specific to my hardware? I will try on an HPC and get back to you.

xfong commented 2 years ago

Minor comment: in LD_LIBRARY_PATH, use the absolute path to the directory holding the library .so file instead of a relative path.
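
A relative LD_LIBRARY_PATH entry is resolved against the working directory of the process, so it stops working as soon as the binary is started from somewhere else. A minimal sketch of the absolute-path variant follows; abs_dir is a hypothetical helper name, not something shipped with uMagNUS.

```shell
# abs_dir DIR: print the absolute path of an existing directory
abs_dir() {
    ( cd "$1" 2>/dev/null && pwd )
}

# Usage (from the uMagNUS source directory, as in the commands above):
#   LD_LIBRARY_PATH="$(abs_dir ./libumagnus)" ~/go/bin/uMagNUS test/relax-stress.mx3
```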

Let me know once you have some results. It could be that some of the kernel launch parameters are not tuned. As you can imagine, that will be difficult as different GPUs have different optimum parameters.

From experience testing the code, bw_euler and relax_stress do run rather slowly. I have not looked into the exact details. It might make sense to compare runs of bw_euler.

I don't think the code will outperform mumax, because OpenCL does not handle out-of-order execution well and every vendor has its own scheme. The implementation of uMagNUS uses cl_events to synchronize kernel execution based on the cl_buffer being accessed. The GPU driver then handles the execution order of the kernels. Currently, we stall code execution until each kernel returns to force synchronization. This was done because the Kepler GPUs were showing some issues with the synchronization. We will consider removing this restriction, as the newer GPU architectures are better at this. But in light of SYCL 2020, we might just switch over to SYCL in the future and let the SYCL runtime handle the order of kernel execution.

MathieuMoalic commented 2 years ago

Thank you for the insight, it's quite interesting. Hopefully one day soon it will be more convenient to make cross-platform software. I'm having a lot of trouble building on the HPC right now (it's unrelated to uMagNUS; some shared object files can be hard to find there, libm.so.6 in my case). I will test on other workstations next week.

xfong commented 2 years ago

Hi Mathieu,

The latest code in the develop branch has a much more aggressive kernel synchronization mechanism and may give you the speedup you are looking for. You may also edit the Makefile in the source root directory to include compiler optimizations (try adding the -cl-mad-enable switch) to see if you get any speedup. For simulations that require the demag field to be calculated (or any FFT operations), the performance may be limited by the FFT library (the best library I found to support the required computations is VkFFT). Let me know how your testing goes.

Best, Kelvin Fong

MathieuMoalic commented 2 years ago

Hi, I tested the latest commit on the develop branch but there was no difference in performance sadly.

sudo rm /usr/lib/libumagnus*
sudo rm ~/go/bin/uMagNUS*
sudo rm -r ~/go/src/github.com/seeder-research/uMagNUS*
cd ~/go/src/github.com/seeder-research
git clone git@github.com:seeder-research/uMagNUS
cd uMagNUS
git checkout develop
make mod libs 
sudo cp ./libumagnus/lib* /usr/lib/
make uMagNUS
time ~/go/bin/uMagNUS test/relax-stress.mx3
real    1m57.315s
user    2m40.106s
sys     1m32.424s

To compare with mumax3, I ran this:

sudo rm ~/go/bin/mumax3
sudo rm -r ~/go/src/github.com/mumax/3
cd ~/go/src/github.com/mumax
git clone git@github.com:mumax/3
cd 3
go mod init
cd cmd/mumax3
go install
cd ../..
time ~/go/bin/mumax3 test/relax-stress.mx3
real    0m20.858s
user    0m18.782s
sys     0m2.113s

I did not observe any change in speed when adding the "-cl-mad-enable" flag on line 103 of the Makefile (hopefully that was the correct place to add it).
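
For reference, the two "real" timings above (1m57.315s for uMagNUS, 0m20.858s for mumax3) work out to a slowdown of about 5.6x. The small sketch below does the arithmetic; to_seconds and slowdown are hypothetical helper names introduced only for this illustration.

```shell
# to_seconds "1m57.315s": convert the `real` field printed by `time` to seconds
to_seconds() {
    printf '%s\n' "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# slowdown SLOW FAST: ratio of two durations in seconds, two decimals
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

slowdown "$(to_seconds 1m57.315s)" "$(to_seconds 0m20.858s)"   # prints 5.62
```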

xfong commented 2 years ago

Thanks Mathieu. I had tested the demag simulations and noted that the FFTs are probably ok. The relax function uses the rk23 stepper during simulations, so the performance hit probably comes from the stepping. This leaves roughly a few places where the performance hit could occur:

1) Poor parameters for launching kernels (possible but unlikely).
2) The reduction kernels. Their launch parameters are similar to those of the other kernels, which could result in poor performance. The reduction kernels are called to determine the system energy every few steps in the relax process. The reducesum kernel is probably a lot slower in uMagNUS right now due to the corrections performed to keep the result accurate. I am working on improvements for this at the moment. I do see the 6x performance hit you observe in my own tests.
3) Differences in step error schemes. The implementation in mumax3 is fairly crude, whereas the one in uMagNUS is much more traditional. This hypothesis can be tested against one of the mumax3 branches in my fork. In the original update to mumax3, we did not observe any performance hit after this update.

I'm not sure how much effort to commit to optimizing for some of the test files because there are issues in the mumax3 code:

MathieuMoalic commented 2 years ago

I see. I have similar performance results on all the simulations I tried, with all kinds of sizes, PBC, magnetic parameters, shapes, etc. I simply use relax-stress.mx3 because it's the first simulation in the test folder that I found lasted more than 1 second, so I could "benchmark" easily. Sadly, I'm having a hard time testing on other workstations because they are running Windows and I have no patience for debugging on that operating system. I have yet to compile successfully on the HPC.

From the changes you have made so far, do you expect uMagNUS to give more physically correct results than mumax3?

xfong commented 2 years ago

Hi Mathieu,

Some optimizations to the reduction kernels were pushed to the develop branch (I got ~30% reduction in runtime). Please test those on examples that are more representative of the problems you want to simulate. I doubt relax-stress.mx3 is a good gauge.

Also, try uMagNUS64 to see how much speed you lose when the computations are in double precision.

MathieuMoalic commented 2 years ago

I cannot run simulations with uMagNUS compiled from the develop branch (the 64-bit version behaves the same), as it gives me this error:

// GPU: 0
//// uMagNUS 2.2.2 linux_amd64 go1.18 (gc) 

//// OpenCL C Version OpenCL C 1.2 
// GPU: NVIDIA GeForce RTX 3080 Ti(12050MB) 

////(c) Xuanyao Fong, SEEDER Research Group 

////@ National University of Singapore, Singapore 

////Web site: https://blog.nus.edu.sg/seeder 

////Email: kelvin.xy.fong@nus.edu.sg 

////Source code can be downloaded at https://github.com/seeder-research/uMagNUS 

//This is free software without any warranty. See license.txt
//********************************************************************//
//  If you use uMagNUS in any work or publication,                    //
//  we kindly ask you to cite the references in references.bib        //
//********************************************************************//
////uMagNUS is an OpenCL-based derivative of MuMax 3.10: (c) Arne Vansteenkiste, Dynamat LAB, Ghent University, Belgium 

//output directory: /home/mat/go/src/github.com/seeder-research/ref.out/
//starting GUI at http://127.0.0.1:35367
setgridsize(512, 512, 1)
setcellsize(1e-9, 1e-9, 1e-9)
setpbc(0, 0, 0)
//resizing...
// Initializing geometry 0 %
// Initializing geometry 100 %
edgesmooth = 3
msat = 956e3
aex = 10e-12
alpha = 0.03
setgeom(circle(500e-9))
m = vortex(1, 1)
minimize()
//Using cached kernel: /tmp/uMagNUS64kernel_[512 512 1]_[0 0 0]_[1e-09 1e-09 1e-09]_6_
//********************************************************************//
//Please cite the following references, relevant for your simulation. //
//See bibtex file in output folder for justification.                 //
//********************************************************************//
//   * Vansteenkiste et al., AIP Adv. 4, 107133 (2014).
//   * Exl et al., J. Appl. Phys. 115, 17D118 (2014).
panic: runtime error: index out of range [1] with length 1

goroutine 1 [running, locked to thread]:
github.com/seeder-research/uMagNUS/data64.(*Slice).GetEvent(...)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/data64/slice.go:231
github.com/seeder-research/uMagNUS/opencl64.Resize(0xc0000c8410, 0xc0001774e0, 0x1?)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/opencl64/resize.go:29 +0x9f6
github.com/seeder-research/uMagNUS/engine64.(*render).download.func1()
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/render.go:97 +0x6aa
github.com/seeder-research/uMagNUS/engine64.InjectAndWait.func1()
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:257 +0x26
github.com/seeder-research/uMagNUS/engine64.runWhile(0xc000177620, 0xc0?)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:227 +0x9d
github.com/seeder-research/uMagNUS/engine64.RunWhile(0xc000010298?)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:208 +0x4e
github.com/seeder-research/uMagNUS/engine64.Minimize()
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/minimizer.go:163 +0x23e
reflect.Value.call({0x98b760?, 0xab0550?, 0xab0550?}, {0xa2e9ff, 0x4}, {0x110da68, 0x0, 0xc000177d60?})
        /usr/local/go/src/reflect/value.go:556 +0x845
reflect.Value.Call({0x98b760?, 0xab0550?, 0x1?}, {0x110da68, 0x0, 0x0})
        /usr/local/go/src/reflect/value.go:339 +0xbf
github.com/seeder-research/uMagNUS/script64.(*call).Eval(0xc000244210)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/script64/call.go:61 +0x20f
github.com/seeder-research/uMagNUS/engine64.EvalFile(0xc0000d3f20)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/script.go:103 +0x3e
main.runScript({0x7ffcb1ef3d40, 0xb45970?})
        /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:156 +0x145
main.runFileAndServe({0x7ffcb1ef3d40?, 0x0?})
        /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:127 +0x74
main.main()
        /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:94 +0x29f

The script I am running:

setgridsize(512,512,1)
setcellsize(1e-9,1e-9,1e-9)
setpbc(0,0,0)
edgesmooth=3
msat = 956e3
aex = 10e-12
alpha = 0.03
setgeom(circle(500e-9))
m = vortex(1,1)
minimize()
B_ext = vector(0, 1e-2*sin(2*pi*12e9*t), 0)
run(1e-9)

Bash commands:

sudo rm /usr/lib/libumagnus*
sudo rm ~/go/bin/uMagNUS*
sudo rm -r ~/go/src/github.com/seeder-research/uMagNUS*
cd ~/go/src/github.com/seeder-research
git clone git@github.com:seeder-research/uMagNUS
cd uMagNUS
git checkout develop
make mod cl-compiler kernloader kernloader64 libumagnus libumagnus64
sudo cp ./libumagnus/lib* /usr/lib/
make uMagNUS uMagNUS64
\time -p -o benchu ~/go/bin/uMagNUS  ~/go/src/github.com/seeder-research/ref.mx3
\time -p -o benchu64 ~/go/bin/uMagNUS64  ~/go/src/github.com/seeder-research/ref.mx3

\time -p -o benchm ~/go/bin/mumax3  ~/go/src/github.com/seeder-research/ref.mx3
xfong commented 2 years ago

Does the 32-bit version (uMagNUS) run?

MathieuMoalic commented 2 years ago

Not from the develop branch, no.

xfong commented 2 years ago

That's a strange issue. Could you please pull from develop again? I propagated some bug fixes. See if that fixes the problem.

MathieuMoalic commented 2 years ago

I have the same errors with the newest commits.

xfong commented 2 years ago

Hi Mathieu,

Sorry, I am unable to reproduce the errors on my machine. Could you kindly send me the output from clinfo? That might give me a better idea of what is wrong. Thanks, Kelvin Fong

MathieuMoalic commented 2 years ago
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 11.6.127
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd
  Platform Extensions with Version                cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_nv_compiler_options                                           0x400000 (1.0.0)
                                                  cl_nv_device_attribute_query                                     0x400000 (1.0.0)
                                                  cl_nv_pragma_unroll                                              0x400000 (1.0.0)
                                                  cl_nv_copy_opts                                                  0x400000 (1.0.0)
                                                  cl_nv_create_buffer                                              0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics                                        0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics                                    0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
                                                  cl_khr_external_semaphore                                          0x9000 (0.9.0)
                                                  cl_khr_external_memory                                             0x9000 (0.9.0)
                                                  cl_khr_external_semaphore_opaque_fd                                0x9000 (0.9.0)
                                                  cl_khr_external_memory_opaque_fd                                   0x9000 (0.9.0)
  Platform Numeric Version                        0xc00000 (3.0.0)
  Platform Extensions function suffix             NV
  Platform Host timer resolution                  0ns

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     NVIDIA GeForce RTX 3080 Ti
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 3.0 CUDA
  Device UUID                                     b2221b83-8061-ff5a-dad5-9d3917a5903e
  Driver UUID                                     b2221b83-8061-ff5a-dad5-9d3917a5903e
  Valid Device LUID                               No
  Device LUID                                     6d69-637300000000
  Device Node Mask                                0
  Device Numeric Version                          0xc00000 (3.0.0)
  Driver Version                                  510.60.02
  Device OpenCL C Version                         OpenCL C 1.2 
  Device OpenCL C all versions                    OpenCL C                                                         0x400000 (1.0.0)
                                                  OpenCL C                                                         0x401000 (1.1.0)
                                                  OpenCL C                                                         0x402000 (1.2.0)
                                                  OpenCL C                                                         0xc00000 (3.0.0)
  Device OpenCL C features                        __opencl_c_fp64                                                  0xc00000 (3.0.0)
                                                  __opencl_c_images                                                0xc00000 (3.0.0)
                                                  __opencl_c_int64                                                 0xc00000 (3.0.0)
                                                  __opencl_c_3d_image_writes                                       0xc00000 (3.0.0)
  Latest comfornace test passed                   v2021-02-01-00
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 0000:0a:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               80
  Max clock frequency                             1665MHz
  Compute Capability (NV)                         8.6
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple (device)     32
  Preferred work group size multiple (kernel)     32
  Warp size (NV)                                  32
  Max sub-groups per work group                   0
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              12636192768 (11.77GiB)
  Error Correction support                        No
  Max memory allocation                           3159048192 (2.942GiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Preferred alignment for atomics                 
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Atomic memory capabilities                      relaxed, work-group scope
  Atomic fence capabilities                       relaxed, acquire/release, work-group scope
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        2293760 (2.188MiB)
  Global Memory cache line size                   128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            268435456 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             32768x32768 pixels
    Max 3D image size                             16384x16384x16384 pixels
    Max number of read image args                 256
    Max number of write image args                32
    Max number of read/write image args           0
  Pipe support                                    No
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max number of constant args                     9
  Max constant buffer size                        65536 (64KiB)
  Generic address space support                   No
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties (on host)                      
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Device enqueue capabilities                     (n/a)
  Queue properties (on device)                    
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Non-uniform work-groups                       No
    Work-group collective functions               No
    Sub-group independent forward progress        No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  2
    IL version                                    (n/a)
    ILs with version                              <printDeviceInfo:186: get CL_DEVICE_ILS_WITH_VERSION : error -30>
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Built-in kernels with version                   <printDeviceInfo:190: get CL_DEVICE_BUILT_IN_KERNELS_WITH_VERSION : error -30>
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_device_uuid cl_khr_pci_bus_info cl_khr_external_semaphore cl_khr_external_memory cl_khr_external_semaphore_opaque_fd cl_khr_external_memory_opaque_fd
  Device Extensions with Version                  cl_khr_global_int32_base_atomics                                 0x400000 (1.0.0)
                                                  cl_khr_global_int32_extended_atomics                             0x400000 (1.0.0)
                                                  cl_khr_local_int32_base_atomics                                  0x400000 (1.0.0)
                                                  cl_khr_local_int32_extended_atomics                              0x400000 (1.0.0)
                                                  cl_khr_fp64                                                      0x400000 (1.0.0)
                                                  cl_khr_3d_image_writes                                           0x400000 (1.0.0)
                                                  cl_khr_byte_addressable_store                                    0x400000 (1.0.0)
                                                  cl_khr_icd                                                       0x400000 (1.0.0)
                                                  cl_khr_gl_sharing                                                0x400000 (1.0.0)
                                                  cl_nv_compiler_options                                           0x400000 (1.0.0)
                                                  cl_nv_device_attribute_query                                     0x400000 (1.0.0)
                                                  cl_nv_pragma_unroll                                              0x400000 (1.0.0)
                                                  cl_nv_copy_opts                                                  0x400000 (1.0.0)
                                                  cl_nv_create_buffer                                              0x400000 (1.0.0)
                                                  cl_khr_int64_base_atomics                                        0x400000 (1.0.0)
                                                  cl_khr_int64_extended_atomics                                    0x400000 (1.0.0)
                                                  cl_khr_device_uuid                                               0x400000 (1.0.0)
                                                  cl_khr_pci_bus_info                                              0x400000 (1.0.0)
                                                  cl_khr_external_semaphore                                          0x9000 (0.9.0)
                                                  cl_khr_external_memory                                             0x9000 (0.9.0)
                                                  cl_khr_external_semaphore_opaque_fd                                0x9000 (0.9.0)
                                                  cl_khr_external_memory_opaque_fd                                   0x9000 (0.9.0)

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

xfong commented 2 years ago

Thanks Mathieu.

Looks like there was a bug in the way the kernel launch parameters were calculated. The fix is in the develop branch now. The code was also tested on RTX 2080 SUPER, P2000 and P100 (all working). See if it works for you.

Best, Kelvin FONG

MathieuMoalic commented 2 years ago

I'm sorry to say that it's not fully resolved. For example, here is a list of the error codes I get from running the same simulation (the one from my message yesterday) with uMagNUS compiled from the latest commits on the develop branch.

0 - 0 - 0 - 2 - 0 - 0 - 0 - 0 - 0 - 2 - 0 - 0 - 0 - 0 - 0 - 2

So 3 out of 15 simulations failed. Same for uMagNUS64:

2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0 - 0 - 2 - 0

There is a pattern but I can't really explain it. Also, sometimes the simulations all fail very quickly, and I have to wait ~10 minutes before it "works" again. I have seen this happen twice so far, but I cannot reproduce it on demand. Every time the simulation fails, it is exactly the same error I sent yesterday. I tried recompiling a few times, with no change.

xfong commented 2 years ago

Thanks for the feedback. What you observe is a symptom of kernel synchronization problems, caused either by poor kernel launch parameters or by synchronization errors that let kernels make overlapping accesses to memory.

After testing on my laptop, it seems it affects the MX150 as well (running on driver v512). The code runs fine on Intel UHD 630 so there is a chance the problem is due to the driver. I'll see whether I can isolate the issue when I test it in my HPC cluster again. By any chance are you able to test the code on an older NVIDIA driver (say an earlier version of 510)? That might help a lot.
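
As background on what "kernel launch parameters" refers to: in OpenCL 1.2, when a local work size is given, the global work size passed to clEnqueueNDRangeKernel has to be an integer multiple of it, so launch code typically rounds the element count up and the kernel ignores the padded indices. The arithmetic can be sketched as follows. Note that `round_up_global` is a hypothetical helper written for illustration, not code from the uMagNUS source, and the work-group size of 256 is an assumed value.

```shell
# round_up_global N LOCAL
# Prints the smallest multiple of LOCAL that is greater than or equal to N.
# (Hypothetical helper for illustration; not part of uMagNUS.)
round_up_global() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# A 512 x 512 x 1 grid with an assumed work-group size of 256:
round_up_global $((512 * 512)) 256   # prints 262144
# A cell count that is not a multiple of the work-group size:
round_up_global 1000 256             # prints 1024
```

If the rounding is wrong for a given device's limits, the launch can be rejected or work-items can touch memory outside the intended range, which would be consistent with the kind of intermittent failure described here.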

MathieuMoalic commented 2 years ago

I have now downgraded all the way to driver version: 470.103.01

error codes for uMagNUS: 0 - 0 - 0 - 0 - 0 - 0 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2

error codes for uMagNUS64: 0 - 0 - 0 - 0 - 0 - 0 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2 - 2

The bash script I used to run the simulations:

for i in {0..30}
    do
    echo "Run $i"
    ~/go/bin/uMagNUS  ~/go/src/github.com/seeder-research/ref.mx3
    printf '%d - ' $? >> out
    ~/go/bin/uMagNUS64  ~/go/src/github.com/seeder-research/ref.mx3
    printf '%d - ' $? >> out64
    done
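
The `out` and `out64` files written by this loop hold the exit codes as a single `0 - 0 - 2 - ` line. A minimal sketch for summarising such a file is shown below. `count_failures` is a hypothetical helper name, and it assumes the file contains only exit codes and separators.

```shell
# count_failures FILE
# Counts the exit codes stored in FILE (format: "0 - 0 - 2 - ...") and
# reports how many of them are non-zero.
# (Hypothetical helper for illustration.)
count_failures() {
    # Replace every non-digit character with a newline so each code sits on
    # its own line, then count lines holding any digit (all runs) and lines
    # holding a non-zero digit (failed runs).
    total=$(tr -c '0-9' '\n' < "$1" | grep -c '[0-9]')
    failed=$(tr -c '0-9' '\n' < "$1" | grep -c '[1-9]')
    echo "$failed of $total runs failed"
}

# Example:
# count_failures out
# count_failures out64
```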

This is the same behaviour I described in my last comment, where all the simulations fail very fast (< 1 s). Although the error gives the same trace, it differs from the "alternating" pattern of failed and successful simulations: in that case, a failing simulation still takes ~10 s to fail, and I believe it runs properly before it fails. Recompiling does not "reset" this behaviour.
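
Since the two failure modes differ mainly in how long a run takes before it fails, recording the elapsed time next to the exit code makes them easy to tell apart. This is a minimal sketch and `timed_run` is a hypothetical helper name. It relies on `date +%s`, so the resolution is whole seconds, which is enough to separate sub-second failures from ~10 s ones.

```shell
# timed_run COMMAND [ARGS...]
# Runs COMMAND with its output discarded and prints
# "<exit code> <elapsed seconds>".
# (Hypothetical helper for illustration.)
timed_run() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    code=$?
    end=$(date +%s)
    echo "$code $((end - start))"
}

# Example, using the same paths as the loop above:
# timed_run ~/go/bin/uMagNUS ~/go/src/github.com/seeder-research/ref.mx3 >> timings
```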

MathieuMoalic commented 2 years ago

These log files might be useful: nvidia-smi.log, build.log, clinfo.log

xfong commented 2 years ago

Thank you, Mathieu. I have rolled back the updates to develop (you can view the network of commits). I will use other branches to develop the fix.

The problem seems to affect MX150 and cards later than RTX2080 SUPER. I'll try to figure it out using the information you have given (and on my MX150 machines).

xfong commented 2 years ago

Dear Mathieu,

The kernel launch parameters on MX150 were calculated wrongly, which caused a lot of issues. I am not sure whether the fixes work for your RTX 3080 as well, but you can test. The branch with the fixes is test_opt3.

Are you using the test files that come with mumax3 for testing? I found that most of the error tolerances there were set for their own observations, which might not match up with those in uMagNUS. However, uMagNUS64 should be able to pass the mumax3 files with the tolerances they set. My own comparisons between uMagNUS and uMagNUS64 show the results are within the error tolerances of the difference in precision.

MathieuMoalic commented 2 years ago

Hello, uMagNUS seems to work flawlessly, but uMagNUS64 gives me the same error every time:

//output directory: /home/mat/go/src/github.com/seeder-research/ref.out/
//starting GUI at http://127.0.0.1:35367
setgridsize(512, 512, 1)
setcellsize(1e-9, 1e-9, 1e-9)
setpbc(0, 0, 0)
//resizing...
// Initializing geometry 0 %
// Initializing geometry 100 %
edgesmooth = 3
msat = 956e3
aex = 10e-12
alpha = 0.03
setgeom(circle(500e-9))
m = vortex(1, 1)
minimize()
//Using cached kernel: /tmp/uMagNUS64kernel_[512 512 1]_[0 0 0]_[1e-09 1e-09 1e-09]_6_
reduceBuf failed: cl: Invalid Value 
WaitForEvents failed in maxvecnorm: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
WaitForEvents failed in maxvecnorm: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
WaitForEvents failed at index 0 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
WaitForEvents failed at index 1 in dot: cl: error -9999 
WaitForEvents failed at index 2 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
WaitForEvents failed at index 0 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
WaitForEvents failed at index 1 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
WaitForEvents failed at index 2 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in MemCpy failed: cl: error -9999 
//********************************************************************//
//Please cite the following references, relevant for your simulation. //
//See bibtex file in output folder for justification.                 //
//********************************************************************//
//   * Vansteenkiste et al., AIP Adv. 4, 107133 (2014).
//   * Exl et al., J. Appl. Phys. 115, 17D118 (2014).
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running, locked to thread]:
github.com/seeder-research/uMagNUS/data64.Copy(0xc0000c80a0, 0xc00041ce60)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/data64/slice.go:246 +0x51c
github.com/seeder-research/uMagNUS/engine64.(*Minimizer).Step(0xc000144080)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/minimizer.go:72 +0x169
github.com/seeder-research/uMagNUS/engine64.step(0x1)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:239 +0x33
github.com/seeder-research/uMagNUS/engine64.runWhile(0xc00013f620, 0xc0?)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:224 +0xa9
github.com/seeder-research/uMagNUS/engine64.RunWhile(0xc000010298?)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:208 +0x4e
github.com/seeder-research/uMagNUS/engine64.Minimize()
    /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/minimizer.go:163 +0x23e
reflect.Value.call({0x989760?, 0xab0a08?, 0xab0a08?}, {0xa2c9ff, 0x4}, {0x110da80, 0x0, 0xc00013fd60?})
    /usr/local/go/src/reflect/value.go:556 +0x845
reflect.Value.Call({0x989760?, 0xab0a08?, 0x1?}, {0x110da80, 0x0, 0x0})
    /usr/local/go/src/reflect/value.go:339 +0xbf
github.com/seeder-research/uMagNUS/script64.(*call).Eval(0xc00025c210)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/script64/call.go:61 +0x20f
github.com/seeder-research/uMagNUS/engine64.EvalFile(0xc0000d3f20)
    /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/script.go:103 +0x3e
main.runScript({0x7ffdbc385d3a, 0xb45f00?})
    /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:156 +0x145
main.runFileAndServe({0x7ffdbc385d3a?, 0x0?})
    /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:127 +0x74
main.main()
    /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:94 +0x29f

Also note that I have not been running the test suite, but I think the results I get from uMagNUS are on par with mumax3.

xfong commented 2 years ago

Thanks Mathieu.

If you run uMagNUS64 with the -debug switch, some information about the GPU will be printed to the screen. Are you able to share that with me? The error you are getting is due to launching kernels with problematic parameters, and it seems uMagNUS64 was not able to calculate the correct parameters.

MathieuMoalic commented 2 years ago

How exactly can I use this flag? I tried:

uMagNUS64 -debug=true ref.mx3
uMagNUS64 --debug ref.mx3
uMagNUS64 -debug ref.mx3

None of these options gave me additional information.

xfong commented 2 years ago

Hi Mathieu,

uMagNUS64 -debug=true ref.mx3 should work. The information is printed right at the beginning (it should show the value of GPUVend and some other parameters).
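
Because the debug information appears at the very start of a long output, it can help to save the whole run to a file and share that. Below is a minimal sketch, where `run_with_log` is a hypothetical helper name. It captures both stdout and stderr, prints them once the command has finished, and preserves the command's exit code.

```shell
# run_with_log LOGFILE COMMAND [ARGS...]
# Runs COMMAND, saves its stdout and stderr to LOGFILE, prints the saved
# output afterwards, and returns COMMAND's exit status.
# (Hypothetical helper for illustration.)
run_with_log() {
    logfile=$1
    shift
    "$@" > "$logfile" 2>&1
    rc=$?
    cat "$logfile"
    return "$rc"
}

# Example:
# run_with_log debug.log uMagNUS64 -debug=true ref.mx3
# head -n 20 debug.log
```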

MathieuMoalic commented 2 years ago

uMagNUS64 -debug=true ref.mx3
// GPU: 0
//// uMagNUS 2.2.2 linux_amd64 go1.18 (gc) 

//// OpenCL C Version OpenCL C 1.2 
// GPU: NVIDIA GeForce RTX 3080 Ti(12050MB) 

////(c) Xuanyao Fong, SEEDER Research Group 

////@ National University of Singapore, Singapore 

////Web site: https://blog.nus.edu.sg/seeder 

////Email: kelvin.xy.fong@nus.edu.sg 

////Source code can be downloaded at https://github.com/seeder-research/uMagNUS 

//This is free software without any warranty. See license.txt
//********************************************************************//
//  If you use uMagNUS in any work or publication,                    //
//  we kindly ask you to cite the references in references.bib        //
//********************************************************************//
////uMagNUS is an OpenCL-based derivative of MuMax 3.10: (c) Arne Vansteenkiste, Dynamat LAB, Ghent University, Belgium 

//output directory: ref.out/
//starting GUI at http://127.0.0.1:35367
setgridsize(512, 512, 1)
setcellsize(1e-9, 1e-9, 1e-9)
setpbc(0, 0, 0)
//resizing...
// Initializing geometry 0 %
// Initializing geometry 100 %
edgesmooth = 3
msat = 956e3
aex = 10e-12
alpha = 0.03
setgeom(circle(500e-9))
m = vortex(1, 1)
minimize()
//Using cached kernel: /tmp/uMagNUS64kernel_[512 512 1]_[0 0 0]_[1e-09 1e-09 1e-09]_6_
reduceBuf failed: cl: Invalid Value 
WaitForEvents failed in maxvecnorm: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
WaitForEvents failed in maxvecnorm: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
WaitForEvents failed at index 0 in dot: cl: error -9999 
WaitForEvents failed at index 2 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
WaitForEvents failed at index 1 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
First WaitForEvents in reduceBuf failed: cl: error -9999 
WaitForEvents failed at index 0 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
WaitForEvents failed at index 1 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
WaitForEvents failed at index 2 in dot: cl: error -9999 
First WaitForEvents in MemCpyDtoH failed: cl: error -9999 
First WaitForEvents in MemCpy failed: cl: error -9999 
//********************************************************************//
//Please cite the following references, relevant for your simulation. //
//See bibtex file in output folder for justification.                 //
//********************************************************************//
//   * Vansteenkiste et al., AIP Adv. 4, 107133 (2014).
//   * Exl et al., J. Appl. Phys. 115, 17D118 (2014).
panic: runtime error: index out of range [0] with length 0

goroutine 1 [running, locked to thread]:
github.com/seeder-research/uMagNUS/data64.Copy(0xc000764050, 0xc00034efa0)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/data64/slice.go:246 +0x51c
github.com/seeder-research/uMagNUS/engine64.(*Minimizer).Step(0xc000748600)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/minimizer.go:72 +0x169
github.com/seeder-research/uMagNUS/engine64.step(0x1)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:239 +0x33
github.com/seeder-research/uMagNUS/engine64.runWhile(0xc000147620, 0xc0?)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:224 +0xa9
github.com/seeder-research/uMagNUS/engine64.RunWhile(0xc000010298?)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/run.go:208 +0x4e
github.com/seeder-research/uMagNUS/engine64.Minimize()
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/minimizer.go:163 +0x23e
reflect.Value.call({0x989760?, 0xab0a08?, 0xab0a08?}, {0xa2c9ff, 0x4}, {0x110da80, 0x0, 0xc000147d60?})
        /usr/local/go/src/reflect/value.go:556 +0x845
reflect.Value.Call({0x989760?, 0xab0a08?, 0x1?}, {0x110da80, 0x0, 0x0})
        /usr/local/go/src/reflect/value.go:339 +0xbf
github.com/seeder-research/uMagNUS/script64.(*call).Eval(0xc00074c2a0)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/script64/call.go:61 +0x20f
github.com/seeder-research/uMagNUS/engine64.EvalFile(0xc000021fb0)
        /home/mat/go/src/github.com/seeder-research/uMagNUS/engine64/script.go:103 +0x3e
main.runScript({0x7ffeea82ad66, 0xb45f00?})
        /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:156 +0x145
main.runFileAndServe({0x7ffeea82ad66?, 0x0?})
        /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:127 +0x74
main.main()
        /home/mat/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64/main.go:94 +0x29f

xfong commented 2 years ago

Could you please try the latest develop branch?

MathieuMoalic commented 2 years ago

debug.log

xfong commented 2 years ago

Please try branch 2.2.2. There was a slight difference that I missed in uMagNUS64, which is fixed in that branch.

MathieuMoalic commented 2 years ago

It's fixed! I ran 30 simulations on both uMagNUS and uMagNUS64 with no errors. I might have noticed an improvement in simulation time as well. It is fair to assume that NVIDIA's own cuFFT library is incredibly well optimized and that catching up to its performance will be a challenge.

xfong commented 2 years ago

Thanks, Mathieu. I'm still observing roughly 2x slower simulations compared to mumax3. I will keep this issue open for the time being as I further optimize the kernel launches.

xfong commented 1 year ago

Hey Mathieu. I just updated the reduction functions in the develop branch to force use of Nvidia's atomic instructions. The update should improve performance on Nvidia cards. Please test the updated code out when you have time and let me know the performance results you have.

MathieuMoalic commented 1 year ago

Hi, I can't compile the cl-compiler target at the moment. It's possible I have some OpenCL component missing, as this is a freshly installed OS, although I checked and I do have opencl-nvidia installed.

make -C ./cl install
make[1]: Entering directory '/home/mat/gh/uMagNUS/cl'
make -C ./stubs all
make[2]: Entering directory '/home/mat/gh/uMagNUS/cl/stubs'
make -C ./lib all
make[3]: Entering directory '/home/mat/gh/uMagNUS/cl/stubs/lib'
gcc -shared -fPIC -Wall -O4 -I../include cl120.cc -o libOpenCL.so.1.2.0
ln -sf libOpenCL.so.1.2.0 libOpenCL.so
make[3]: Leaving directory '/home/mat/gh/uMagNUS/cl/stubs/lib'
make[2]: Leaving directory '/home/mat/gh/uMagNUS/cl/stubs'
go install -v -compiler gc
# github.com/seeder-research/uMagNUS/cl
In file included from _cgo_export.c:4:
memory.go: In function ‘CLGetMemObjectInfoParamSize’:
memory.go:28:48: warning: passing argument 3 of ‘clGetMemObjectInfo’ makes integer from pointer without a cast [-Wint-conversion]
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from context.go:7:
./stubs/include/CL/cl.h:1159:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1159 |                    size_t           param_value_size,
      |                    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘CLGetProgramInfoParamSize’:
program.go:63:54: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘CLGetProgramBuildInfoParamSize’:
program.go:77:67: warning: passing argument 4 of ‘clGetProgramBuildInfo’ makes integer from pointer without a cast [-Wint-conversion]
./stubs/include/CL/cl.h:1346:45: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1346 |                       size_t                param_value_size,
      |                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘CLGetProgramBinary’:
program.go:93:67: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘bytecpy’:
program.go:167:17: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
program.go: In function ‘setPtrs’:
program.go:179:13: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
# github.com/seeder-research/uMagNUS/cl
./device.go: In function ‘CLGetDeviceInfoParamSize’:
./device.go:76:52: warning: passing argument 3 of ‘clGetDeviceInfo’ makes integer from pointer without a cast [-Wint-conversion]
   76 |         return clGetDeviceInfo(device, param_name, NULL, NULL, param_value_size_ret);
      |                                                    ^~~~
      |                                                    |
      |                                                    void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./device.go:4:
./stubs/include/CL/cl.h:969:33: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
  969 |                 size_t          param_value_size,
      |                 ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
# github.com/seeder-research/uMagNUS/cl
./memory.go: In function ‘CLGetMemObjectInfoParamSize’:
./memory.go:28:55: warning: passing argument 3 of ‘clGetMemObjectInfo’ makes integer from pointer without a cast [-Wint-conversion]
   28 |         return clGetMemObjectInfo(memobj, param_name, NULL, NULL, param_value_size_ret);
      |                                                       ^~~~
      |                                                       |
      |                                                       void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./memory.go:4:
./stubs/include/CL/cl.h:1159:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1159 |                    size_t           param_value_size,
      |                    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
# github.com/seeder-research/uMagNUS/cl
./platform.go: In function ‘CLGetPlatformInfoParamSize’:
./platform.go:9:52: warning: passing argument 3 of ‘clGetPlatformInfo’ makes integer from pointer without a cast [-Wint-conversion]
    9 |     return clGetPlatformInfo(platform, param_name, NULL, NULL, param_value_size_ret);
      |                                                    ^~~~
      |                                                    |
      |                                                    void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./platform.go:4:
./stubs/include/CL/cl.h:954:36: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
  954 |                   size_t           param_value_size,
      |                   ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
# github.com/seeder-research/uMagNUS/cl
./program.go: In function ‘CLGetProgramInfoParamSize’:
./program.go:63:54: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
   63 |         return clGetProgramInfo(program, param_name, NULL, NULL, param_value_ret_size);
      |                                                      ^~~~
      |                                                      |
      |                                                      void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./program.go:4:
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./program.go: In function ‘CLGetProgramBuildInfoParamSize’:
./program.go:77:67: warning: passing argument 4 of ‘clGetProgramBuildInfo’ makes integer from pointer without a cast [-Wint-conversion]
   77 |         return clGetProgramBuildInfo(program, device, param_name, NULL, NULL, param_value_ret_size);
      |                                                                   ^~~~
      |                                                                   |
      |                                                                   void *
./stubs/include/CL/cl.h:1346:45: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1346 |                       size_t                param_value_size,
      |                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./program.go: In function ‘CLGetProgramBinary’:
./program.go:93:74: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
   93 |         cl_int err0 = clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES, NULL, NULL, &param_value_size_ret);
      |                                                                          ^~~~
      |                                                                          |
      |                                                                          void *
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./program.go: In function ‘bytecpy’:
./program.go:167:24: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  167 |         void *srcPtr = src + srcOffset;
      |                        ^~~
./program.go: In function ‘setPtrs’:
./program.go:179:34: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  179 |                         dst[idx] = &src[srcOffset];
      |                                  ^
# github.com/seeder-research/uMagNUS/cl
./queue.go: In function ‘CLGetCommandQueueInfoParamSize’:
./queue.go:9:65: warning: passing argument 3 of ‘clGetCommandQueueInfo’ makes integer from pointer without a cast [-Wint-conversion]
    9 |         return clGetCommandQueueInfo(command_queue, param_name, NULL, NULL, param_value_size_ret);
      |                                                                 ^~~~
      |                                                                 |
      |                                                                 void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./queue.go:4:
./stubs/include/CL/cl.h:1074:45: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1074 |                       size_t                param_value_size,
      |                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
make[1]: Leaving directory '/home/mat/gh/uMagNUS/cl'
make -C ./cmd/uMagNUS-clCompiler all
make[1]: Entering directory '/home/mat/gh/uMagNUS/cmd/uMagNUS-clCompiler'
go install -v
# github.com/seeder-research/uMagNUS/cl
In file included from _cgo_export.c:4:
memory.go: In function ‘CLGetMemObjectInfoParamSize’:
memory.go:28:48: warning: passing argument 3 of ‘clGetMemObjectInfo’ makes integer from pointer without a cast [-Wint-conversion]
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from context.go:7:
./stubs/include/CL/cl.h:1159:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1159 |                    size_t           param_value_size,
      |                    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘CLGetProgramInfoParamSize’:
program.go:63:54: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘CLGetProgramBuildInfoParamSize’:
program.go:77:67: warning: passing argument 4 of ‘clGetProgramBuildInfo’ makes integer from pointer without a cast [-Wint-conversion]
./stubs/include/CL/cl.h:1346:45: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1346 |                       size_t                param_value_size,
      |                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘CLGetProgramBinary’:
program.go:93:67: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
program.go: In function ‘bytecpy’:
program.go:167:17: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
program.go: In function ‘setPtrs’:
program.go:179:13: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
# github.com/seeder-research/uMagNUS/cl
./device.go: In function ‘CLGetDeviceInfoParamSize’:
./device.go:76:52: warning: passing argument 3 of ‘clGetDeviceInfo’ makes integer from pointer without a cast [-Wint-conversion]
   76 |         return clGetDeviceInfo(device, param_name, NULL, NULL, param_value_size_ret);
      |                                                    ^~~~
      |                                                    |
      |                                                    void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./device.go:4:
./stubs/include/CL/cl.h:969:33: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
  969 |                 size_t          param_value_size,
      |                 ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
# github.com/seeder-research/uMagNUS/cl
./memory.go: In function ‘CLGetMemObjectInfoParamSize’:
./memory.go:28:55: warning: passing argument 3 of ‘clGetMemObjectInfo’ makes integer from pointer without a cast [-Wint-conversion]
   28 |         return clGetMemObjectInfo(memobj, param_name, NULL, NULL, param_value_size_ret);
      |                                                       ^~~~
      |                                                       |
      |                                                       void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./memory.go:4:
./stubs/include/CL/cl.h:1159:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1159 |                    size_t           param_value_size,
      |                    ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
# github.com/seeder-research/uMagNUS/cl
./platform.go: In function ‘CLGetPlatformInfoParamSize’:
./platform.go:9:52: warning: passing argument 3 of ‘clGetPlatformInfo’ makes integer from pointer without a cast [-Wint-conversion]
    9 |     return clGetPlatformInfo(platform, param_name, NULL, NULL, param_value_size_ret);
      |                                                    ^~~~
      |                                                    |
      |                                                    void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./platform.go:4:
./stubs/include/CL/cl.h:954:36: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
  954 |                   size_t           param_value_size,
      |                   ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
# github.com/seeder-research/uMagNUS/cl
./program.go: In function ‘CLGetProgramInfoParamSize’:
./program.go:63:54: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
   63 |         return clGetProgramInfo(program, param_name, NULL, NULL, param_value_ret_size);
      |                                                      ^~~~
      |                                                      |
      |                                                      void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./program.go:4:
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./program.go: In function ‘CLGetProgramBuildInfoParamSize’:
./program.go:77:67: warning: passing argument 4 of ‘clGetProgramBuildInfo’ makes integer from pointer without a cast [-Wint-conversion]
   77 |         return clGetProgramBuildInfo(program, device, param_name, NULL, NULL, param_value_ret_size);
      |                                                                   ^~~~
      |                                                                   |
      |                                                                   void *
./stubs/include/CL/cl.h:1346:45: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1346 |                       size_t                param_value_size,
      |                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./program.go: In function ‘CLGetProgramBinary’:
./program.go:93:74: warning: passing argument 3 of ‘clGetProgramInfo’ makes integer from pointer without a cast [-Wint-conversion]
   93 |         cl_int err0 = clGetProgramInfo(program, CL_PROGRAM_BINARY_SIZES, NULL, NULL, &param_value_size_ret);
      |                                                                          ^~~~
      |                                                                          |
      |                                                                          void *
./stubs/include/CL/cl.h:1338:37: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1338 |                  size_t             param_value_size,
      |                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./program.go: In function ‘bytecpy’:
./program.go:167:24: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  167 |         void *srcPtr = src + srcOffset;
      |                        ^~~
./program.go: In function ‘setPtrs’:
./program.go:179:34: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  179 |                         dst[idx] = &src[srcOffset];
      |                                  ^
# github.com/seeder-research/uMagNUS/cl
./queue.go: In function ‘CLGetCommandQueueInfoParamSize’:
./queue.go:9:65: warning: passing argument 3 of ‘clGetCommandQueueInfo’ makes integer from pointer without a cast [-Wint-conversion]
    9 |         return clGetCommandQueueInfo(command_queue, param_name, NULL, NULL, param_value_size_ret);
      |                                                                 ^~~~
      |                                                                 |
      |                                                                 void *
In file included from ./stubs/include/CL/opencl.h:24,
                 from ././opencl.h:14,
                 from ./queue.go:4:
./stubs/include/CL/cl.h:1074:45: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘void *’
 1074 |                       size_t                param_value_size,
      |                       ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
make[1]: Leaving directory '/home/mat/gh/uMagNUS/cmd/uMagNUS-clCompiler'
seeder-research commented 1 year ago

Hi Mathieu,

I do not see any error. The output is a bunch of warnings.
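A quick way to tell a failed build from a merely noisy one is to count the diagnostics in a saved log. The sketch below assumes the build output was captured to a file; the function name summarize_log and the file name build.log are made up for illustration. It relies only on the "warning:" and "error:" labels GCC prints and on make's "*** ... Error N" line.

```shell
# Count compiler warnings and hard failures in a captured build log.
# summarize_log is a hypothetical helper, not part of uMagNUS.
summarize_log() {
    log="$1"
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    warnings=$(grep -c 'warning: ' "$log" || true)
    errors=$(grep -c -E 'error: |\*\*\* .* Error [0-9]+' "$log" || true)
    echo "warnings=$warnings errors=$errors"
}

# Usage (assumed file name):
#   make all > build.log 2>&1; summarize_log build.log
```

A log like the one above should report a non-zero warning count and zero errors, which is why the build itself is considered successful.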

Best, Xuanyao (Kelvin) Fong

xfong commented 1 year ago

Did you clone the repo in the correct directory structure? As far as I understand, you need to first create the GOPATH directory. Then, assuming the go compiler can be found using your PATH, run:

mkdir -p ${GOPATH}/src/github.com/seeder-research && \
cd ${GOPATH}/src/github.com/seeder-research && \
git clone https://github.com/seeder-research/uMagNUS -b develop && \
cd ${GOPATH}/src/github.com/seeder-research/uMagNUS && \
make all
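Before rebuilding, it may help to check whether an existing checkout sits where the GOPATH layout expects it. The build log earlier in this thread was produced from /home/mat/gh/uMagNUS, which is outside that layout. The following is a minimal sketch; the helper names expected_repo_dir and in_gopath_layout are hypothetical, and it assumes a single-entry GOPATH that falls back to ~/go when unset.

```shell
# Print the directory the GOPATH-style layout expects for this repository.
# Assumes GOPATH holds a single path; defaults to ~/go when it is unset.
expected_repo_dir() {
    echo "${GOPATH:-$HOME/go}/src/github.com/seeder-research/uMagNUS"
}

# Succeed only if the given directory (trailing slash ignored) is that path.
in_gopath_layout() {
    [ "${1%/}" = "$(expected_repo_dir)" ]
}

# Example: check the current working directory.
if in_gopath_layout "$PWD"; then
    echo "layout ok"
else
    echo "expected checkout at $(expected_repo_dir)"
fi
```

This is a plain string comparison, so symlinked paths would need to be resolved first.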

Best, Xuanyao (Kelvin) Fong

MathieuMoalic commented 1 year ago

Indeed, I was using the new Go modules syntax. The commands you gave worked, up to a point. I guess there may be a mistake somewhere in the Makefile, but I wasn't able to fix it myself. Here is the end of the output from make libs:

make[1]: Leaving directory '/home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/script64'
make -C ./engine64 all
make[1]: Entering directory '/home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/engine64'
go install -v
make[1]: Leaving directory '/home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/engine64'
make -C ./cmd/uMagNUS64 all
make[1]: Entering directory '/home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64'
go install -v
make[1]: Leaving directory '/home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/cmd/uMagNUS64'
go install -v -compiler gc github.com/seeder-research/uMagNUS/cmd/...
rm -f ./libumagnus/*.cc
uMagNUS-clCompiler -args="-cl-opt-disable -cl-mad-enable -cl-finite-math-only -cl-single-precision-constant -cl-fp32-correctly-rounded-divide-sqrt -cl-kernel-arg-info" -std="CL1.2" -iopts="-I/home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/kernels_src" -dump /home/mat/.local/share/go/src/github.com/seeder-research/uMagNUS/kernels_src/Kernels/kernels32.h >> libumagnus/libumagnus.cc
/bin/sh: line 1: uMagNUS-clCompiler: command not found
make: *** [Makefile:108: libumagnus] Error 127

xfong commented 1 year ago

Hi Mathieu,

You need to make sure the directory the binaries are written to is in your PATH. The error says the shell cannot find the uMagNUS-clCompiler binary, which is saved to $GOPATH/bin. If you did not define $GOPATH, it should default to ~/go, so the binary would be in ~/go/bin.
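As a minimal sketch of that PATH fix, assuming a single-entry GOPATH and no GOBIN override (both are assumptions, not something the Makefile guarantees):

```shell
# Directory where "go install" places binaries under the assumptions above:
# $GOPATH/bin, or ~/go/bin when GOPATH is unset.
gobin="${GOPATH:-$HOME/go}/bin"

# Append the directory to PATH only if it is not already listed.
case ":$PATH:" in
    *":$gobin:"*) ;;
    *) PATH="$PATH:$gobin"; export PATH ;;
esac

# Report whether the compiler binary can now be resolved.
if command -v uMagNUS-clCompiler >/dev/null 2>&1; then
    echo "uMagNUS-clCompiler found"
else
    echo "uMagNUS-clCompiler not found; check $gobin"
fi
```

To make the change persistent, the same lines can go in the shell's startup file.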

In any case, the Makefiles have been updated in the latest "develop" branch. You only need to make sure the "go" binary is in your path. Clone the repo and run "make all". The binaries will be created in a "gopath" directory. The umagnus libraries will be created in the "libumagnus" directory.

MathieuMoalic commented 1 year ago

Oh thanks, what a silly mistake I made. Anyway, I'm afraid the performance has not improved. I ran the following script:

setgridsize(512,512,1)
setcellsize(1e-9,1e-9,1e-9)
setpbc(0,0,0)
edgesmooth=3
msat = 956e3
aex = 10e-12
alpha = 0.03
setgeom(circle(500e-9))
m = vortex(1,1)
minimize()
B_ext = vector(0, 1e-2*sin(2*pi*12e9*t), 0)
run(1e-9)

Time to complete: Mumax3: 22.89 s, uMagnus: 175.58 s, uMagnus64: 403.15 s

Average GPU power draw: Mumax3: 345/350 W, uMagnus/uMagnus64: 225/350 W

Average CPU utilization: Mumax3: 3.1%, uMagnus/uMagnus64: 7.2%
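For reference, the run times above can be turned into slowdown factors with a little arithmetic. The helper name slowdown is just for illustration; it divides a run time by the Mumax3 baseline and rounds to one decimal.

```shell
# slowdown BASE TIME: print TIME / BASE rounded to one decimal place.
slowdown() {
  awk -v base="$1" -v t="$2" 'BEGIN { printf "%.1f", t / base }'
}

# Run times in seconds, taken from the results above.
echo "uMagnus:   $(slowdown 22.89 175.58)x the Mumax3 run time"   # 7.7x
echo "uMagnus64: $(slowdown 22.89 403.15)x the Mumax3 run time"   # 17.6x
```

So on this script the 32-bit build takes roughly 7.7 times as long as Mumax3, and the 64-bit build roughly 17.6 times as long.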

xfong commented 1 year ago

Are you running with the updated libumagnus library? Just wanted to confirm.

Best,
Xuanyao (Kelvin) Fong

(Sent by email in reply to Mathieu Moalic's comment of Sep 29, 2022, 15:23.)


MathieuMoalic commented 1 year ago

Yes, it's the updated libumagnus library

jplauzie commented 1 year ago

Hi,

I downloaded the new develop branch and ran the benchmark bench.mx3. It's the one the mumax guys use for their GPU comparisons, so I figured it would be a good test. I slightly modified it to stop at a smaller size, e=12 instead of e=14 as in theirs, because my 1070 Ti runs out of memory after that. I uploaded it as well (extension changed from .mx3 to .txt, since GitHub doesn't seem to like uploading a .mx3).

It's consistent with a ~50% penalty. I also ran the script Mathieu posted above: 142.96 s on mumax and 240.01 s on uMagNUS. This is for an Nvidia 1070 Ti on Windows.
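For anyone who wants to repeat this comparison from a POSIX shell (on Windows that would mean something like Git Bash or WSL), here is a minimal timing sketch. The helper name time_run is made up, and the binary names in the usage comments are assumptions about how the programs are installed.

```shell
# time_run CMD [ARGS...]: run a command, discard its output, and print the
# elapsed wall-clock time in whole seconds.
time_run() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo "$((end - start))"
}

# Example usage (binary names are assumptions; adjust to your install):
#   echo "mumax3:  $(time_run mumax3 bench.mx3) s"
#   echo "uMagNUS: $(time_run uMagNUS bench.mx3) s"
```

Wall-clock time measured this way includes the kernel compilation that the generic uMagNUS build performs at launch (reported earlier in this thread as about 3 to 5 seconds), so short scripts will overstate the gap.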

In case it's useful, I attached the benchmark files, as well as the output from the -sync argument (which gives some timing). Most of the penalty for umagnus still seems to come from reducemaxvecdiff2 and reducemaxvecnorm2 (probably because of the reduction/atomic operations): they're ~3 orders of magnitude less efficient. I varied the number of steps in bench.mx3 (100, 1000, and 10000) to get an idea of the scaling.

I did have some issues with the install, with the updated makefile (I think Windows-specific, in how it was parsing the path), but I'll put that in a separate issue.

Best regards, Josh Lauzier

bench.txt mumaxbenchmark100steps.txt mumaxbenchmark1000steps.txt mumaxbenchmark10000steps.txt

umagnusbenchmark100steps.txt umagnusbenchmark1000steps.txt umagnusbenchmark10000steps.txt

umagnusbench100stepssyncoutput.txt umagnusbench1000stepssyncoutput.txt umagnus10000stepssyncoutput.txt mumaxbench100stepssyncoutput.txt mumaxbench1000stepssyncoutput.txt mumaxbench10000stepssyncoutput.txt