taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0

Different instances of the same data oriented class trigger re-compilation #8321

Open erizmr opened 1 year ago

erizmr commented 1 year ago

Describe the bug: Different instances of the same data-oriented class trigger re-compilation. Is this expected behavior?

To Reproduce

import taichi as ti
import numpy as np

ti.init(offline_cache=False, log_level=ti.TRACE, print_ir=False)

@ti.data_oriented
class compile_test:
    def __init__(self):
        self.n = 100
        self.particles = ti.field(float, shape=(100, 3))

    @ti.kernel
    def set_vel(self, f: ti.i32, vel: ti.types.ndarray()):
        for i in range(self.n):
            for k in ti.static(range(3)):
                self.particles[f, k] = vel[i, k]

for i in range(10):
    obj = compile_test()
    obj.set_vel(i, np.ones((100, 3), dtype=np.float32) * float(i+1))

Log/Screenshots: The trace log shows that set_vel is compiled 10 times, i.e., once for each instance.

[T 08/16/23 01:30:58.177 776973] [program.cpp:Program@58] Program initializing...
[T 08/16/23 01:30:58.177 776973] [snode_tree_buffer_manager.cpp:SNodeTreeBufferManager@9] SNode tree buffer manager created.
[D 08/16/23 01:30:58.177 776993] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 08/16/23 01:30:58.177 776993] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 08/16/23 01:30:58.177 776991] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 08/16/23 01:30:58.177 776991] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 08/16/23 01:30:58.177 776994] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 08/16/23 01:30:58.177 776992] [parallel_executor.cpp:worker_loop@71] Starting worker thread.
[D 08/16/23 01:30:58.177 776992] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[D 08/16/23 01:30:58.177 776994] [parallel_executor.cpp:worker_loop@86] Worker thread initialized and running.
[T 08/16/23 01:30:58.177 776973] [llvm_context.cpp:TaichiLLVMContext@73] Creating Taichi llvm context for arch: x64
[T 08/16/23 01:30:58.177 776973] [llvm_context.cpp:get_this_thread_data@869] Creating thread local data for thread 140177317147840
[T 08/16/23 01:30:58.192 776973] [llvm_context.cpp:TaichiLLVMContext@137] Taichi llvm context created.
[T 08/16/23 01:30:58.197 776973] [program.cpp:Program@170] Program (0x273e940) arch=x64 initialized.
[T 08/16/23 01:30:58.198 776973] [misc.py:init@462] Materializing runtime...
[T 08/16/23 01:30:58.198 776973] [host_memory_pool.cpp:HostMemoryPool@17] Memory pool created. Default buffer size per allocator = 1024 MB
[T 08/16/23 01:30:58.198 776973] [unified_allocator.cpp:allocate@72] Allocating virtual address space of size 1024 MB
[T 08/16/23 01:30:58.198 776973] [llvm_runtime_executor.cpp:materialize_runtime@661] Launching runtime_initialize
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=35240 remain=1073741824
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=1048576 remain=1073706584
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=320 remain=1072656384
[T 08/16/23 01:30:58.257 776973] [llvm_runtime_executor.cpp:materialize_runtime@671] LLVMRuntime initialized (excluding `root`)
[T 08/16/23 01:30:58.257 776973] [llvm_runtime_executor.cpp:materialize_runtime@674] LLVMRuntime pointer fetched
[T 08/16/23 01:30:58.257 776973] [llvm_runtime_executor.cpp:materialize_runtime@682] Initializing 16 random states (serially)
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=56 remain=1072656064
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=1048616 remain=1072652232
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=1048616 remain=1071599576
[T 08/16/23 01:30:58.257 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=1048616 remain=1070546904
[T 08/16/23 01:30:58.258 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.258 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 0
[T 08/16/23 01:30:58.258 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069494232
[T 08/16/23 01:30:58.258 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_0 in AutodiffMode.NONE...
[D 08/16/23 01:30:58.272 776973] [kernel_compilation_manager.cpp:KernelCompilationManager@56] Create KernelCompilationManager with offline_cache_file_path = /home/mingrui/.cache/taichi/ticache
[T 08/16/23 01:30:58.272 776992] [llvm_context.cpp:get_this_thread_data@869] Creating thread local data for thread 140176789767936
[T 08/16/23 01:30:58.272 776992] [llvm_context.cpp:clone_module_to_this_thread_context@239] Cloning struct module
[T 08/16/23 01:30:58.307 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.307 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 1
[T 08/16/23 01:30:58.307 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069486080
[T 08/16/23 01:30:58.308 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_1 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.310 776994] [llvm_context.cpp:get_this_thread_data@869] Creating thread local data for thread 140177249552128
[T 08/16/23 01:30:58.310 776994] [llvm_context.cpp:clone_module_to_this_thread_context@239] Cloning struct module
[T 08/16/23 01:30:58.310 776994] [llvm_context.cpp:clone_module_to_this_thread_context@239] Cloning struct module
[T 08/16/23 01:30:58.344 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.344 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 2
[T 08/16/23 01:30:58.344 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069481984
[T 08/16/23 01:30:58.344 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_2 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.346 776993] [llvm_context.cpp:get_this_thread_data@869] Creating thread local data for thread 140176831715072
[T 08/16/23 01:30:58.346 776993] [llvm_context.cpp:clone_module_to_this_thread_context@239] Cloning struct module
[T 08/16/23 01:30:58.347 776993] [llvm_context.cpp:clone_module_to_this_thread_context@239] Cloning struct module
[T 08/16/23 01:30:58.347 776993] [llvm_context.cpp:clone_module_to_this_thread_context@239] Cloning struct module
[T 08/16/23 01:30:58.382 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.382 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 3
[T 08/16/23 01:30:58.382 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069477888
[T 08/16/23 01:30:58.382 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_3 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.412 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.412 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 4
[T 08/16/23 01:30:58.412 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069473792
[T 08/16/23 01:30:58.412 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_4 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.441 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.441 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 5
[T 08/16/23 01:30:58.441 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069469696
[T 08/16/23 01:30:58.441 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_5 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.470 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.470 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 6
[T 08/16/23 01:30:58.470 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069465600
[T 08/16/23 01:30:58.470 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_6 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.500 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.500 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 7
[T 08/16/23 01:30:58.500 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069461504
[T 08/16/23 01:30:58.500 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_7 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.530 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.530 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 8
[T 08/16/23 01:30:58.530 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069457408
[T 08/16/23 01:30:58.530 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_8 in AutodiffMode.NONE...
[T 08/16/23 01:30:58.559 776973] [llvm_runtime_executor.cpp:initialize_llvm_runtime_snodes@401] Allocating data structure of size 1200 bytes
[T 08/16/23 01:30:58.560 776973] [snode_tree_buffer_manager.cpp:allocate@41] allocating memory for SNode Tree 9
[T 08/16/23 01:30:58.560 776973] [unified_allocator.cpp:allocate@51] UM [data=140175048966144] allocate() request=4096 remain=1069453312
[T 08/16/23 01:30:58.560 776973] [kernel_impl.py:materialize@566] Compiling kernel set_vel_c74_9 in AutodiffMode.NONE...
...
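The compile count can also be checked mechanically instead of by eye. The following is a minimal sketch that counts "Compiling kernel" lines per kernel in a captured trace log. It assumes the kernel names follow the `<name>_c<N>_<M>` pattern seen in the log above; the meaning of the two numeric suffixes is not documented here and is only inferred from the output. The helper name `count_compilations` and the sample lines are hypothetical, modeled on the log format.

```python
import re
from collections import Counter

# Matches trace lines such as:
#   "... Compiling kernel set_vel_c74_0 in AutodiffMode.NONE..."
# The "_c<N>_<M>" suffix pattern is an assumption inferred from the log above.
COMPILE_RE = re.compile(r"Compiling kernel (\w+?)_c(\d+)_(\d+) in ")


def count_compilations(log_text):
    """Return {kernel base name: number of 'Compiling kernel' lines}."""
    counts = Counter()
    for line in log_text.splitlines():
        match = COMPILE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)


# Synthetic sample lines modeled on the trace output; not a real capture.
sample = "\n".join(
    "[T 08/16/23 01:30:58.000 776973] [kernel_impl.py:materialize@566] "
    f"Compiling kernel set_vel_c74_{i} in AutodiffMode.NONE..."
    for i in range(10)
)

print(count_compilations(sample))  # {'set_vel': 10}
```

Running this over the captured stdout of the reproduction script should report one entry for set_vel with a count equal to the number of instances created, if the behavior described above holds.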
jim19930609 commented 1 year ago

Interesting, this indicates that we don't handle the data-oriented class type correctly. I actually suspect that we have never treated a data-oriented class as a type during compilation.

erizmr commented 1 year ago

Thanks for the reply. What do you think, and how much effort would it take to fix this?

jim19930609 commented 1 year ago

Hmm, I'm afraid it's going to take a while to implement that for data_oriented. An alternative you could look at is ArgumentPack: https://docs.taichi-lang.org/docs/master/argument_pack. Consider rewriting your code with ArgumentPack instead.