spacemeshos / post

Spacemesh POST protocol implementation
MIT License

Post-rs integration creates a new initializer for every batch #138

Closed (pigmej closed 1 year ago)

pigmej commented 1 year ago
2023/05/12 09:33:41     DEBUG   initialization: file #10 current position: 9437184, remaining: 57671680
Using provider: [GPU] NVIDIA CUDA/NVIDIA GeForce RTX 3090
device memory: 24259 MB, max_mem_alloc_size: 6064 MB, max_compute_units: 82, max_wg_size: 1024
preferred_wg_size_multiple: 32, kernel_wg_size: 256
Using: global_work_size: 12128, local_work_size: 32
Allocating buffer for input: 32 bytes
Allocating buffer for output: 388096 bytes
Allocating buffer for lookup: 6358564864 bytes
2023/05/12 09:33:46     DEBUG   initialization: file #10 current position: 10485760, remaining: 56623104
Using provider: [GPU] NVIDIA CUDA/NVIDIA GeForce RTX 3090
device memory: 24259 MB, max_mem_alloc_size: 6064 MB, max_compute_units: 82, max_wg_size: 1024
preferred_wg_size_multiple: 32, kernel_wg_size: 256
Using: global_work_size: 12128, local_work_size: 32
Allocating buffer for input: 32 bytes
Allocating buffer for output: 388096 bytes
Allocating buffer for lookup: 6358564864 bytes
2023/05/12 09:33:50     DEBUG   initialization: file #10 current position: 11534336, remaining: 55574528
Using provider: [GPU] NVIDIA CUDA/NVIDIA GeForce RTX 3090
device memory: 24259 MB, max_mem_alloc_size: 6064 MB, max_compute_units: 82, max_wg_size: 1024
preferred_wg_size_multiple: 32, kernel_wg_size: 256
Using: global_work_size: 12128, local_work_size: 32
Allocating buffer for input: 32 bytes
Allocating buffer for output: 388096 bytes
Allocating buffer for lookup: 6358564864 bytes
2023/05/12 09:33:54     DEBUG   initialization: file #10 current position: 12582912, remaining: 54525952
Using provider: [GPU] NVIDIA CUDA/NVIDIA GeForce RTX 3090
device memory: 24259 MB, max_mem_alloc_size: 6064 MB, max_compute_units: 82, max_wg_size: 1024
preferred_wg_size_multiple: 32, kernel_wg_size: 256
Using: global_work_size: 12128, local_work_size: 32
Allocating buffer for input: 32 bytes
Allocating buffer for output: 388096 bytes
Allocating buffer for lookup: 6358564864 bytes
2023/05/12 09:33:59     DEBUG   initialization: file #10 current position: 13631488, remaining: 53477376
Using provider: [GPU] NVIDIA CUDA/NVIDIA GeForce RTX 3090
device memory: 24259 MB, max_mem_alloc_size: 6064 MB, max_compute_units: 82, max_wg_size: 1024
preferred_wg_size_multiple: 32, kernel_wg_size: 256
Using: global_work_size: 12128, local_work_size: 32
Allocating buffer for input: 32 bytes
Allocating buffer for output: 388096 bytes
Allocating buffer for lookup: 6358564864 bytes

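It should not work like that. As the log above shows, every batch selects the provider again and reallocates the input, output and lookup buffers (the lookup buffer alone is 6358564864 bytes). That causes a significant performance drop (more than 50%).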
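To illustrate the difference, here is a minimal sketch in Python. It is not the post-rs or post API: `Initializer`, `init_per_batch` and `init_reused` are hypothetical names, and the "setup" is only a counter standing in for provider selection and buffer allocation. It just shows that creating the initializer inside the batch loop repeats the setup for every batch, while creating it once before the loop does not.

```python
class Initializer:
    """Hypothetical stand-in for a GPU initializer (not the post-rs API)."""

    created = 0  # how many times the expensive setup has run

    def __init__(self, lookup_bytes):
        # Stands in for selecting the provider and allocating the
        # input, output and lookup buffers seen in the log.
        Initializer.created += 1
        self.lookup_bytes = lookup_bytes

    def compute(self, start, end):
        # Placeholder for the real work: one dummy value per index.
        return list(range(start, end))


def batches(total, batch_size):
    """Yield (start, end) index pairs covering [0, total)."""
    for start in range(0, total, batch_size):
        yield start, min(start + batch_size, total)


def init_per_batch(total, batch_size, lookup_bytes):
    """Pattern reported in this issue: setup repeated for every batch."""
    out = []
    for start, end in batches(total, batch_size):
        init = Initializer(lookup_bytes)
        out.extend(init.compute(start, end))
    return out


def init_reused(total, batch_size, lookup_bytes):
    """Expected pattern: setup once, then reuse across all batches."""
    init = Initializer(lookup_bytes)
    out = []
    for start, end in batches(total, batch_size):
        out.extend(init.compute(start, end))
    return out
```

Both functions return the same data; only the number of `Initializer` constructions differs (one per batch versus one in total).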