Raphael-Hao / brainstorm

Compiler for Dynamic Neural Networks

Failed to run `python linear_3078_768.py` #50

Closed: mazhaojia123 closed this issue 6 months ago

mazhaojia123 commented 6 months ago

Hello, @Raphael-Hao.

I ran the following commands, but they failed. I am using the Docker image started with `docker run --name brt_ae -ti ghcr.io/raphael-hao/brt:latest /bin/bash`.


cd brainstorm/benchmark/micro/kernels
python linear_3078_768.py

The output is as follows.

root@9d0f6086403f:~/brainstorm_project/brainstorm/benchmark/micro/kernels# python linear_3078_768.py
Get devices for measurement successfully!
#### # Linerar 224 3072 768
/brainstorm_project/brainstorm/.cache/log/kernel_tune/NVIDIA_A100_PCIE_40GB/Linear_forward_input_0_224_3072_output_0_224_768_in_features_3072_out_features_768.json
#### Find incomplete record, continue
/opt/miniconda3/lib/python3.8/site-packages/xgboost/training.py:17: UserWarning: Old style callback is deprecated.  See: https://xgboost.readthedocs.io/en/latest/python/callbacks.html
  warnings.warn(f'Old style callback is deprecated.  See: {link}', UserWarning)
----------------------------------------------------------------------
------------------------------  [ Call init-search callbacks ]
----------------------------------------------------------------------
SearchPolicy: Loaded 10 measurement records from /brainstorm_project/brainstorm/.cache/log/kernel_tune/NVIDIA_A100_PCIE_40GB/Linear_forward_input_0_224_3072_output_0_224_768_in_features_3072_out_features_768.json for ["218090bc617f9a36fbabdf253ec28064", [224, 3072], [768, 3072], [224, 768]]
----------------------------------------------------------------------
------------------------------  [ Task Scheduler ]
----------------------------------------------------------------------
|  ID  |                       Task Description                        | Latency (ms) | Speed (GFLOPS) | Trials |
-----------------------------------------------------------------------------------------------------------------
|    0 |                                         vm_mod_fused_nn_dense |        1.261 |         838.41 |     10 |
-----------------------------------------------------------------------------------------------------------------
Estimated total latency: 1.261 ms       Trials: 0       Used time : 1 s Next ID: 0
----------------------------------------------------------------------
------------------------------  [ Search ]
----------------------------------------------------------------------
Generate Sketches               #s: 1
Sample Initial Population       #s: 69  fail_ct: 1979   Time elapsed: 0.72
GA Iter: 0      Max score: 0.9915       Min score: 0.6435       #Pop: 20        #M+: 0  #M-: 0
GA Iter: 4      Max score: 0.9999       Min score: 0.9970       #Pop: 20        #M+: 1390       #M-: 0
EvolutionarySearch              #s: 20  Time elapsed: 5.04
----------------------------------------------------------------------
------------------------------  [ Measure ]
----------------------------------------------------------------------
Get 10 programs to measure:
..........
**********Time elapsed for measurement: 25.66 s
----------------------------------------------------------------------
------------------------------  [ Train cost model ]
----------------------------------------------------------------------
Time elapsed for training: 0.12 s
Get devices for measurement successfully!
#### # Linerar 320 3072 768
/brainstorm_project/brainstorm/.cache/log/kernel_tune/NVIDIA_A100_PCIE_40GB/Linear_forward_input_0_320_3072_output_0_320_768_in_features_3072_out_features_768.json
#### Find incomplete record, continue
/opt/miniconda3/lib/python3.8/site-packages/xgboost/training.py:17: UserWarning: Old style callback is deprecated.  See: https://xgboost.readthedocs.io/en/latest/python/callbacks.html
  warnings.warn(f'Old style callback is deprecated.  See: {link}', UserWarning)
----------------------------------------------------------------------
------------------------------  [ Call init-search callbacks ]
----------------------------------------------------------------------
SearchPolicy: Loaded 10 measurement records from /brainstorm_project/brainstorm/.cache/log/kernel_tune/NVIDIA_A100_PCIE_40GB/Linear_forward_input_0_320_3072_output_0_320_768_in_features_3072_out_features_768.json for ["218090bc617f9a36fbabdf253ec28064", [320, 3072], [768, 3072], [320, 768]]
----------------------------------------------------------------------
------------------------------  [ Task Scheduler ]
----------------------------------------------------------------------
|  ID  |                       Task Description                        | Latency (ms) | Speed (GFLOPS) | Trials |
-----------------------------------------------------------------------------------------------------------------
|    0 |                                         vm_mod_fused_nn_dense |        1.427 |        1058.22 |     10 |
-----------------------------------------------------------------------------------------------------------------
Estimated total latency: 1.427 ms       Trials: 0       Used time : 0 s Next ID: 0
----------------------------------------------------------------------
------------------------------  [ Search ]
----------------------------------------------------------------------
Generate Sketches               #s: 1
Sample Initial Population       #s: 80  fail_ct: 1968   Time elapsed: 0.40
GA Iter: 0      Max score: 0.9939       Min score: 0.7163       #Pop: 20        #M+: 0  #M-: 0
GA Iter: 4      Max score: 0.9999       Min score: 0.9962       #Pop: 20        #M+: 1386       #M-: 0
EvolutionarySearch              #s: 20  Time elapsed: 5.26
----------------------------------------------------------------------
------------------------------  [ Measure ]
----------------------------------------------------------------------
Get 10 programs to measure:
..........
**********Time elapsed for measurement: 24.78 s
----------------------------------------------------------------------
------------------------------  [ Train cost model ]
----------------------------------------------------------------------
Time elapsed for training: 0.13 s
Get devices for measurement successfully!
#### # Linerar 416 3072 768
/brainstorm_project/brainstorm/.cache/log/kernel_tune/NVIDIA_A100_PCIE_40GB/Linear_forward_input_0_416_3072_output_0_416_768_in_features_3072_out_features_768.json
#### Find incomplete record, continue
/opt/miniconda3/lib/python3.8/site-packages/xgboost/training.py:17: UserWarning: Old style callback is deprecated.  See: https://xgboost.readthedocs.io/en/latest/python/callbacks.html
  warnings.warn(f'Old style callback is deprecated.  See: {link}', UserWarning)
----------------------------------------------------------------------
------------------------------  [ Call init-search callbacks ]
----------------------------------------------------------------------
SearchPolicy: Loaded 10 measurement records from /brainstorm_project/brainstorm/.cache/log/kernel_tune/NVIDIA_A100_PCIE_40GB/Linear_forward_input_0_416_3072_output_0_416_768_in_features_3072_out_features_768.json for ["218090bc617f9a36fbabdf253ec28064", [416, 3072], [768, 3072], [416, 768]]
----------------------------------------------------------------------
------------------------------  [ Task Scheduler ]
----------------------------------------------------------------------
|  ID  |                       Task Description                        | Latency (ms) | Speed (GFLOPS) | Trials |
-----------------------------------------------------------------------------------------------------------------
|    0 |                                         vm_mod_fused_nn_dense |        1.262 |        1555.10 |     10 |
-----------------------------------------------------------------------------------------------------------------
Estimated total latency: 1.262 ms       Trials: 0       Used time : 0 s Next ID: 0
----------------------------------------------------------------------
------------------------------  [ Search ]
----------------------------------------------------------------------
Generate Sketches               #s: 1
Sample Initial Population       #s: 97  fail_ct: 1951   Time elapsed: 0.41
GA Iter: 0      Max score: 0.9900       Min score: 0.8812       #Pop: 20        #M+: 0  #M-: 0
GA Iter: 4      Max score: 0.9999       Min score: 0.9977       #Pop: 20        #M+: 1402       #M-: 0
EvolutionarySearch              #s: 20  Time elapsed: 5.08
----------------------------------------------------------------------
------------------------------  [ Measure ]
----------------------------------------------------------------------
Get 10 programs to measure:
..........
**********Time elapsed for measurement: 21.41 s
----------------------------------------------------------------------
------------------------------  [ Train cost model ]
----------------------------------------------------------------------
Time elapsed for training: 0.13 s
Exception ignored in: <function LocalRPCMeasureContext.__del__ at 0x7fb390c59b80>
Traceback (most recent call last):
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/auto_scheduler/measure.py", line 588, in __del__
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/tracker.py", line 468, in terminate
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/tracker.py", line 455, in _stop_tracker
AttributeError: 'NoneType' object has no attribute 'socket'
Exception ignored in: <function Tracker.__del__ at 0x7fb27b93fa60>
Traceback (most recent call last):
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/tracker.py", line 477, in __del__
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/tracker.py", line 468, in terminate
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/tracker.py", line 455, in _stop_tracker
AttributeError: 'NoneType' object has no attribute 'socket'
Exception ignored in: <function Server.__del__ at 0x7fb3c2e34430>
Traceback (most recent call last):
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/server.py", line 493, in __del__
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/rpc/server.py", line 489, in terminate
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/contrib/popen_pool.py", line 137, in kill
  File "/root/brainstorm_project/brainstorm/3rdparty/tvm/python/tvm/contrib/popen_pool.py", line 43, in kill_child_processes
ImportError: sys.meta_path is None, Python is likely shutting down
Raphael-Hao commented 6 months ago

This may be a TVM-related bug that I have not encountered. You may need to investigate it yourself, for example by upgrading TVM to a newer version.
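
For context, all three tracebacks in the log are reported as `Exception ignored in: <function ....__del__ ...>`, and the last one states that Python is likely shutting down. That is how Python reports errors raised inside finalizers, so they appear to come from cleanup at interpreter exit rather than from the tuning rounds themselves. The sketch below shows the general pattern of releasing such a resource explicitly before exit. It is not TVM code: `DemoMeasureContext` and `run_with_cleanup` are hypothetical names used only for illustration.

```python
class DemoMeasureContext:
    """Hypothetical stand-in for an object that owns helper processes."""

    def __init__(self):
        self.closed = False

    def close(self):
        # Idempotent: safe to call explicitly and again from __del__.
        if not self.closed:
            self.closed = True

    def __del__(self):
        # Finalizers can run during interpreter teardown, when modules
        # they depend on may already be gone, so never let errors escape.
        try:
            self.close()
        except Exception:
            pass


def run_with_cleanup(ctx, work):
    """Run work(ctx) and release ctx explicitly, even if work raises."""
    try:
        return work(ctx)
    finally:
        ctx.close()


if __name__ == "__main__":
    ctx = DemoMeasureContext()
    print(run_with_cleanup(ctx, lambda c: "tuning done"), ctx.closed)
```

Releasing the resource while the interpreter is still fully alive means the finalizer has nothing left to do at shutdown.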

mazhaojia123 commented 6 months ago

> This may be a TVM-related bug that I have not encountered. You may need to investigate it yourself, for example by upgrading TVM to a newer version.

Thank you, @Raphael-Hao. I have not changed anything in the Docker image, though. I also tried several versions of TVM, such as 0.10.0 and 0.16.dev, but newer TVM versions do not seem to be compatible with brt.

Could you please tell me which version of TVM you use?

Raphael-Hao commented 6 months ago

This script only provides an example of how to tune the kernels. You can tune them with a newer TVM and inject the kernel sources into the kernel DB for fusion. Brainstorm's fusion does not rely on a specific version of TVM. I will update the scripts, but it may take some time.

mazhaojia123 commented 6 months ago

Thank you, @Raphael-Hao. I checked kernel_db.sqlite again and found that the kernels had been inserted into the database, so the exception has no effect on kernel generation. I'll close the issue. Thank you again. : )
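
For anyone who wants to repeat this check, here is a minimal sketch that uses Python's standard `sqlite3` module. The thread does not describe the schema of kernel_db.sqlite, so the sketch assumes nothing about table names and simply counts the rows in every table it finds. The helper name `summarize_sqlite` is hypothetical, and the path to the database has to be supplied by the user.

```python
import sqlite3
import sys
from contextlib import closing
from pathlib import Path


def summarize_sqlite(db_path):
    """Return {table_name: row_count} for every user table in the file."""
    # Open read-only so a mistyped path fails instead of creating a new file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }


if __name__ == "__main__":
    # Usage: python check_db.py /path/to/kernel_db.sqlite
    for table, count in summarize_sqlite(sys.argv[1]).items():
        print(f"{table}: {count} rows")
```

If the row count grows between two tuning runs, the new kernels were written despite the shutdown-time exceptions.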