jeongminpark417 / GIDS


In heterogeneous_train.py, running 10000 iterations after warm-up makes the report omit training time and e2e time. #23

Open gaowayne opened 3 days ago

gaowayne commented 3 days ago

Hello dear expert, if I manually change the value in the code below from 100 to 10000:


            if(step == warm_up_iter + 100):
                print("Performance for 100 iteration after 1000 iteration")
                e2e_time += time.time() - e2e_time_start 
                train_dataloader.print_stats()
                train_dataloader.print_timer()
                print_times(transfer_time, train_time, e2e_time)

                batch_input_time = 0
                transfer_time = 0
                train_time = 0
                e2e_time = 0

                #Just testing 100 iterations remove the next line if you do not want to halt
                return None

then the report is missing the training time, e2e time, etc., as shown below:

root@salab-hpedl380g11-01:~/wayne/gids/GIDS/evaluation# ./test1.sh
GIDS DataLoader Setting
GIDS:  True
CPU Feature Buffer:  True
Window Buffering:  True
Storage Access Accumulator:  True
Dataset: IGB
SSD are not assigned
ssd list:  None
SSD index: 0
SQs: 255        CQs: 255        n_qps: 128
Ctrl sizes: 1
n pages: 1048576
page size: 4096
num elements: 563200000000
n_ranges_bits: 6
n_ranges_mask: 63
pages_dma: 0x7f43bc010000       220020410000
HEREN
Cond1
100000 8 1 100000
Finish Making Page Cache
Number of required storage accesses:  854.0499999999993
  0%|                                                                                                                                                                | 0/1 [00:00<?, ?it/s]
warp up done
GIDS time:  34.89626216888428
WB time:  0.11425375938415527
print stats: 
print array reset: #READ IOs: 0 #Accesses:1319192128    #Misses:1024557440      Miss Rate:0.776655      #Hits: 294634688        Hit Rate:0.223345       CLSize:4096     Debug Cnt: 0
*********************************

print ctrl reset 0: ------------------------------------
#SSDAccesses:   32017420

Kernel Time:     28507.6
Total Access:    175223213
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [04:10<00:00, 250.98s/it
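
One thing worth checking (an assumption, not something the log above confirms): the timing report is printed only when `step` is exactly equal to `warm_up_iter + 10000`. If the dataloader yields fewer batches than that in an epoch, the condition is never true and the loop ends without calling `print_times`. The sketch below reproduces that behavior with a plain Python loop. `run_epoch`, `num_batches`, `measure_iters`, and the batch counts are hypothetical names and values chosen for illustration, and the warm-up branch is assumed from the "warp up done" line in the output; none of this is taken from `heterogeneous_train.py`.

```python
def run_epoch(num_batches, warm_up_iter, measure_iters):
    """Simulate one epoch of a loop guarded by an exact step comparison.

    Returns the list of events the loop would have printed.
    All parameter names are hypothetical, for illustration only.
    """
    events = []
    for step in range(num_batches):
        if step == warm_up_iter:
            # Assumed warm-up branch (the log prints "warp up done").
            events.append("warm up done")
        if step == warm_up_iter + measure_iters:
            # Mirrors the quoted branch: report the timers, then halt.
            events.append("report")
            return events
    # The loop ran out of batches before the report step was reached.
    return events


# Hypothetical epoch of 5000 batches with a 1000-step warm-up.
print(run_epoch(num_batches=5000, warm_up_iter=1000, measure_iters=100))
print(run_epoch(num_batches=5000, warm_up_iter=1000, measure_iters=10000))
```

In this sketch the first call reaches the report step, while the second only passes the warm-up step because 1000 + 10000 exceeds the number of batches. If that is what happens here, printing the timers after the loop ends as well, or lowering the iteration count, would be possible ways to get the report.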

jeongminpark417 commented 2 days ago

Can you share the command line that you used to run this?