parsiyte / GPPRMon

GPPRMon: GPU Runtime Memory Performance and Power Monitoring Tool. This project introduces a runtime profiler of GPU performance, memory accesses, and dissipated power for most of the official GPUs. If you have any questions, you can contact us at topcuuburak@gmail.com
https://ceng.iyte.edu.tr/people/isil-oz/

Errors I Encountered in GARDENIA Applications on the V100 Architecture #2

Open Denizdius opened 5 months ago

Denizdius commented 5 months ago

4 bfs_linear_lb webgoogle 0 1 - does not work. It runs fine natively, but in the simulator I get a "Wrong" error at the end of the simulation output. I will file this as an issue.

Expected output when run natively:

./bfs_linear_lb mtx ../datasets/web-Google 0 1

Breadth-first Search by Xuhao Chen

Reading (.mtx) input file ../datasets/web-Google.mtx

Removing redundent edges... 0 redundent edges are removed

|V| 916428 |E| 5105039

This graph maintains both incomming and outgoing edge-list

Launching CUDA BFS solver (256 threads/CTA) ...

iterations = 33. 

runtime [cuda_linear_lb] = 2.879000 ms. 

Verifying...

runtime [serial] = 67.462000 m

5 /cc_base mtx datasets/web-Google 1 1 >mvt.txt // the 1 1 parameters are shown as example parameters in GARDENIA, but I get an error

6 ./cc_warp mtx datasets/web-Google 1 1 >mvt.txt // again, 1 1 is shown as the example parameters, but I got an error; the dataset can be changed

free(): invalid next size (fast)

Aborted (core dumped)

Expected output when run natively:

./cc_warp mtx ../datasets/web-Google 1 1

Connected Component by Xuhao Chen

Reading (.mtx) input file ../datasets/web-Google.mtx

Removing redundent edges... 1565976 redundent edges are removed

|V| 916428 |E| 8644102

This graph is symmetrized

Launching CUDA CC solver (112 CTAs, 256 threads/CTA) ...

iterations = 5. 

runtime [cuda_warp] = 19.486000 ms. 

Verifying...

runtime [serial] = 127.184000 ms. 

Correct
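Comparing a native run with a simulated run by eye is error-prone, so a small parser can help. The following is a minimal sketch, assuming the output lines look like the ones quoted above and that a failed verification prints a line containing only `Wrong` (the exact failure string is an assumption based on this report). The function name `parse_gardenia_output` is hypothetical and not part of GARDENIA or GPPRMon.

```python
import re


def parse_gardenia_output(text):
    """Pull graph size, iteration counts, runtimes, and the verdict out of stdout."""
    result = {
        "vertices": None,
        "edges": None,
        "iterations": [],
        "runtimes_ms": {},
        "verdict": None,
    }

    # Graph size line, e.g. "|V| 916428 |E| 8644102"
    size = re.search(r"\|V\|\s*(\d+)\s*\|E\|\s*(\d+)", text)
    if size:
        result["vertices"] = int(size.group(1))
        result["edges"] = int(size.group(2))

    # Every "iterations = N." line (GPU solver first, then the verifier if it prints one)
    result["iterations"] = [int(n) for n in re.findall(r"iterations\s*=\s*(\d+)", text)]

    # Lines such as "runtime [cuda_warp] = 19.486000 ms."
    for name, ms in re.findall(r"runtime\s*\[(\w+)\]\s*=\s*([\d.]+)\s*ms", text):
        result["runtimes_ms"][name] = float(ms)

    # Verdict: a line consisting only of "Correct" or (assumed) "Wrong"
    for line in text.splitlines():
        word = line.strip()
        if word in ("Correct", "Wrong"):
            result["verdict"] = word

    return result


sample = """Connected Component by Xuhao Chen
|V| 916428 |E| 8644102
Launching CUDA CC solver (112 CTAs, 256 threads/CTA) ...
iterations = 5.
runtime [cuda_warp] = 19.486000 ms.
Verifying...
runtime [serial] = 127.184000 ms.
Correct
"""
parsed = parse_gardenia_output(sample)
print(parsed["verdict"], parsed["runtimes_ms"])
```

A run that aborts before verification leaves `verdict` as `None`, which distinguishes a crash from a completed run whose result was wrong.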

9 pr_warp - to be filed as an issue

./pr_warp datasets/web-Google >mvt.txt

pars@PARS:~/Documents/GPPRMon_9$ ./pr_warp mtx datasets/web-Google >mvt.txt

free(): invalid next size (fast)

Aborted (core dumped)

pars@PARS:~/Documents/GPPRMon_9$ ./pr_warp mtx datasets/coPapersDBLP/coPapersDBLP >mvt.txt

free(): invalid pointer

Aborted (core dumped)

./pr_warp ../datasets/coPapersDBLP/coPapersDBLP

PageRank by Xuhao Chen

Usage: ./pr_warp [symmetrize(0/1)]

Example: ./pr_warp mtx web-Google

pars@PARS:~/Documents/gardenia/bin$ ./pr_warp mtx ../datasets/coPapersDBLP/coPapersDBLP

PageRank by Xuhao Chen

Reading (.mtx) input file ../datasets/coPapersDBLP/coPapersDBLP.mtx

Removing redundent edges... 0 redundent edges are removed

|V| 540486 |E| 15245729

This graph maintains both incomming and outgoing edge-list

Launching CUDA PR solver (2112 CTAs, 256 threads/CTA) ...

1 0.639330

2 0.304306

3 0.187817

4 0.124585

5 0.085047

6 0.058994

7 0.041586

8 0.029960

9 0.022044

10 0.016389

11 0.012234

12 0.009142

13 0.006857

14 0.005145

15 0.003845

16 0.002849

17 0.002086

18 0.001506

19 0.001074

20 0.000757

21 0.000527

22 0.000363

23 0.000248

24 0.000168

25 0.000112

26 0.000075

iterations = 26. 

runtime [cuda_pull_warp] = 30.325000 ms. 

Verifying...

1 0.639330

2 0.304306

3 0.187817

4 0.124585

5 0.085047

6 0.058994

7 0.041586

8 0.029960

9 0.022044

10 0.016389

11 0.012234

12 0.009142

13 0.006857

14 0.005145

15 0.003845

16 0.002849

17 0.002086

18 0.001506

19 0.001074

20 0.000757

21 0.000527

22 0.000363

23 0.000248

24 0.000168

25 0.000112

26 0.000075

iterations = 26. 

runtime [serial] = 645.822000 ms. 

Correct

10 mst_topo - to be filed as an issue

./mst_topo datasets/great-britain_osm/great-britain_osm.mtx >mvt.txt

free(): invalid pointer

Aborted (core dumped)

./mst_topo datasets/soc-LiveJournal1.mtx >mvt.txt

free(): invalid pointer

Aborted (core dumped)
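The crashes in this report come in a few recognizable forms: glibc heap errors (`free(): invalid next size (fast)`, `free(): invalid pointer`, `malloc_consolidate`), failed assertions, and plain aborts. Below is a minimal sketch for sorting captured runs into those groups. It assumes the run was started with Python's `subprocess` on a POSIX system, where a process killed by SIGABRT reports a return code of `-signal.SIGABRT`. The function name and the category labels are hypothetical.

```python
import signal


def classify_failure(returncode, stderr_text):
    """Map a finished process to a coarse failure category."""
    if returncode == 0:
        return "ok"
    # glibc prints these messages when it detects corrupted heap metadata
    if "free(): " in stderr_text or "malloc_consolidate" in stderr_text:
        return "heap-error"
    # assert() failures print "Assertion `...' failed."
    if "Assertion" in stderr_text and "failed" in stderr_text:
        return "assertion"
    # subprocess reports death by signal N as returncode -N
    if returncode == -signal.SIGABRT:
        return "aborted"
    return "other"


print(classify_failure(-signal.SIGABRT, "free(): invalid pointer\n"))
```

The category only says how the process died. Locating the faulty write behind a heap error still needs a debugger or a memory checker.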

Denizdius commented 5 months ago

Additionally: 13 spmv_warp - does not work (for web-Google and higgs-twitter): free(): invalid pointer;

14 spmv_push - does not work (for web-Google and higgs-twitter): free(): invalid pointer;

3 ./sssp_linear_lb mtx ../datasets/higgs-twitter/higgs-twitter 0 1 (also with web-Google; this is the recommended command) does not work (Wrong)

Expected:

/sssp_linear_lb mtx ../datasets/higgs-twitter/higgs-twitter 0 1

Single Source Shortest Path by Xuhao Chen

Reading (.mtx) input file ../datasets/higgs-twitter/higgs-twitter.mtx

Removing redundent edges... 0 redundent edges are removed

|V| 456626 |E| 14855819

This graph maintains both incomming and outgoing edge-list

Launching CUDA SSSP solver (block_size = 256) ...

iterations = 13.

runtime [cuda_linear_lb] = 3.304000 ms.

Verifying...

iterations = 360493.

runtime [verify] = 123.922000 ms.

Correct

4 vc_linear_bitset does not work (it also fails natively on some datasets; it gave an error for soc-LiveJournal1)

./vc_linear_bitset mtx ../datasets/germany_osm/germany_osm 1 >mvt.txt

vc_linear_bitset: cuda_api_object.h:82: void CUctx_st::add_ptxinfo(const char*, const gpgpu_ptx_sim_info&): Assertion `s != NULL' failed.

Aborted (core dumped)

Expected:

./vc_linear_bitset mtx ../datasets/germany_osm/germany_osm 1

Vertex Coloring by Xuhao Chen

Reading (.mtx) input file ../datasets/germany_osm/germany_osm.mtx

Removing redundent edges... 0 redundent edges are removed

|V| 11548845 |E| 24738362

This graph is symmetrized

Launching CUDA VC solver (256 threads/CTA) ...

iterations = 32.

runtime[cuda_linear_bitset] = 39.131000 ms, num_colors = 5.

Verifying...

runtime [serial] = 312.433000 ms, num_colors = 5.

topcuburak commented 5 months ago

Comment 1: You should use bfs_linear_base, not bfs_linear_lb. You can simulate bfs_linear_base with the command path$ ./bfs_linear_base mtx web-Google 0 1. I just tested it, and it ran without problems.

topcuburak commented 5 months ago

Comment 2: I have just tested cc_base and did not observe any problem. The command I used for the simulation was simulator_main_path$ ./cc_base mtx higgs-twitter_reply 1 1. I used the higgs-twitter_reply dataset only to check whether there is a simulator-related problem (higgs-twitter_reply is comparatively smaller than web-Google). I don't think there is any problem with the web-Google.mtx dataset either, but if you do see a problem at some point, please make sure that most of the kernels are simulated successfully (even if the whole execution cannot complete), because those might be the kernels we are looking for.

topcuburak commented 5 months ago

Comment 3: I tried cc_warp only once, out of personal interest, since I already had a working version of cc_base. I suggest you focus primarily on cc_base.

topcuburak commented 5 months ago

Comment 4: Instead of pr_warp, you should use pr_base, as with cc_base, since we only need the working baseline versions of those applications. The command to execute the PageRank application is simulator_path$ ./pr_base higgs-twitter_reply.

topcuburak commented 5 months ago

Comment 5: I think you can skip mst_topo. I observed an error (malloc_consolidate), and I don't have enough time to debug it right now. Unfortunately, I don't remember the details of how I managed to simulate it properly.

topcuburak commented 5 months ago

Comment 6: Again, just try to simulate spmv_base instead of spmv_warp and spmv_push. You can confirm that spmv_base simulates successfully with the command simulationPath$ ./spmv_base mtx 4w 0 1.

topcuburak commented 5 months ago

Comment 7: Again, the same issue. Just try to simulate sssp_linear_base instead of sssp_linear_lb. You can confirm that sssp_linear_base simulates successfully with the command simulationPath$ ./sssp_linear_base mtx 4w 0 1.

topcuburak commented 5 months ago

Comment 8: Again, the same issue for vc. Just try to simulate vc_linear_base instead of vc_linear_bitset. You can confirm that vc_linear_base simulates successfully with the command simulationPath$ ./vc_linear_base mtx 4w 1.
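The advice in Comments 1 to 8 amounts to a lookup from each failing variant to its baseline binary and the command reported to work. It can be sketched as a small table. The arguments are copied verbatim from the comments above. The dictionary and function names are hypothetical. The sketch assumes the commands are run from the simulator directory with the datasets reachable under the names shown.

```python
# Failing variant -> baseline variant suggested in the comments
BASELINE_FOR = {
    "bfs_linear_lb": "bfs_linear_base",
    "cc_warp": "cc_base",
    "pr_warp": "pr_base",
    "spmv_warp": "spmv_base",
    "spmv_push": "spmv_base",
    "sssp_linear_lb": "sssp_linear_base",
    "vc_linear_bitset": "vc_linear_base",
}

# Baseline variant -> argument list quoted in the comments
BASELINE_COMMANDS = {
    "bfs_linear_base": ["./bfs_linear_base", "mtx", "web-Google", "0", "1"],
    "cc_base": ["./cc_base", "mtx", "higgs-twitter_reply", "1", "1"],
    "pr_base": ["./pr_base", "higgs-twitter_reply"],
    "spmv_base": ["./spmv_base", "mtx", "4w", "0", "1"],
    "sssp_linear_base": ["./sssp_linear_base", "mtx", "4w", "0", "1"],
    "vc_linear_base": ["./vc_linear_base", "mtx", "4w", "1"],
}


def baseline_command(variant):
    """Return the suggested baseline command, or None if there is none (e.g. mst_topo)."""
    base = BASELINE_FOR.get(variant, variant)
    return BASELINE_COMMANDS.get(base)


for name in ("spmv_push", "vc_linear_bitset", "mst_topo"):
    print(name, "->", baseline_command(name))
```

Each returned list can be passed directly to `subprocess.run`. A `None` result marks an application the thread suggests skipping for now.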

topcuburak commented 5 months ago

General comments: @Denizdius