salessa / likwid

Automatically exported from code.google.com/p/likwid
GNU General Public License v3.0

ERROR - [./src/applications/likwid-perfctr.c:149] Failed to read argument string! #159

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. /home/snataraj/lib/likwid-3.1.2/likwid-perfctr -g CPU_CLK_UNHALTED:PMC0 \
   -C 144,2,6,10,14,18,22,26,30,34,38,42,46,50,54,58,62,66,70,74,78,82,86,90,94,98,102,106,110,114,118,122,126,130,134,138,142,146,150,154,158,162,166,170,174,178,182,186,190,194,198,202,206,210,214,218,222 \
   -O -o ls.txt ls

What is the expected output? What do you see instead?
I think the problem is a limit on the number of characters read for the -C 
option.

If I add one more processor, for example 226, it gives the error.

What version of the product are you using?
3.1.2

Original issue reported on code.google.com by sur...@gmail.com on 24 Jul 2014 at 4:23

GoogleCodeExporter commented 9 years ago
Hi,

Yes we have a character limitation for the -c and -C statements of 200 chars. 
We have to specify a maximum and 200 chars seemed reasonable.

You seem to have a huge machine, which is pretty uncommon for bare-metal x86 
systems. You have two possibilities:
- Define your cores with affinity group notation, something like S0:10@S1:10 
(sockets 0 and 1 get 10 processes each).
- Change the limit of the input reader in src/includes/strUtil.h. There is a 
#define with bSecureInput(200, optarg). Just change the 200 to a value you 
like and recompile.
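
To illustrate how a fixed cap on an argument string behaves, here is a minimal sketch. It is not LIKWID's bSecureInput code; the names copy_bounded and MAX_ARG_LEN are hypothetical, and the reject-when-too-long behaviour is an assumption based on the error reported above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cap, mirroring the 200-character limit discussed here. */
#define MAX_ARG_LEN 200

/*
 * Copy `arg` into `out` only if it is at most `max_len` characters long.
 * `out` must have room for max_len + 1 bytes.
 * Returns 0 on success, -1 if the argument exceeds the cap.
 */
int copy_bounded(char *out, size_t max_len, const char *arg)
{
    assert(out != NULL && arg != NULL);

    /* Measure the argument, but stop one character past the cap. */
    size_t len = 0;
    while (len <= max_len && arg[len] != '\0') {
        len++;
    }
    if (len > max_len) {
        return -1; /* too long: the caller would report a read failure */
    }
    memcpy(out, arg, len + 1); /* copy including the terminating NUL */
    return 0;
}
```

Counted with the wrapped lines joined, the -C list in the report above is 200 characters long, so appending one more entry such as ,226 would exceed a cap of this kind.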

I think 200 chars is a good maximum or do you have arguments why we should 
increase this limit?

Greetings,
Thomas

Original comment by Thomas.R...@googlemail.com on 29 Jul 2014 at 8:58

GoogleCodeExporter commented 9 years ago
This is needed to run programs on the Xeon Phi. With the current 200-character 
limit I am not able to pin more than 56 threads to different cores, so I 
cannot apply my own pinning strategy. I therefore think it is necessary to 
support more than 200 characters.

Original comment by sur...@gmail.com on 29 Jul 2014 at 9:01

GoogleCodeExporter commented 9 years ago
I changed the character limit to 400 in our v3.1 development branch. You will 
have to make the change manually (the second option in my last post) because I 
don't think that we will release a bugfix version in the near future.

When I look at your -C string, it seems that you want every fourth thread, 
hence one thread per real core. This can also be done with -C N:60:1:4. This 
will expand to 60 threads scheduled on cores 0,3,7,11,... It's not exactly 
your string, but it's close to it and much shorter.
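
The expansion can be sketched as follows, assuming the fields after the domain are thread count, chunk size and stride, and that the selection walks the domain's CPU list by index and wraps at the end. This is an illustration only, not LIKWID's parser; expand_expression is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Expand count:chunk:stride over a domain's CPU list.
 * Takes `chunk` consecutive entries, then advances the start position by
 * `stride` entries, wrapping to the front of the list when it runs out.
 * Writes `count` CPU IDs to `out`. Returns 0 on success, -1 on bad input.
 */
int expand_expression(const int *domain, size_t domain_len,
                      size_t count, size_t chunk, size_t stride, int *out)
{
    assert(domain != NULL && out != NULL);
    if (domain_len == 0 || chunk == 0 || stride == 0) {
        return -1;
    }

    size_t start = 0; /* index of the current chunk in the domain list */
    size_t n = 0;     /* number of CPU IDs written so far */
    while (n < count) {
        for (size_t j = 0; j < chunk && n < count; j++) {
            out[n++] = domain[(start + j) % domain_len];
        }
        start = (start + stride) % domain_len;
    }
    return 0;
}
```

With a domain list of 0 to 15, a count of 4, a chunk of 1 and a stride of 4, this picks list entries 0, 4, 8 and 12. The CPU IDs that actually result depend on the order in which the domain lists its CPUs.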

Original comment by Thomas.R...@googlemail.com on 29 Jul 2014 at 10:53

GoogleCodeExporter commented 9 years ago
This short expression form does not work with -C.

/home/snataraj/lib/likwid-3.1.2/likwid-perfctr -g DATA_READ_MISS_OR_WRITE_MISS:PMC0 \
    -C N:60:1:4 -O -o splashopsmt2/bfs/DRWM/32/56/out1.txt \
    /home/snataraj/Galois-2.1.8/build/mic_1/apps/bfs/bfs -noverify \
    /home/snataraj/Galois-2.1.8/build/mic/inputs/random/r32.gr -t 56
ERROR - [./src/strUtil.c:196] Parse Error

Original comment by sur...@gmail.com on 29 Jul 2014 at 1:29

GoogleCodeExporter commented 9 years ago
Oh sorry, I missed the expression prefix.
Try: -C E:N:60:1:4
There is no guarantee that this works. The expression mode in C is fairly old 
and needs some refinement. If it does not work, you have to specify the CPUs 
as a list, as you tried before.

I changed the status of the issue back to accepted to have it on my TODO list.

Original comment by Thomas.R...@googlemail.com on 29 Jul 2014 at 1:51

GoogleCodeExporter commented 9 years ago
I have already tried -C E:N:60:1:4 and it does not work. That's why I 
tried the extended version. It would be really handy to have -C 
E:N:60:1:4 working soon :)

Original comment by sur...@gmail.com on 29 Jul 2014 at 1:54

GoogleCodeExporter commented 9 years ago
When can I expect a patch for -C with likwid-perfctr? This flag would be very 
useful and would make my experiments easier.

Thank you for your patience.

Original comment by sur...@gmail.com on 3 Aug 2014 at 10:40

GoogleCodeExporter commented 9 years ago
Hi,

I updated the code in our repository. Since the repo is currently a mess, I 
extracted the patch for you.

tar -xzf likwid-X.X.X.tar.gz
cd likwid-X.X.X
patch -p0 < "PATH_TO_PATCH"
<edit config.mk if needed>
make

I hope this will fix your problems on Xeon Phi. I was not able to test it on 
the Xeon Phi itself but it worked for my test expressions on my host.

Original comment by Thomas.R...@googlemail.com on 4 Aug 2014 at 9:39

GoogleCodeExporter commented 9 years ago
I tried patching, building and running on the Xeon Phi. I got the following error:

$ /home/snataraj/lib/likwid-3.1.2/likwid-perfctr -g INSTRUCTIONS_EXECUTED:PMC0 \
    -C E:N:60:1:2 -O -o splashopsmt2/sssp/Inst/2/2/out.txt \
    /home/snataraj/Galois-2.1.8/build/mic_1/apps/sssp/sssp -noverify -delta 8 \
    /home/snataraj/Galois-2.1.8/build/mic/inputs/random/r2.gr -t 8
ERROR: Processor list is not unique.

Original comment by sur...@gmail.com on 5 Aug 2014 at 2:38

GoogleCodeExporter commented 9 years ago
Have you tried other expressions? I tried a lot of expressions and had no 
failures at all. The "Processor list is not unique" error occurs if there are 
not enough CPUs available to fulfill the expression with unique CPU IDs, so 
the selection wraps around and picks a CPU twice.
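
A uniqueness check of this kind can be sketched as below. It is a simplified illustration, not the code in likwid-perfctr; has_duplicate_cpu is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if any CPU ID occurs more than once in `cpus`, else 0.
 * A quadratic scan is enough for the few hundred entries involved here.
 */
int has_duplicate_cpu(const int *cpus, size_t n)
{
    assert(cpus != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (cpus[i] == cpus[j]) {
                return 1; /* a wrap-around selected the same CPU again */
            }
        }
    }
    return 0;
}
```

A list such as 1, 3, 5, 1 would be rejected, while 1, 3, 5, 7 would pass.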

Can you please give me the output of likwid-pin -p ? 
This prints all the affinity domains and covered CPUs.

If you put
for (i = 0; i < numThreads; i++)
    fprintf(stderr, "Thread %d should run on CPU %d\n", i, threads[i]);
at line 313 of src/applications/likwid-perfctr.c, you will see the evaluated 
expression. Maybe you can send me this output, too.

Original comment by Thomas.R...@googlemail.com on 6 Aug 2014 at 10:12

GoogleCodeExporter commented 9 years ago
~/perf_anal $ likwid-pin -p
Domain 0:
         Tag N: 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 
81 85 89 93 97 101 105 109 113 117 121 125 129 133 137 141 145 149 153 
157 161 165 169 173 177 181 185 189 193 197 201 205 209 213 217 221 225 
229 233 0 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62 66 70 74 78 82 
86 90 94 98 102 106 110 114 118 122 126 130 134 138 142 146 150 154 158 
162 166 170 174 178 182 186 190 194 198 202 206 210 214 218 222 226 230 
234 237 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 
91 95 99 103 107 111 115 119 123 127 131 135 139 143 147 151 155 159 163 
167 171 175 179 183 187 191 195 199 203 207 211 215 219 223 227 231 235 
238 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 92 
96 100 104 108 112 116 120 124 128 132 136 140 144 148 152 156 160 164 
168 172 176 180 184 188 192 196 200 204 208 212 216 220 224 228 232 236 239
Domain 1:
         Tag S0: 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 
77 81 85 89 93 97 101 105 109 113 117 121 125 129 133 137 141 145 149 
153 157 161 165 169 173 177 181 185 189 193 197 201 205 209 213 217 221 
225 229 233 0 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62 66 70 74 78 
82 86 90 94 98 102 106 110 114 118 122 126 130 134 138 142 146 150 154 
158 162 166 170 174 178 182 186 190 194 198 202 206 210 214 218 222 226 
230 234 237 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 
87 91 95 99 103 107 111 115 119 123 127 131 135 139 143 147 151 155 159 
163 167 171 175 179 183 187 191 195 199 203 207 211 215 219 223 227 231 
235 238 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 80 84 88 
92 96 100 104 108 112 116 120 124 128 132 136 140 144 148 152 156 160 
164 168 172 176 180 184 188 192 196 200 204 208 212 216 220 224 228 232 
236 239
Domain 2:
         Tag C0: 1 2 3 4
Domain 3:
         Tag C1: 5 6 7 8
Domain 4:
         Tag C2: 9 10 11 12
Domain 5:
         Tag C3: 13 14 15 16
Domain 6:
         Tag C4: 17 18 19 20
Domain 7:
         Tag C5: 21 22 23 24
Domain 8:
         Tag C6: 25 26 27 28
Domain 9:
         Tag C7: 29 30 31 32
Domain 10:
         Tag C8: 33 34 35 36
Domain 11:
         Tag C9: 37 38 39 40
Domain 12:
         Tag C10: 41 42 43 44
Domain 13:
         Tag C11: 45 46 47 48
Domain 14:
         Tag C12: 49 50 51 52
Domain 15:
         Tag C13: 53 54 55 56
Domain 16:
         Tag C14: 57 58 59 60
Domain 17:
         Tag C15: 61 62 63 64
Domain 18:
         Tag C16: 65 66 67 68
Domain 19:
         Tag C17: 69 70 71 72
Domain 20:
         Tag C18: 73 74 75 76
Domain 21:
         Tag C19: 77 78 79 80
Domain 22:
         Tag C20: 81 82 83 84
Domain 23:
         Tag C21: 85 86 87 88
Domain 24:
         Tag C22: 89 90 91 92
Domain 25:
         Tag C23: 93 94 95 96
Domain 26:
         Tag C24: 97 98 99 100
Domain 27:
         Tag C25: 101 102 103 104
Domain 28:
         Tag C26: 105 106 107 108
Domain 29:
         Tag C27: 109 110 111 112
Domain 30:
         Tag C28: 113 114 115 116
Domain 31:
         Tag C29: 117 118 119 120
Domain 32:
         Tag C30: 121 122 123 124
Domain 33:
         Tag C31: 125 126 127 128
Domain 34:
         Tag C32: 129 130 131 132
Domain 35:
         Tag C33: 133 134 135 136
Domain 36:
         Tag C34: 137 138 139 140
Domain 37:
         Tag C35: 141 142 143 144
Domain 38:
         Tag C36: 145 146 147 148
Domain 39:
         Tag C37: 149 150 151 152
Domain 40:
         Tag C38: 153 154 155 156
Domain 41:
         Tag C39: 157 158 159 160
Domain 42:
         Tag C40: 161 162 163 164
Domain 43:
         Tag C41: 165 166 167 168
Domain 44:
         Tag C42: 169 170 171 172
Domain 45:
         Tag C43: 173 174 175 176
Domain 46:
         Tag C44: 177 178 179 180
Domain 47:
         Tag C45: 181 182 183 184
Domain 48:
         Tag C46: 185 186 187 188
Domain 49:
         Tag C47: 189 190 191 192
Domain 50:
         Tag C48: 193 194 195 196
Domain 51:
         Tag C49: 197 198 199 200
Domain 52:
         Tag C50: 201 202 203 204
Domain 53:
         Tag C51: 205 206 207 208
Domain 54:
         Tag C52: 209 210 211 212
Domain 55:
         Tag C53: 213 214 215 216
Domain 56:
         Tag C54: 217 218 219 220
Domain 57:
         Tag C55: 221 222 223 224
Domain 58:
         Tag C56: 225 226 227 228
Domain 59:
         Tag C57: 229 230 231 232
Domain 60:
         Tag C58: 233 234 235 236
Domain 61:
         Tag C59: 0 237 238 239

***********************
/perf_anal $ /home/snataraj/lib/likwid-3.1.2/likwid-perfctr -g INSTRUCTIONS_EXECUTED:PMC0 \
    -C E:N:60:1:2 -O -o splashopsmt2/sssp/Inst/2/2/out.txt \
    /home/snataraj/Galois-2.1.8/build/mic_1/apps/sssp/sssp -noverify -delta 8 \
    /home/snataraj/Galois-2.1.8/build/mic/inputs/random/r2.gr -t 8
Thread 0 should run on CPU 1
Thread 1 should run on CPU 3
Thread 2 should run on CPU 5
Thread 3 should run on CPU 7
Thread 4 should run on CPU 9
Thread 5 should run on CPU 11
Thread 6 should run on CPU 13
Thread 7 should run on CPU 15
Thread 8 should run on CPU 17
Thread 9 should run on CPU 19
Thread 10 should run on CPU 21
Thread 11 should run on CPU 23
Thread 12 should run on CPU 25
Thread 13 should run on CPU 27
Thread 14 should run on CPU 29
Thread 15 should run on CPU 31
Thread 16 should run on CPU 33
Thread 17 should run on CPU 35
Thread 18 should run on CPU 37
Thread 19 should run on CPU 39
Thread 20 should run on CPU 41
Thread 21 should run on CPU 43
Thread 22 should run on CPU 45
Thread 23 should run on CPU 47
Thread 24 should run on CPU 49
Thread 25 should run on CPU 51
Thread 26 should run on CPU 53
Thread 27 should run on CPU 55
Thread 28 should run on CPU 57
Thread 29 should run on CPU 59
Thread 30 should run on CPU 1
Thread 31 should run on CPU 3
Thread 32 should run on CPU 5
Thread 33 should run on CPU 7
Thread 34 should run on CPU 9
Thread 35 should run on CPU 11
Thread 36 should run on CPU 13
Thread 37 should run on CPU 15
Thread 38 should run on CPU 17
Thread 39 should run on CPU 19
Thread 40 should run on CPU 21
Thread 41 should run on CPU 23
Thread 42 should run on CPU 25
Thread 43 should run on CPU 27
Thread 44 should run on CPU 29
Thread 45 should run on CPU 31
Thread 46 should run on CPU 33
Thread 47 should run on CPU 35
Thread 48 should run on CPU 37
Thread 49 should run on CPU 39
Thread 50 should run on CPU 41
Thread 51 should run on CPU 43
Thread 52 should run on CPU 45
Thread 53 should run on CPU 47
Thread 54 should run on CPU 49
Thread 55 should run on CPU 51
Thread 56 should run on CPU 53
Thread 57 should run on CPU 55
Thread 58 should run on CPU 57
Thread 59 should run on CPU 59
ERROR: Processor list is not unique.

Original comment by sur...@gmail.com on 6 Aug 2014 at 3:55

GoogleCodeExporter commented 9 years ago
Hi,

I was able to test LIKWID on a Xeon Phi and I think I found the problem you 
have. Since the wrap-around is at 60, it seems that LIKWID has not gathered the 
right number of hardware threads of your Xeon Phi.
I attached a little code snippet to verify my assumption. Please compile and 
run it and send me the output.

Compile (on your Host)
icc -mmic HWThread_count.c -o HWThread_count
Run (on the Xeon Phi)
./HWThread_count

For me the output looks like this:
$ ./HWThread_count
Sysconf reports 240 online HWThreads of total 240 HWThreads
Proc returns 240 HWThreads
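
The attached file is not reproduced in this thread. A check along these lines could look like the sketch below, assuming it queries sysconf and counts the processor entries in /proc/cpuinfo, as the output wording suggests. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Count the lines that start with "processor" in an open cpuinfo stream. */
long count_processor_lines(FILE *fp)
{
    assert(fp != NULL);
    char buf[256];
    long count = 0;
    int at_line_start = 1; /* 0 while inside a line longer than buf */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (at_line_start && strncmp(buf, "processor", 9) == 0) {
            count++;
        }
        at_line_start = (strchr(buf, '\n') != NULL);
    }
    return count;
}

/* Print both counts to `out`. Returns 0 on success, -1 if /proc is unreadable. */
int report_hwthreads(FILE *out)
{
    long online = sysconf(_SC_NPROCESSORS_ONLN); /* currently online */
    long total = sysconf(_SC_NPROCESSORS_CONF);  /* configured in the system */
    fprintf(out, "Sysconf reports %ld online HWThreads of total %ld HWThreads\n",
            online, total);

    FILE *fp = fopen("/proc/cpuinfo", "r");
    if (fp == NULL) {
        return -1;
    }
    long from_proc = count_processor_lines(fp);
    fclose(fp);
    fprintf(out, "Proc returns %ld HWThreads\n", from_proc);
    return 0;
}
```

Calling report_hwthreads(stdout) from a small main function would print the two lines. If the counts disagree, the tool and the kernel see a different number of hardware threads.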

Original comment by Thomas.R...@googlemail.com on 22 Aug 2014 at 8:58

GoogleCodeExporter commented 9 years ago
$ ssh mic0
~ $ ./HWThread_count
Sysconf reports 240 online HWThreads of total 240 HWThreads
Proc returns 240 HWThreads

Original comment by sur...@gmail.com on 22 Aug 2014 at 9:40

GoogleCodeExporter commented 9 years ago
Damn, that's not the problem then. For me it worked like a charm on the Phi, 
so I have to check again what is causing the problem on your machine.

Do you know the MPSS version installed on your machine? Do you use the latest 
release of MPSS?

Original comment by Thomas.R...@googlemail.com on 27 Aug 2014 at 3:40

GoogleCodeExporter commented 9 years ago
I think my MPSS is old. It is mpss-3.0.

Original comment by sur...@gmail.com on 27 Aug 2014 at 3:49

GoogleCodeExporter commented 9 years ago
Can you update your MPSS? 

I checked the oldest version I could find on our systems, version 3.2, and it 
works.

In contrast to your output:
Tag N: 1 5 9 13
The CPU IDs are ordered on both of our Xeon Phis: 
Tag N: 1 2 3 4

Can you supply the output of likwid-topology, please? I want to understand this 
misordering of your CPU IDs.

Greetings,
Thomas

Original comment by Thomas.R...@googlemail.com on 28 Aug 2014 at 10:30

GoogleCodeExporter commented 9 years ago
Any news concerning your problem on the Intel Xeon Phi?

Original comment by Thomas.R...@googlemail.com on 21 Oct 2014 at 8:46