Closed GoogleCodeExporter closed 9 years ago
pyrit.lover: Yes, all the cores (except v2) report peak performance. But even
for v2, your method of testing is the most accurate.
odlan3: There is not much transfer involved, so for pyrit it's not such a big
issue. The whole transfer takes ~1% of the time, and gpu->cpu is 2/7 of that
(roughly 0.3% of the total). You can read more about the slow PCI transfer
problem here:
http://forums.amd.com/devforum/messageview.cfm?catid=328&threadid=130923&enterthread=y
Original comment by hazema...@gmail.com
on 22 Apr 2010 at 7:57
As a small bonus to all who helped with testing, I'm attaching the v3 core.
It has an improved kernel and uses the latest svn version of the CAL++ library.
Original comment by hazema...@gmail.com
on 23 Apr 2010 at 12:47
Attachments:
Here are the results from my 4850s (core OC'd to 650, mem is stock) on v2c:
Pyrit 0.3.1-dev (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Running benchmark (38905.1 PMKs/s)... |
Computed 40537.35 PMKs/s total.
#1: 'CAL++ Device #1 'ATI RV770'': 19610.6 PMKs/s (RTT 2.7)
#2: 'CAL++ Device #2 'ATI RV770'': 19215.3 PMKs/s (RTT 2.9)
Looks like a keeper to me.
@pyrit.lover: I responded to your overclock questions in the Google group, as
Lukas requested in comment 94. Don't know if you saw it.
Original comment by robert.b...@gmail.com
on 23 Apr 2010 at 12:55
My 4850s' (650 core, 993 mem) results using calpp-svn-1 and cpyrit_calpp-v3:
Pyrit 0.3.1-dev (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Running benchmark (39441.6 PMKs/s)... \
Computed 41032.62 PMKs/s total.
#1: 'CAL++ Device #1 'ATI RV770'': 20140.0 PMKs/s (RTT 2.7)
#2: 'CAL++ Device #2 'ATI RV770'': 19965.0 PMKs/s (RTT 2.8)
Looks like you managed to squeeze some more juice out of it. Great job!
Original comment by robert.b...@gmail.com
on 23 Apr 2010 at 4:10
@hazeman11: Great job...I am impressed
pyrit benchmark
Pyrit 0.3.1-dev (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Running benchmark (142081.1 PMKs/s)... -
Computed 145429.78 PMKs/s total.
#1: 'CAL++ Device #1 'ATI CYPRESS'': 71047.7 PMKs/s (RTT 1.0)
#2: 'CAL++ Device #2 'ATI CYPRESS'': 71352.4 PMKs/s (RTT 1.0)
Original comment by odl...@gmail.com
on 23 Apr 2010 at 7:47
I'm happy that you like it :). This is my way of thanking you all for helping
with the tests :). I really appreciate it.
I'm only sorry that the speedup on 4xxx was so small. It could have been ~10%,
but the CAL IL compiler (in the ATI driver) has issues. When the pressure on
instruction scheduling was reduced, it started to use smaller ALU clauses to
save registers, which is totally the wrong direction for this type of kernel :(.
PS. Could someone with a 5xxx attach the output of 'pyrit list_cores' with
line 152 in _cpyrit_calpp.cpp uncommented?
Original comment by hazema...@gmail.com
on 23 Apr 2010 at 12:33
If there is no need to recompile the code, edit the line 152 from
//self->dev_prog.disassemble(std::cout);
to
self->dev_prog.disassemble(std::cout);
pyrit list_cores
Pyrit 0.3.1-dev (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
The following cores seem available...
#1: 'CAL++ Device #1 'ATI CYPRESS''
#2: 'CAL++ Device #2 'ATI CYPRESS''
#3: 'CPU-Core (SSE2)'
#4: 'CPU-Core (SSE2)'
#5: 'CPU-Core (SSE2)'
#6: 'CPU-Core (SSE2)'
And with
//self->dev_prog.disassemble(std::cout);
pyrit list_cores
Pyrit 0.3.1-dev (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
The following cores seem available...
#1: 'CAL++ Device #1 'ATI CYPRESS''
#2: 'CAL++ Device #2 'ATI CYPRESS''
#3: 'CPU-Core (SSE2)'
#4: 'CPU-Core (SSE2)'
#5: 'CPU-Core (SSE2)'
#6: 'CPU-Core (SSE2)'
Original comment by odl...@gmail.com
on 23 Apr 2010 at 1:04
You do need to recompile. And trust me, the output will be loooong :) (so
please post it as an attachment).
Original comment by hazema...@gmail.com
on 23 Apr 2010 at 1:08
Sorry :-) ... next time I will post the output as an attachment.
Original comment by odl...@gmail.com
on 23 Apr 2010 at 1:17
A Ferrari at the beginning, then a helicopter, then a Concorde, now a rocket!
pyrit benchmark says:
HD5770 41891 PMK/s
HD5870 82552 PMK/s
Doing a "real" test I got 125072 PMK/s, which means another +19%. That is +92%
between "before calpp" and "after calpp": not bad, almost double the PMKs
without doubling the hardware.
What's next week? Will you turn my system into the ESS Enterprise with v4? :-D
Anyway, following up on my comment #83 and your answers #85 and #87: I did
another test and YES, if I do some other light activity in another xterm (ls,
mv, etc.), the PMK rate slows down. It is not just the estimated PMK rate being
displayed wrongly; the task really takes longer to complete (I verified this
with the "time" command). My impression is that if one core is disturbed by
other activity, it is not able to go back to serving PMKs and stays unused
until that task finishes and another "clean" task starts.
The "disturbed" task slowed down to 91000 PMK/s; the undisturbed task did
125000 PMK/s.
Note that I disturbed the task when it had already done about 50% of the job;
if I disturbed it earlier, the final rate should be lower than 91000.
Of course the easy solution is not to disturb the PC while it runs pyrit, but
I wonder where the problem could lie.
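To put the figures above in proportion, here is a small sketch of the arithmetic. It only uses the numbers quoted in this comment; the "before calpp" rate is not stated anywhere in the thread, so the value derived from the +92% claim is an inference, not a measurement.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


undisturbed = 125000.0  # PMK/s, undisturbed run quoted above
disturbed = 91000.0     # PMK/s, disturbed run quoted above

# How much slower the disturbed run was.
slowdown = pct_change(undisturbed, disturbed)
print(f"disturbed vs undisturbed: {slowdown:.1f}%")

# Rate implied by the "+92%" claim applied to the measured 125072 PMK/s.
# This is derived, not measured: the thread never states the old rate.
implied_before = 125072 / 1.92
print(f"implied 'before calpp' rate: {implied_before:.0f} PMK/s")
```

The disturbed run comes out roughly 27% slower, which is far more than the cost of a few `ls` or `mv` commands would suggest.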
Original comment by pyrit.lo...@gmail.com
on 23 Apr 2010 at 6:37
pyrit.lover: To verify what the problem really is, you could run the debug
version (comment 71) and post the output here (undisturbed & disturbed).
PS. Could someone with a 5xxx attach the output of 'pyrit list_cores' with
line 152 in _cpyrit_calpp.cpp uncommented (for version v3)?
Original comment by hazema...@gmail.com
on 23 Apr 2010 at 7:12
Attachments:
Of course, run the debug version from the last comment :).
Original comment by hazema...@gmail.com
on 23 Apr 2010 at 7:16
pyrit.lover: I'll make v3-debug sometime tomorrow, so maybe it's better to
wait for it. This v2-2-debug is really something between v2-1 and v2-2, so I'm
not sure the problem will occur with it.
Original comment by hazema...@gmail.com
on 23 Apr 2010 at 7:58
I'm attaching the v3-debug version.
I would really like someone to post the ISA code on 5xxx cards from the v3
version (output of 'pyrit list_cores' with line 152 in _cpyrit_calpp.cpp
uncommented). Without looking at this I can't start working on the v4
version :).
And btw, benchmarking results are usually smaller than the true value (v2 & v3
cores). The pyrit benchmarking engine has a problem with feeding the gpus at
the beginning and the end of the benchmark. For standard cores this isn't a
problem, as they measure peak performance anyway, but v2 & v3 estimate
sustained speed.
The new cores could measure peak values, but it would require changes to the
cores and I'm not really sure it's worth it. Maybe it would be better to
improve the pyrit benchmarking engine.
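The difference between the two kinds of figures can be illustrated with a toy model. This is only a sketch of the idea, not how pyrit's benchmark or the cores actually compute their numbers; the batch sizes, timings and the idle time are made-up values.

```python
def peak_rate(batches):
    """Highest per-batch throughput; `batches` holds (pmks, seconds) pairs."""
    return max(pmks / secs for pmks, secs in batches)


def sustained_rate(batches, idle_seconds=0.0):
    """Total work divided by total wall time, including time the GPU sat idle."""
    total_pmks = sum(pmks for pmks, _ in batches)
    wall_time = sum(secs for _, secs in batches) + idle_seconds
    return total_pmks / wall_time


# Hypothetical run: three batches of 40000 PMKs taking 1 s each, plus 0.5 s
# in total where the GPU waited for work at the start and end of the run.
batches = [(40000, 1.0), (40000, 1.0), (40000, 1.0)]

print(peak_rate(batches))                          # per-batch best case
print(sustained_rate(batches, idle_seconds=0.5))   # lower: idle time counts
```

With a short benchmark the idle time is a large share of the wall time, so a core that reports sustained speed looks slower than one that reports its best batch.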
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 5:19
Attachments:
As far as I understand, I must:
A. clean /usr/local/lib/python2.6/site-packages.
B. recompile and install calpp-svn-1.tar.gz (from comment 102)
C. recompile and install cpyrit_calpp-v3-debug.tar.gz (from comment 114)
D. run "pyrit list_cores" and post the results.
E. run "pyrit benchmark" and post the results.
I'm asking to avoid wrong (and useless) steps.
If I misunderstood, please post the steps to be done point by point, as A, B,
C, etc.
Original comment by pyrit.lo...@gmail.com
on 25 Apr 2010 at 5:46
A. Installing calpp-svn-1 is required for >= v3 (it only needs to be installed
once).
For posting the ISA code (this is for the v3 version):
1. change line 152 in _cpyrit_calpp.cpp by removing the '//' at the front of
the line
2. recompile & install
3. run 'pyrit list_cores' - in addition to the cores it will output the kernel
ISA code
4. post the output :)
For testing the problem from comment 83 (this is for the v3-debug version):
1. compile & install v3-debug
2. run one batch without slowing down the gpu - save the output
3. run one batch while slowing down the gpu - save the output
4. post both outputs :)
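Step 1 of the ISA-code procedure above (removing the '//' from one line) can also be scripted. The sketch below is a generic helper, not part of pyrit or CAL++; `uncomment_line` is a hypothetical name, and the line number 152 comes from this thread and may differ in other versions of _cpyrit_calpp.cpp.

```python
def uncomment_line(source, lineno, marker="//"):
    """Return `source` with the leading `marker` removed from 1-based line `lineno`.

    Indentation is preserved; text that is not commented is returned unchanged.
    """
    lines = source.splitlines(keepends=True)
    line = lines[lineno - 1]
    stripped = line.lstrip()
    if not stripped.startswith(marker):
        return source  # nothing to do, the line is already active
    indent = line[: len(line) - len(stripped)]
    lines[lineno - 1] = indent + stripped[len(marker):]
    return "".join(lines)


# Small stand-in for the real file, with the commented call on line 2.
sample = "int a;\n    //self->dev_prog.disassemble(std::cout);\nint b;\n"
print(uncomment_line(sample, 2))
```

For the real file one would read _cpyrit_calpp.cpp, call `uncomment_line(text, 152)`, write the result back, and then rebuild as in step 2.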
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 6:02
A. line 152 ( self->dev_prog.disassemble(std::cout); ) uncommented.
B. python2.6 setup.py build ; python2.6 setup.py install
C. pyrit list_cores > report.txt
Please see the attached report.txt.
For the problem from comment 83:
1. compile & install cpyrit_calpp-v3-debug.tar.gz
Error: /usr/bin/ld: cannot find -lboost_date_time-mt
Original comment by pyrit.lo...@gmail.com
on 25 Apr 2010 at 6:56
Attachments:
report.txt has 0 bytes ?
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 6:57
Error: /usr/bin/ld: cannot find -lboost_date_time-mt - you need to install the
libboost date_time component.
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 6:58
Comment 42 has more info about date_time.
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 6:59
Sorry, the correct report.txt is here.
By the way, regarding "Error: /usr/bin/ld: cannot find -lboost_date_time-mt"
and comment 42, I still have not been able to solve it.
Original comment by pyrit.lo...@gmail.com
on 25 Apr 2010 at 7:08
Attachments:
When you have installed the libboost date_time component, you should have
libboost_date_time files in your /usr/lib or /usr/local/lib directory. On my
system it looks like this:
-rw-r--r-- 1 root root 144246 2009-03-27 03:28 libboost_date_time-mt.a
-rw-r--r-- 1 root root 239526 2009-03-27 03:28 libboost_date_time-mt-d.a
lrwxrwxrwx 1 root root 33 2009-06-25 16:18 libboost_date_time-mt-d.so
-rw-r--r-- 1 root root 110688 2009-03-27 03:28 libboost_date_time-mt-d.so.1.37.0
lrwxrwxrwx 1 root root 31 2009-06-25 16:18 libboost_date_time-mt.so
-rw-r--r-- 1 root root 72192 2009-03-27 03:28 libboost_date_time-mt.so.1.37.0
Now, in setup.py (line 76) there is
libraries = ['ssl', 'aticalrt', 'aticalcl', 'boost_date_time-mt'],
so the 'boost_date_time-mt' part of the file name (without the 'lib' prefix
and the extension) is what goes into setup.py.
When you have the date_time component installed, this name might be slightly
different (sometimes '-' instead of '_', or no '-mt' part). You need to change
line 76 so that the name there corresponds to the library file name.
I can't help you with installing the date_time component; this differs a lot
depending on the distribution you use.
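The name matching described above can be sketched in a few lines. This is a minimal illustration, not part of pyrit's setup.py: the helper names are hypothetical, and the default search directories are just the two mentioned in this comment (some distributions keep libraries elsewhere, for example in an architecture-specific subdirectory).

```python
import glob
import os
import re


def link_name(filename):
    """Map a library file name to the name the linker expects after -l.

    'libboost_date_time-mt.so.1.37.0' becomes 'boost_date_time-mt'.
    Returns None for files that do not look like libraries.
    """
    match = re.match(r"lib(.+?)\.(?:so|a)(?:\.[\d.]+)?$", filename)
    return match.group(1) if match else None


def boost_date_time_names(lib_dirs=("/usr/lib", "/usr/local/lib")):
    """Collect candidate names for the 'libraries' list in setup.py."""
    names = set()
    for directory in lib_dirs:
        # The wildcards cover both '_' and '-' spellings of date_time.
        for path in glob.glob(os.path.join(directory, "libboost*date*time*")):
            name = link_name(os.path.basename(path))
            if name:
                names.add(name)
    return sorted(names)


print(boost_date_time_names())
```

Whatever this prints is a candidate for the last entry of the `libraries` list; an empty list means no date_time library was found in the searched directories.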
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 7:17
Yes! It was enough to delete "-mt" in line 76!
The problem is that now only the CPUs are listed by "pyrit list_cores".
Running "pyrit benchmark" (both disturbed and undisturbed) reports the same
PMK rate for each core. Of course, if the GPU is not used, disturbing the PC
from another xterm does not matter.
What did I do wrong?
Original comment by pyrit.lo...@gmail.com
on 25 Apr 2010 at 8:01
The problem is that cpyrit_calpp fails at link time (when the module is loaded
at startup).
My guess is that you didn't run ldconfig (as root, after installing libboost
date_time).
The ldd command can show you what gets linked at run time.
Go to /usr/local/lib/python2.6/dist-packages/cpyrit/ and run
ldd _cpyrit_calpp.so
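When reading the ldd output, the lines to look for are the ones ending in "not found". A minimal sketch of that check follows; the helper names and the sample output are hypothetical, and only the "name => not found" line format is taken from ldd itself.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_module(path):
    """Run ldd on `path` and list the unresolved libraries (needs ldd installed)."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return missing_libraries(result.stdout)


# Made-up ldd output for illustration; real paths and addresses will differ.
sample = "\n".join([
    "\tlibssl.so.0.9.8 => /usr/lib/libssl.so.0.9.8 (0x00007f0000000000)",
    "\tlibboost_date_time.so.1.37.0 => not found",
])
print(missing_libraries(sample))
```

If `check_module("_cpyrit_calpp.so")` returns a non-empty list, the module cannot be loaded, which would explain why only the CPU cores show up.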
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 9:32
I need someone with a 5xxx to run the next test :).
This procedure is for version v3:
1. Overwrite the file cpyrit_calpp_kernel.cpp with the attached file.
2. Change line 152 in _cpyrit_calpp.cpp by removing the '//' at the front of
the line.
3. Rebuild & install.
4. Post the output of 'pyrit list_cores'.
PS. The command from point 4 will end with an error, but it will generate the
kernel ISA first.
Original comment by hazema...@gmail.com
on 25 Apr 2010 at 10:30
Attachments:
Test results for new Catalyst 10.4 Linux_64 drivers using calpp-svn-1 and
cpyrit_calpp-v3:
Pyrit 0.3.1-dev (C) 2008-2010 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Running benchmark (39339.6 PMKs/s)... -
Computed 41530.27 PMKs/s total.
#1: 'CAL++ Device #1 'ATI RV770'': 20184.7 PMKs/s (RTT 2.7)
#2: 'CAL++ Device #2 'ATI RV770'': 20161.6 PMKs/s (RTT 2.7)
Not a whole lot of difference. Decided to keep my OC modest at 650 core and
stock memory.
Original comment by robert.b...@gmail.com
on 28 Apr 2010 at 11:38
I've done the test from comment 125. It looks like the v3 version achieves
maximum instruction packing for 5xxx cards.
I'm officially closing this issue :). Thank you all for your help.
As soon as Lukas accepts the LowLatencyCore patch, the new version v3-2 will
be uploaded to svn.
PS. For those interested, v3-2 should be 1-2% faster than v3 :)
Original comment by hazema...@gmail.com
on 30 Apr 2010 at 1:28
This version was a big increase over the stock pyrit and CAL benchmarks.
Results for 2x 4870 are below; however, I am still seeing fewer PMKs for the
dual cards than I would see if I used them with Crossfire, where they usually
benchmark at 22000.
#1: 'CAL++ Device #1 'ATI RV770'': 15795.8 PMKs/s (RTT 1.3)
#2: 'CAL++ Device #2 'ATI RV770'': 17746.9 PMKs/s (RTT 1.2)
#3: 'CPU-Core (SSE2)': 656.8 PMKs/s (RTT 3.0)
#4: 'CPU-Core (SSE2)': 661.2 PMKs/s (RTT 2.9)
#5: 'CPU-Core (SSE2)': 662.3 PMKs/s (RTT 3.0)
#6: 'CPU-Core (SSE2)': 666.7 PMKs/s (RTT 2.6)
#7: 'CPU-Core (SSE2)': 655.4 PMKs/s (RTT 2.7)
#8: 'CPU-Core (SSE2)': 651.4 PMKs/s (RTT 2.8)
#9: 'Network-Clients': 0.0 PMKs/s (RTT 0.0)
Original comment by jezze...@gmail.com
on 11 Jul 2010 at 10:25
The problem is with HT in your system. When HT is enabled, 2 threads need to
share 1 core, so the ATI driver thread has to share a CPU with a CPU-Core
computing thread. This leads to gpu starvation and thus to the reduced
performance.
Original comment by hazema...@gmail.com
on 11 Jul 2010 at 11:00
You can manually reduce the number of CPU-Core threads or disable HT.
Use 'pyrit benchmark_long' for a much more accurate performance estimate.
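To compare runs before and after such a change, it can help to total the per-core lines of the benchmark output by type. The sketch below is only an illustration: the regular expression is written against the output lines shown in this thread, not against any documented pyrit format, and the function name is hypothetical.

```python
import re

# Matches lines such as:
#   #1: 'CAL++ Device #1 'ATI RV770'': 15795.8 PMKs/s (RTT 1.3)
CORE_LINE = re.compile(
    r"^#\d+: '(?P<name>.*)': (?P<rate>[\d.]+) PMKs/s \(RTT [\d.]+\)$"
)


def rates_by_kind(output):
    """Sum the PMKs/s of all cores, grouped into gpu / cpu / other."""
    totals = {"gpu": 0.0, "cpu": 0.0, "other": 0.0}
    for line in output.splitlines():
        match = CORE_LINE.match(line.strip())
        if not match:
            continue  # banner lines, totals, blank lines
        name = match.group("name")
        if name.startswith("CAL++ Device"):
            kind = "gpu"
        elif name.startswith("CPU-Core"):
            kind = "cpu"
        else:
            kind = "other"
        totals[kind] += float(match.group("rate"))
    return totals


# Per-core lines copied from the 2x 4870 report earlier in this thread.
report = """\
#1: 'CAL++ Device #1 'ATI RV770'': 15795.8 PMKs/s (RTT 1.3)
#2: 'CAL++ Device #2 'ATI RV770'': 17746.9 PMKs/s (RTT 1.2)
#3: 'CPU-Core (SSE2)': 656.8 PMKs/s (RTT 3.0)
#4: 'CPU-Core (SSE2)': 661.2 PMKs/s (RTT 2.9)
#5: 'CPU-Core (SSE2)': 662.3 PMKs/s (RTT 3.0)
#6: 'CPU-Core (SSE2)': 666.7 PMKs/s (RTT 2.6)
#7: 'CPU-Core (SSE2)': 655.4 PMKs/s (RTT 2.7)
#8: 'CPU-Core (SSE2)': 651.4 PMKs/s (RTT 2.8)
#9: 'Network-Clients': 0.0 PMKs/s (RTT 0.0)
"""
print(rates_by_kind(report))
```

In that report the six CPU-Core threads together contribute only a small fraction of what the two GPUs deliver, so giving up some of them to keep the GPUs fed is a cheap trade.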
Original comment by hazema...@gmail.com
on 11 Jul 2010 at 11:05
Sorry, this is an old thread, but I just got it working (OpenCL actually just
locks up my computer).
ATI 4670, no overclocking yet:
3876.24 PMKs/s
Original comment by james0p0...@googlemail.com
on 2 Dec 2010 at 10:12
Original issue reported on code.google.com by
hazema...@gmail.com
on 16 Apr 2010 at 3:44
Attachments: