arielcanosa / pyrit

Automatically exported from code.google.com/p/pyrit

Nvidia Cuda vs Cal++ #268

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What would it take to speed up Nvidia cards in the same way ATI cards have been 
accelerated? I'm just trying to understand why the same cannot be done with a 
card that is more powerful than the ATI cards, to achieve better results. I 
have been googling a lot and cannot find much information about this. I have 
started a thread in the Nvidia forums as well. 

Original issue reported on code.google.com by haykey...@gmail.com on 25 Feb 2011 at 11:26

GoogleCodeExporter commented 8 years ago
NVIDIA has no hardware bitwise instruction; it has to be emulated in 
software.
What matters is the number of cores, the core clock, and the number of 
instructions that must be executed.
The top NVIDIA card has 512 cores; a mid-range ATI card has 800. The kernel 
takes about 1300 operations, but thanks to the bitwise instruction ATI needs 
fewer than 850. Got it?
NVIDIA's clock is 100-200 MHz faster than ATI's, yes, but that cannot make up 
for ATI's advantage above.
So, at least from Pyrit's point of view, ATI is more powerful than NVIDIA, not 
the other way around.
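
The reasoning above can be turned into a rough back-of-the-envelope estimate. The sketch below is a simplified model, not a benchmark: it assumes throughput scales as cores x clock / instructions per PMK, uses only the core and instruction counts quoted in this comment, and ignores every other architectural difference. The function and variable names are made up for illustration.

```python
def relative_throughput(cores: int, instructions: int, clock: float = 1.0) -> float:
    """Naive model: work per unit time = cores * clock / instructions."""
    return cores * clock / instructions


# Figures quoted in the comment; both clocks are set to 1.0 (equal clocks).
nvidia = relative_throughput(cores=512, instructions=1300)
ati = relative_throughput(cores=800, instructions=850)

# How much faster NVIDIA's clock would have to be to match ATI in this model.
required_clock_ratio = ati / nvidia
print(f"NVIDIA would need about {required_clock_ratio:.2f}x ATI's clock")
```

Under this model NVIDIA would need roughly 2.4 times ATI's clock to break even, far more than a 100-200 MHz edge provides.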

Have a look at the previously closed issue; hazeman posted a more detailed comment about this there.

Original comment by pyrit.lo...@gmail.com on 26 Feb 2011 at 1:22

GoogleCodeExporter commented 8 years ago
The story so far...
Pyrit's first implementation did its computation in a simple CUDA core, which 
was ported to a more standard OpenCL core essentially by copy/paste with minor 
modifications; their functionality is still the same today.
Minor improvements were made, and ATI and NVIDIA cards ran the same OpenCL 
core until hazeman's libcal arrived.
He targeted the computational core at ATI's hardware in a radical way: 
redesigning data accesses, adding C++ support in the kernel, implementing 
targeted instructions, and redesigning the algorithm.
About the kernel:
The CAL++ kernel computes with uint2 to produce the entire PMK output at the 
same time, while the CUDA/OpenCL kernel runs its 4096 rounds twice to get the 
same output.
uint2 is part of standard OpenCL, so something could be changed for the 
CUDA/OpenCL kernels too...
Just code it and try!
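
For context, the PMK these kernels compute is defined by WPA/WPA2-PSK as PBKDF2-HMAC-SHA1 over the passphrase and SSID, with 4096 iterations and a 256-bit output. SHA-1 yields only 160 bits per block, so two blocks are needed, which is the "4096 rounds twice" mentioned above. The following is a minimal CPU-side sketch in Python, not Pyrit's kernel code; the helper names and the sample passphrase and SSID are placeholders.

```python
import hashlib
import hmac
import struct

ITERATIONS = 4096   # fixed by WPA/WPA2-PSK
PMK_LEN = 32        # 256-bit PMK
SHA1_LEN = 20       # SHA-1 digest size in bytes


def pbkdf2_block(passphrase: bytes, ssid: bytes, index: int) -> bytes:
    """One PBKDF2 output block: XOR of 4096 chained HMAC-SHA1 results."""
    u = hmac.new(passphrase, ssid + struct.pack(">I", index), hashlib.sha1).digest()
    acc = int.from_bytes(u, "big")
    for _ in range(ITERATIONS - 1):
        u = hmac.new(passphrase, u, hashlib.sha1).digest()
        acc ^= int.from_bytes(u, "big")
    return acc.to_bytes(SHA1_LEN, "big")


def pmk(passphrase: bytes, ssid: bytes) -> bytes:
    """Two independent 4096-round blocks, truncated from 40 to 32 bytes."""
    blocks = pbkdf2_block(passphrase, ssid, 1) + pbkdf2_block(passphrase, ssid, 2)
    return blocks[:PMK_LEN]


key = pmk(b"password", b"IEEE")
print(key.hex())
```

The two blocks do not depend on each other, which is why a kernel can in principle work on both at once (for example in the two lanes of a uint2) instead of running the 4096-round loop twice.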

Original comment by masterzorag on 26 Feb 2011 at 1:48

GoogleCodeExporter commented 8 years ago
I wish I knew how to write these programs. If anyone wants to try, I have no 
problem donating to the cause! I don't really want to get rid of my three 
water-cooled GTX 480s... 

Original comment by haykey...@gmail.com on 26 Feb 2011 at 6:10

GoogleCodeExporter commented 8 years ago
Where would a beginner find enough information to start coding this type of 
software? I'm pretty noobish here but do know a little basic programming... A 
lot of example programs would be nice as well. It really hurts me to know 
there is all this untapped power in these GPUs.

Original comment by haykey...@gmail.com on 27 Feb 2011 at 10:42

GoogleCodeExporter commented 8 years ago
I have 3x GTX 480 so almost 1500 cores and I can still only achieve about 
~90000 PMKs. Please can we get better CUDA operation? Maybe we can find an 
operation comparable to bitwise for Nvidia as well?

Original comment by haykey...@gmail.com on 5 Mar 2011 at 3:03

GoogleCodeExporter commented 8 years ago
"I can still only achieve about ~90000 PMKs"

I'm going to bite my tongue on this one...

Original comment by mrfantas...@aol.com on 5 Mar 2011 at 6:13

GoogleCodeExporter commented 8 years ago
Lol, I know it sounds bad (only 90000 PMKs), but there's room for big 
improvements. How can we donate to Pyrit? I want to help make this 
happen!

Original comment by haykey...@gmail.com on 5 Mar 2011 at 8:12

GoogleCodeExporter commented 8 years ago
http://flattr.com/thing/104890/GPU-accelerated-attack-against-WPA-PSK-authentication
is the place to donate.

Original comment by pyrit.lo...@gmail.com on 5 Mar 2011 at 11:35

GoogleCodeExporter commented 8 years ago
Wow, there must be an easier way to donate than having to sign up for a 
service where you MUST donate every month. Why don't you set up a PayPal 
account so people don't have to sign up for this flattr junk?

Original comment by haykey...@gmail.com on 5 Mar 2011 at 6:12

GoogleCodeExporter commented 8 years ago
Hmmm, there is a new CUDA release candidate, 4.0. Maybe this will help speed up 
Nvidia cards a bit more...

Original comment by haykey...@gmail.com on 7 Mar 2011 at 11:33

GoogleCodeExporter commented 8 years ago
An Nvidia GTX 480 has 512 cores, so 3 GTX 480s = 1536 cores.
An ATI HD5870 has 1600 cores. Thanks to the bitwise instruction, the HD5870 can do 82000 PMK/s.
If there is NO bitwise instruction in NVIDIA's Fermi GPU, there is nothing to 
be done; the GTX 480 will never be able to run as ATI does.

At the moment, NVIDIA is NOT the right hardware to bet on. Let's wait for 
Fermi 2 and hope it has a hardware bitwise instruction. That could maybe 
double the number of PMK/s, but even then Fermi would still not have enough 
cores to compete with ATI.
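
Using only the figures reported in this thread (about 90000 PMK/s for three GTX 480s, 82000 PMK/s for one HD5870), the per-card gap works out as follows. This is plain arithmetic on self-reported numbers, not a controlled benchmark; the variable names are illustrative.

```python
# Figures as reported by commenters in this thread.
gtx480_total_pmks = 90000   # three GTX 480 cards combined
gtx480_cards = 3
hd5870_pmks = 82000         # a single HD5870

per_gtx480 = gtx480_total_pmks / gtx480_cards
ratio = hd5870_pmks / per_gtx480

print(f"PMK/s per GTX 480: {per_gtx480:.0f}")
print(f"HD5870 vs one GTX 480: {ratio:.2f}x")
```

By these numbers one HD5870 does the work of a little under three GTX 480s.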

Original comment by pyrit.lo...@gmail.com on 21 Mar 2011 at 6:16

GoogleCodeExporter commented 8 years ago
And to make matters worse for NVidia users, ATI just made the bitwise 
instruction available in IL (driver 11.9); this should give a ~10% speedup to 
the kernel. I should make a new version of the CAL++ core available soon(TM).

I'm afraid the Nvidia kernel can't be improved much. Maybe 10-20% could be 
achieved by applying some of the optimizations made in the CAL++ core (I'm 
talking mostly about the CPU side, not the GPU kernel).

Original comment by hazema...@gmail.com on 5 Oct 2011 at 4:07

GoogleCodeExporter commented 8 years ago
* bitwise=bitselect :) 
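
For readers unfamiliar with it: OpenCL's bitselect(a, b, c) builds its result by taking each bit from b where the corresponding bit of c is 1, and from a where it is 0. SHA-1's "choose" and "majority" round functions can both be written in this shape, so hardware with a native bit-select can evaluate each in fewer instructions than the plain AND/OR/NOT form. The Python sketch below emulates this on 32-bit words; whether the CAL++ kernel applies it to exactly these functions is an assumption here, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def bitselect(a: int, b: int, c: int) -> int:
    """Emulates OpenCL bitselect on 32-bit words: bit from b where c is 1, else from a."""
    return ((a & ~c) | (b & c)) & MASK32


def sha1_ch_plain(x: int, y: int, z: int) -> int:
    """SHA-1 'choose' written with separate AND / NOT / OR operations."""
    return ((x & y) | (~x & z)) & MASK32


def sha1_ch_bitselect(x: int, y: int, z: int) -> int:
    """The same function as a single bit-select: x picks between y and z."""
    return bitselect(z, y, x)


def sha1_maj_plain(x: int, y: int, z: int) -> int:
    """SHA-1 'majority' written with AND / OR operations."""
    return ((x & y) | (x & z) | (y & z)) & MASK32


def sha1_maj_bitselect(x: int, y: int, z: int) -> int:
    """Majority as one XOR plus one bit-select."""
    return bitselect(x, y, x ^ z)


# Sample 32-bit words, used only to check that both forms agree.
x, y, z = 0x67452301, 0xEFCDAB89, 0x98BADCFE
assert sha1_ch_plain(x, y, z) == sha1_ch_bitselect(x, y, z)
assert sha1_maj_plain(x, y, z) == sha1_maj_bitselect(x, y, z)
```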

Original comment by hazema...@gmail.com on 5 Oct 2011 at 4:10

GoogleCodeExporter commented 8 years ago
"And to make matter worse for NVidia users, ATI just made...." eheheh.
Day by day I am happier that I DID NOT buy NVIDIA, eheheh. ATI is the best! (at 
least for Pyrit)
Hazeman, I can't wait for your next version of CAL++ :D

Original comment by pyrit.lo...@gmail.com on 7 Oct 2011 at 1:58