Open madscientist159 opened 6 years ago
@Spudz76 Leaving aside the fact that the MacOS hack is illegal, I'll put this challenge to you: show me any arbitrary code execution on the AMD PSP that would allow replacing the PSP blob with an open firmware. This is a whole new breed of DRM that is, bluntly, going to be undefeatable. That again is a relatively new concept that will take people time to understand, that no, there is no legal or even feasible way to unlock or free a consumer device protected with this new technology. Fundamentally, we lack the ability to modify CPUs that were already manufactured at the silicon level, and that's what would be required here. :smile:
Also bear in mind that markets adjust for minimum perceived cost, not optimal feature set, even where that feature set would generally lead to an improved society. They operate on cost alone over a time window in the range of decades, and that starts to break down when starting a new manufacturing process from first principles would take longer than the market's "sliding window" for adjusting the types of products manufactured.
Does this work? Apparently they woke up and updated, and it includes an int_sqrt33_1_double_precision_fast that is apparently... fast?
Oh, it's just a LUT hack, so of course it's fast. I wonder how huge the binary gets, though.
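For anyone wondering what a "LUT hack" for a square root looks like in general, here is a minimal Python sketch. It is not the actual int_sqrt33_1_double_precision_fast routine, which I have not reproduced here; the function name `lut_isqrt`, the 256-entry table, and the 32-bit input range are all assumptions made for illustration. The idea is simply that a small precomputed table supplies a good first guess, so only a little refinement is needed afterwards.

```python
import math

# Hypothetical 256-entry table: LUT[i] = floor(sqrt(i << 8)).
# The table size is the speed/binary-size trade-off: more entries give a
# better first guess but a bigger binary.
LUT = [math.isqrt(i << 8) for i in range(256)]


def lut_isqrt(n: int) -> int:
    """Return floor(sqrt(n)) for 0 <= n < 2**32 using a table-seeded guess."""
    if not 0 <= n < 1 << 32:
        raise ValueError("n must fit in 32 bits")
    if n == 0:
        return 0

    # Normalize by an even shift so that 2**30 <= m < 2**32.
    shift = (32 - n.bit_length()) & ~1
    m = n << shift

    # The top 8 bits of m index the table; scale the entry back to n's range.
    x = (LUT[m >> 24] << 8) >> (shift // 2)

    # One Newton step sharpens the table estimate.
    x = (x + n // x) // 2

    # Final correction so the result is exactly floor(sqrt(n)).
    while x * x > n:
        x -= 1
    while (x + 1) * (x + 1) <= n:
        x += 1
    return x
```

With only 256 small entries the table costs a few hundred bytes, so a table of this size would not bloat a binary much; a table large enough to skip the refinement step entirely would be a different story.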
I don't actually think that the Power CPUs are any slower at the actual algo than Intel/AMD CPUs.
https://en.wikichip.org/wiki/ibm/microarchitectures/power9
It seems that one base model has 12 cores but 120MB of L3 in the form of fast eDRAM. So the main reason they perform better than Intel/AMD is that their cores are fed by 4-5 times more L3 per core.
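The arithmetic behind that ratio is easy to check. The sketch below uses the 12-core / 120MB figures quoted above and the 2MB CryptoNight scratchpad size; the 2 to 2.5MB of L3 per core used for the x86 comparison is my assumption for a typical part of that era, not a number from this thread, and the function names are made up for the example.

```python
SCRATCHPAD_MB = 2  # CryptoNight scratchpad size per hash


def l3_per_core_mb(total_l3_mb: float, cores: int) -> float:
    """L3 cache available to each core, in MB."""
    return total_l3_mb / cores


def scratchpads_per_core(total_l3_mb: float, cores: int) -> int:
    """How many whole scratchpads fit in one core's share of L3."""
    return int(l3_per_core_mb(total_l3_mb, cores) // SCRATCHPAD_MB)


power9 = l3_per_core_mb(120, 12)        # 10.0 MB per core
fits = scratchpads_per_core(120, 12)    # 5 scratchpads per core

# Assumed x86 figures (illustrative only): 2.0 to 2.5 MB of L3 per core.
ratio_low = power9 / 2.5                # 4.0
ratio_high = power9 / 2.0               # 5.0
```

So each core has room for five scratchpads at once, which is exactly the situation the x5 algos are built for.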
In fact, the Crystalwell CPUs (for which xmr-stak's 5x algos were first developed) can also do >100h/s per core on CNv0 and CNv1 despite being 2013 CPUs, and they do this by means of a 128MB L4 eDRAM cache.
This is exactly what CNv2 was intended to defeat: specialized devices such as ASICs and FPGAs with more fast memory than compute cores. As a result, it's only to be expected that the cache-rich Power CPUs are penalized.
In fact, if you run the x5 scratchpad algo on a small subset of cores on a high-core-count Xeon, you can also achieve hashrates of >150h/s on CNv0 and CNv1.
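To make "a small subset of cores" concrete, here is a rough way to size the thread count so that every thread's scratchpads stay in cache. The 55MB / 22-core Xeon is a hypothetical example part, `max_threads` is a name invented for this sketch, and real tuning would also have to account for cache shared with other work.

```python
def max_threads(cache_mb: int, cores: int, hashes_per_thread: int,
                scratchpad_mb: int = 2) -> int:
    """Threads that fit in cache when each thread runs several hashes at once.

    Each thread needs hashes_per_thread * scratchpad_mb of cache; the result
    is also capped at the number of physical cores.
    """
    by_cache = cache_mb // (hashes_per_thread * scratchpad_mb)
    return min(by_cache, cores)


# Hypothetical Xeon: 55 MB of L3, 22 cores.
x5_threads = max_threads(55, 22, hashes_per_thread=5)  # 5 threads, 10 MB each
x1_threads = max_threads(55, 22, hashes_per_thread=1)  # 22 threads, capped by cores
```

On that example part the x5 configuration uses only 5 of the 22 cores, each working on five scratchpads at a time.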
There is no official support for ppc64el systems. A port was already made for CNv7 (https://github.com/nioroso-x3/xmr-stak) but CNv8 appears to require brand new assembly support routines.