Closed (JamesWekel closed this 6 months ago)
Fish,
Change made. I would prefer not to rebase my CU14 branch to pick up your two minor commits. I always mess up my git rebase attempts.
Thanks for the quick review.
Jim
> I would prefer not to rebase my CU14 branch to pick up your two minor commits. I always mess up my git rebase attempts.
That's fine. As I said, no biggie.
P.S. Is there a reason your two tests are using mainstor 16? Isn't that overkill? Looking at your tests, it seems to me that 4MB would probably be plenty!
Again, no big deal.
SEMI-RELATED QUESTION:
What kind of processor does your system have?
My system's processors are 2.93GHz X5570 Intel Xeons (it's an older system), and the speeds I'm getting are:
Before:
1,000,000 iterations of CU14 took 2,588,625 microseconds
1,000,000 iterations of CU14 took 2,352,916 microseconds
1,000,000 iterations of CU14 took 2,352,945 microseconds
1,000,000 iterations of CU14 took 2,413,947 microseconds
1,000,000 iterations of CU14 took 195,244,598 microseconds
After:
1,000,000 iterations of CU14 took 439,171 microseconds
1,000,000 iterations of CU14 took 511,245 microseconds
1,000,000 iterations of CU14 took 515,241 microseconds
1,000,000 iterations of CU14 took 624,051 microseconds
1,000,000 iterations of CU14 took 32,890,569 microseconds
Which is indeed an 83% performance improvement (or about 5.9 times faster! WELL DONE, James!), but notice how much slower my times are compared to yours! Your system is a good 3 times faster than mine!!
So I'm curious: What kind of processors do you have, and how fast are they running??
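For anyone who wants to check the arithmetic, here is a minimal sketch that converts the before/after timings listed above into a percentage reduction in run time and a speedup factor. The function and variable names are made up for illustration; only the timing values come from this thread.

```python
# Timings in microseconds, copied from the "Before" and "After" lists above.
before_us = [2_588_625, 2_352_916, 2_352_945, 2_413_947, 195_244_598]
after_us = [439_171, 511_245, 515_241, 624_051, 32_890_569]


def reduction_percent(before, after):
    """Percentage of run time saved: (before - after) / before."""
    return (before - after) / before * 100.0


def speedup_factor(before, after):
    """How many times faster the new code is: before / after."""
    return before / after


# One (reduction %, speedup factor) pair per before/after measurement.
rows = [
    (reduction_percent(b, a), speedup_factor(b, a))
    for b, a in zip(before_us, after_us)
]

for (b, a), (pct, factor) in zip(zip(before_us, after_us), rows):
    print(f"{b:>12,} -> {a:>11,} us: {pct:5.1f}% less time, {factor:4.2f}x faster")
```

The two measures are related by speedup = 1 / (1 - reduction), so an 83% reduction in run time works out to roughly 5.9 times faster. On these numbers the reduction ranges from about 74% to 83%.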
Fish,
I do appreciate all the review, including the comments. I usually have spelling errors in comments (as they are not tested!) and I don't read them after a while.
I reduced the mainstor size to 4 for CU14-01-xpage and to 8 for CU14-02-performance (which currently has a reference above the 6M boundary). I could reduce the CU14-02 memory size further, but that would require a code change, and I'm a bit lazy.
My performance numbers were from an 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz.
Jim
Fish,
Here is a proposed performance improvement for the CU14 instruction.
Before the change, the performance test reported:
and after the change:
The performance increase ranges from 70% to 83%.
It has been a year plus since my last pull request, so I'm sure that I have missed something! I appreciate your review.
Once the CU14 performance improvement is closed, I'll work on CU12 to finish Issue #101.
Jim