SDL-Hercules-390 / hyperion

The SDL Hercules 4.x Hyperion version of the System/370, ESA/390, and z/Architecture Emulator

TRTE Performance Improvement #508

Closed JamesWekel closed 2 years ago

JamesWekel commented 2 years ago

Fish,

Here is a proposed performance improvement for the TRTE instruction. I would appreciate your review, especially of how it handles the cross-page requirements. I hope that I'm getting closer to understanding the coding requirements and style.

Before the change, the performance test reported:

1,000,000 iterations of TRTE took 4,080,617 microseconds
1,000,000 iterations of TRTE took 4,192,078 microseconds
1,000,000 iterations of TRTE took 15,322,760 microseconds
1,000,000 iterations of TRTE took 15,979,662 microseconds

and after the change:

1,000,000 iterations of TRTE took 925,913 microseconds
1,000,000 iterations of TRTE took 1,402,331 microseconds
1,000,000 iterations of TRTE took 2,292,343 microseconds
1,000,000 iterations of TRTE took 2,804,732 microseconds
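For anyone who wants to check the arithmetic, here is a minimal sketch of how the improvement can be derived from the timings quoted above. It assumes the four "before" lines pair up with the four "after" lines in the order shown. The function and array names are hypothetical and are not part of Hercules or the test suite.

```c
#include <assert.h>
#include <stddef.h>

/* Percentage reduction in elapsed time: 100 * (before - after) / before. */
double pct_improvement(double before_us, double after_us)
{
    assert(before_us > 0.0);
    return 100.0 * (before_us - after_us) / before_us;
}

/* Mean of the per-pair percentage improvements over n paired timings. */
double mean_pct_improvement(const double *before_us,
                            const double *after_us, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += pct_improvement(before_us[i], after_us[i]);
    return sum / (double)n;
}

/* Timings quoted above, in microseconds per 1,000,000 iterations.
   Pairing by position is an assumption. */
static const double trte_before_us[] = { 4080617, 4192078, 15322760, 15979662 };
static const double trte_after_us[]  = {  925913, 1402331,  2292343,  2804732 };
```

Calling `mean_pct_improvement(trte_before_us, trte_after_us, 4)` averages the four per-test reductions. This treats every test equally rather than weighting by elapsed time, which seems the fairer choice when the tests differ so much in duration.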

The TRTE-02-performance test exercises the TRTE instruction with m3=12, where the FC table is 128K in length, each FC is 2 bytes, and each argument character is 2 bytes. This should be the worst-performing TRTE case because of the additional page boundary tests. Over three test runs, the average improvement was 78%.
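To show where the 128K figure comes from, here is a small sketch that decodes the m3 field and computes the FC table size. The bit assignments (A = 2-byte argument characters, F = 2-byte function codes, L = argument-character limit) are my reading of the Principles of Operation, and the macro and function names are hypothetical, not the ones used in the Hercules source.

```c
#include <assert.h>
#include <stdint.h>

/* m3 bits as I understand them from the Principles of Operation
   (hypothetical names, not Hercules identifiers). */
#define TRTE_M3_A  0x8u   /* argument characters are 2 bytes      */
#define TRTE_M3_F  0x4u   /* function codes are 2 bytes           */
#define TRTE_M3_L  0x2u   /* limit: arguments above 255 get FC 0  */

/* Length in bytes of one argument character. */
unsigned trte_arg_len(unsigned m3)
{
    assert(m3 <= 0xFu);               /* m3 is a 4-bit field */
    return (m3 & TRTE_M3_A) ? 2u : 1u;
}

/* Length in bytes of one function code. */
unsigned trte_fc_len(unsigned m3)
{
    assert(m3 <= 0xFu);
    return (m3 & TRTE_M3_F) ? 2u : 1u;
}

/* Size in bytes of the function-code table that may be referenced. */
uint32_t trte_fc_table_bytes(unsigned m3)
{
    /* 65536 entries only for 2-byte arguments without the limit bit;
       otherwise at most 256 entries can be indexed. */
    uint32_t entries =
        ((m3 & TRTE_M3_A) && !(m3 & TRTE_M3_L)) ? 65536u : 256u;

    return entries * trte_fc_len(m3);
}
```

With m3=12 (binary 1100) both A and F are set and L is clear, so the table has 65536 entries of 2 bytes each, which is 131072 bytes or 128K. At 4K per page, a table of that size covers at least 32 pages, which is why this case stresses the page boundary handling more than the 256-entry variants.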

Thanks for your review. I'm sure that there is something that I've missed.

Jim

JamesWekel commented 2 years ago

I'll resubmit after fixing the conflict.

Jim