riscvarchive / riscv-gcc

GNU General Public License v2.0

Instruction Scheduling Support #124

Open rishikhan opened 6 years ago

rishikhan commented 6 years ago

Does riscv-gcc support -fschedule-insns or -fschedule-insns2? I ask because my applications often access data that isn't in the cache yet (i.e. random accesses). To hide the load latency, I tried unrolling my loop. However, some of my C++ functions do something like this:

a = a | 0x1;

In this case, the function is inlined, but the load and its use end up next to each other. GCC doesn't seem to make any attempt to separate the loads from their uses, like this:

load t1,a
load t2,b
load t3,c
load t4,d
ori t1,t1,0x1
ori t2,t2,0x1
ori t3,t3,0x1
ori t4,t4,0x1

Instead I get:

load t1,a
ori t1,t1,0x1
load t2,b
ori t2,t2,0x1
load t3,c
ori t3,t3,0x1
load t4,d
ori t4,t4,0x1
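A minimal sketch of a reproducer for the pattern described above, written in C for brevity. The names set_low_bit and mark4, and the file name in the comment, are hypothetical and do not come from the report. The pointers are marked restrict on the assumption that the four locations never alias; without that, the compiler may not move a later load above an earlier store regardless of the pipeline model.

```c
#include <stdint.h>

/* Hypothetical stand-in for the small inlined helper from the report. */
static inline uint32_t set_low_bit(uint32_t a)
{
    return a | 0x1u;
}

/*
 * Four independent load / OR / store chains, as a hand-unrolled loop
 * body would produce. "restrict" is an assumption of this sketch: it
 * tells the compiler the four locations do not overlap, so the loads
 * are free to be hoisted above the stores.
 *
 * One possible way to inspect the generated code (file name is
 * hypothetical):
 *   riscv64-unknown-elf-gcc -O2 -fschedule-insns -fschedule-insns2 -S mark4.c
 */
void mark4(uint32_t *restrict a, uint32_t *restrict b,
           uint32_t *restrict c, uint32_t *restrict d)
{
    *a = set_low_bit(*a);
    *b = set_low_bit(*b);
    *c = set_low_bit(*c);
    *d = set_low_bit(*d);
}
```

Comparing the assembly with and without the scheduling flags should show whether the loads are being grouped ahead of the ori instructions.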

kito-cheng commented 6 years ago

Yes, it's supported, but I think the problem is that the current riscv-gcc has no pipeline model that is correct for your CPU.

rishikhan commented 6 years ago

How do I get that?

kito-cheng commented 6 years ago

Hmmm, you need to know your CPU's pipeline and implement a pipeline description for it in GCC [1].

[1] https://gcc.gnu.org/onlinedocs/gccint/Processor-pipeline-description.html#Processor-pipeline-description

rishikhan commented 6 years ago

The pipeline model seems to be defined in gcc/config/riscv/generic.md. That would suggest the scheduler should move the use at least 3 instructions away from the load, but I don't see that happening. I'll make a small test case and attach it.
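To illustrate why the distance between a load and its use matters, here is a toy cycle count for the two orderings from the first comment. It is only a sketch under stated assumptions: a single-issue, in-order core where an instruction stalls until its source register is ready, and a load whose result becomes usable a fixed number of cycles after issue. The Insn type and the function names are hypothetical, and this is not how GCC's scheduler is implemented.

```c
#include <assert.h>

#define NUM_REGS 8

/* One instruction in the toy model (hypothetical type). */
typedef struct {
    int dest;     /* register written */
    int src;      /* register read, or -1 if none */
    int latency;  /* cycles from issue until dest is usable */
} Insn;

/* Total cycles to issue the sequence on a single-issue, in-order core. */
static int count_cycles(const Insn *seq, int n)
{
    int ready[NUM_REGS] = {0};  /* first cycle each register is usable */
    int cycle = 0;              /* next free issue slot */

    for (int i = 0; i < n; i++) {
        assert(seq[i].dest >= 0 && seq[i].dest < NUM_REGS);
        assert(seq[i].src < NUM_REGS);
        /* Stall until the source operand has been produced. */
        if (seq[i].src >= 0 && ready[seq[i].src] > cycle)
            cycle = ready[seq[i].src];
        ready[seq[i].dest] = cycle + seq[i].latency;
        cycle++;
    }
    return cycle;
}

/* load t1; ori t1; load t2; ori t2; ... (what the report observes) */
int interleaved_cycles(int load_latency)
{
    Insn seq[8];
    for (int k = 0; k < 4; k++) {
        seq[2 * k]     = (Insn){ k + 1, -1, load_latency };
        seq[2 * k + 1] = (Insn){ k + 1, k + 1, 1 };
    }
    return count_cycles(seq, 8);
}

/* load t1..t4 first, then ori t1..t4 (what the report hoped for) */
int grouped_cycles(int load_latency)
{
    Insn seq[8];
    for (int k = 0; k < 4; k++) {
        seq[k]     = (Insn){ k + 1, -1, load_latency };
        seq[4 + k] = (Insn){ k + 1, k + 1, 1 };
    }
    return count_cycles(seq, 8);
}
```

Under these assumptions, with a load latency of 3, interleaved_cycles(3) returns 16 and grouped_cycles(3) returns 8: each ori in the interleaved order waits two extra cycles for its load, while the grouped order never stalls.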