ROCm / ROCm-Device-Libs

ROCm Device Libraries
97 stars 60 forks source link

Add clock functions #21

Closed guansong closed 7 years ago

b-sumner commented 7 years ago

Most of these functions can and should be implemented in C using the irif header. I doubt any user code directly calls those nvvm.barrier instrinsics but if it does we need to find out why and fix it.

guansong commented 7 years ago

I need to bring Greg's attention here and needs his opinion on this.

I put two changes in this PR, one in CL one in LL. They both come from our nvptx intrinsic wrappers. Ideally the barrier should be written in CL as well (I split the original wrapper file just to maximize the usage of CL). But the naming convention there (with a .) prevents me to do so.

Brian suggested to fix the original place of using them. I see we have lot of similar functions used that way. Do we know those are from users or compiler generated code?

b-sumner commented 7 years ago

I suppose it is possible the front end is directly generating those intrinsic calls. In that case you can still implement in C by first declaring the function and then implementing it, like this

extern void llvm_nvvm_barrier0(void) asm("llvm.nvvm.barrier0"); void __llvm_nvvm_barrier0(void) { / implementation here / }

I am not really understanding why those barrier functions you listed do not have an additional "llvm." prefix. And I have no idea what these functions are actually supposed to do. Your patch shows barrier1 has two i32 parameters. What do they do? Can you find the locations in the original source code corresponding to those calls?

guansong commented 7 years ago

I see.

How about the clock routines? Are those OK?

I will ask Greg more for the frontend part. My guess is that it maybe either Greg or Hui mapped things that way. I will double check to see the rationals behind that.

b-sumner commented 7 years ago

The clock functions are OK, although I would prefer an explicit cast when discarding the upper half of the memrealtime.

guansong commented 7 years ago

I spent some time try to further split the wrapper functions from the original file, I can see other things like

  1. workitem functions
  2. bits counting functions
  3. minum on int

are in the same situations as barrier. I will ask Greg about this in our meeting.

Looks like PR has to be closed. But, I will put a few new functions here to see how do you prefer them to be added.

guansong commented 7 years ago

Thanks. I will check that direction. I will definitely close this PR, but before that, let's get the best use of it, I like to see your opinion on one more function.

guansong commented 7 years ago

I see. Thanks!