daverigby / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
1 stars 0 forks source link

[patch] Add preliminary MIPS32/MIPS64 atomicops support #361

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. cd google-perftools-1.8.2/ && patch -p1 < 
../google-perftools-1.8.2-mips.patch
2. -- configure --  *ensure that you read the README_mips.txt
3. make
4. Copy the shared/static libraries and link

What is the expected output? What do you see instead?

Existing MIPS support is not available. No atomicops are present, and some of 
the system call infrastructure needs tweaking. This patch addresses these 
issues. This patch DOES NOT add full featured perftools support.

What version of the product are you using? On what operating system?

This patch is generated against 1.8.2 of the google perf tools (which is the 
latest at the time of this writing). 

Please provide any additional information below.

This patch has only been thoroughly tested on the octeon hardware platform, 
with mips32 application running under linux. Mips64 has been tested only 
insofar as the unit tests seem to pass. Your mileage may vary.

Original issue reported on code.google.com by aa...@bytheb.org on 23 Aug 2011 at 4:19

GoogleCodeExporter commented 9 years ago
Resubmitting the patch - accidentally provided a non-working 1.8.2 mips32 build.

Original comment by aa...@bytheb.org on 23 Aug 2011 at 7:15

Attachments:

GoogleCodeExporter commented 9 years ago
This is great!  Thanks for the patch.

Can you sign the CLA at
   http://code.google.com/legal/individual-cla-v1.0.html
?  I'll look to get this into the next release.

Original comment by csilv...@gmail.com on 23 Aug 2011 at 10:24

GoogleCodeExporter commented 9 years ago

Original comment by csilv...@gmail.com on 23 Aug 2011 at 10:24

GoogleCodeExporter commented 9 years ago
I've looked over this patch, and most of it looks good to me (though of course 
I can't analyze the mips assembly and the like).  A few questions:

1) I've never head of sig_rt_sigprocmask, and a google search shows no results. 
 Do you have any documentation on this?  If we're going to add an #ifdef and a 
strange symbol to linuxthreads.h, it would be great to have a bit more 
documentation.  Speaking of the #ifdef, is there any way to tell when to use it 
besides the configure-time check?  (A macro defined along with it, perhaps?)

2) It looks like HAVE_MMAP64 in malloc_hook_mmap_linux.h is new.  Is this 
something you define manually when compiling?  Why is it necessary?

3) It's not ideal folks may have to #define __MIPS64__ themselves.  Is there a 
#define that all compilers use for 64-bit mode in mips that we can take 
advantage of?  A web search shows people using __mips64__; is that more 
widespread than __MIPS64__?

4) Given that the stack-unwinding is just a stub, is there a reason to add it 
at all?  I think better would be to just modify configure.ac to say that we 
only build the 'minimal' libraries for mips (like we do right now for mingw).  
The 'minimal' build should not try to use (or even compile) the stack-tracing 
files, so it should work ok.

Everything else looks great!

Original comment by csilv...@gmail.com on 26 Aug 2011 at 11:16

GoogleCodeExporter commented 9 years ago
One more question, from the linux_syscall_support.h author:
---
I have the hardest time finding an authoritative document explaining the MIPS
calling conventions. The best I could find was some sample code in glibc. But in
either case, register 25 seems wrong. Can you get confirmation from the original
author that this is correct? And maybe a reference to where this behavior is
documented?
---

Any help you can give to document this, would be great!

Original comment by csilv...@gmail.com on 27 Aug 2011 at 12:43

GoogleCodeExporter commented 9 years ago
I found a MIPS expert to look at the atomicops changes.  Here's his feedback.  
Can you prepare a new patch with this in mind?  Thanks!
---
The documentation I find for ll and sc say in both cases:
    "This instruction implicitly performs a SYNC operation; ..."
so I think that some of your explicit barriers may be unnecessary.
This would also allow you to get rid of things like
OSAtomicCompareAndSwap32Acquire().

You shouldn't be defining names like OSAtomicCompareAndSwap32() in ways that
can be seen by clients.   The main header file does not admit to declaring them.

Aside: As an old MIPS coder, I was initially alarmed at the use of the
result immediately
after the load within a "noreorder" region, since that would violate
the pipeline rules
on the MIPS1 architecture.   However, MIPS1 has no
load-locked/store-conditional,
and I think that later architecture versions relaxed the rules.    You
might write
a comment "no delayed loads after MIPS1; no ll/sc in MIPS1",
though I suppose few people will know to be concerned in the first place.

I am suspicious of the  gcc asm arguments.
For example, on lines 63,64, I would not write:
     : "=&r" (result), "=&r" (temp), "=m" (*value)
     : "Ir" (amount), "m" (*value)
on the grounds that it is essential that the data in the input operand
not merely be
a copy of the data found at the location of the last output operand,
but must use the same location.    So I would have tried
to write the input operand as   "2" (*value)   to force it to use the
same location.
Perhaps this cannot be a problem, but the gcc documentation says:
    "Only a number in the constraint can guarantee that one operand
will be in the
    same place as another. The mere fact that foo is the value of both
operands is not
    enough to guarantee that they will be in the same place in the
generated assembler code."
---

Original comment by csilv...@gmail.com on 27 Aug 2011 at 12:46

GoogleCodeExporter commented 9 years ago
Addressing your comments:

1) Ray Balogh is awaiting approval from Sycamore Networks' legal department to 
greenlight it. The company, from engineering and management, are all on board, 
just have to dot the lower-case j's.

2) sys_rt_sigprocmask is, as I understand it, a "new" revision of the API. At 
the very least, sys_sigprocmask wasn't exposed by our C library, and 
sys_rt_sigprocmask was. I'm no posix 'nor kernel authority so take that with a 
grain of salt.
The query I used was simply "sys_rt_sigprocmask" and started going through the 
results (one from kernel.org and one from SGI). The information on old/new 
could be bogus, not 100% sure.

3) I'd thought HAVE_MMAP64 was there originally, but it looks like, from the 
patch, it wasn't. Could be an error from our original porting efforts. I'll 
remove it and re-test.

4) I tested with gnu c, built from open source, and from CodeSourcery, versions 
4.3.3, 4.1.2, and 4.1.2 after a wind-river rebrand. I couldn't even find a 
common 64-bit define. However, one of them had __MIPS64__ defined so I went 
with that. I agree, forcing people to #define their own symbols is bad - but 
haven't found any standard on it anywhere. I'll switch to lowercase __mips64__ 
if you think this is more prevalent.

5) Will modify to eliminate the need for building the stubs.

6) Register 25, from my reading of See Mips Run, and a number of gcc info 
documents, was the only one that had been reserved for general use and didn't 
result in a conflict from the compiler w.r.t. register clobbering. Originally, 
you'll see that it was reusing some register (r9 I think, which didn't happen 
in the 1.7.1 version - not sure what changed there) and the compiler 
complained. After a 2-day google binge, I found that r24 and r25 are typically 
general use temporaries which aren't preserved across calls, so using them in 
this fashion seemed ok.

First hit on google, btw, http://www.cs.clemson.edu/~mark/subroutines/mips.html 
but I also believe I saw something similar in See Mips Run (again, could be 
wrong).

7) Much of the assembly came from looking at external places where people had 
already done these routines. For instance, the linux kernel, and additionally, 
some java vm modification. The primary author of the ASM support was Ray, so 
I'll check with him.

-Aaron

Original comment by aa...@bytheb.org on 27 Aug 2011 at 1:34

GoogleCodeExporter commented 9 years ago
(1) OK.

(2) Maybe rt_sigprocmask is for realtime signals, and sigprocmask is for all 
signals?  I am basing this on a guessing-reading of 
http://www.koders.com/c/fid6E4E080FBEE796EB5090C3AC46CD57D19B31B0FC.aspx?s=queue
.  In any case, we should figure this out; I feel this change as a bit 
cargo-cult, and want to understand what's actually going on here.

(3) OK, let us know how this turns out.

(4) I have no idea, I just found __mips64__ in a web search.  I guess we can 
always fix it if someone who knows more about this comes along.

(5) Great!

(6) Are you saying that r8 (which is what was there before) *should* work 
according to the docs, but the compiler was complaining?  I'm not sure I 
understand this, but I don't understand much about asm programming at all.

(7) OK, it would be great if he could respond to the reviewer's comments, so we 
can make sure we're all on the same page.

Original comment by csilv...@gmail.com on 30 Aug 2011 at 12:15

GoogleCodeExporter commented 9 years ago
csilv, thanks a lot for the constructive comments.   I'm also on the learning 
curve, but here are my responses to the issues:

(1) Still waiting for our legal dept. to render a decision, although this 
should really be a formality.  Sorry for the delay.  There had been some 
confusion involved.

(2) According to
http://www.kernel.org/doc/man-pages/online/pages/man2/syscalls.2.html:

*  The rt_sig* calls were added in kernel 2.2 to support the addition of real-
   time signals (see signal(7)).  These system calls supersede the older
   system calls of the same name without the "rt_" prefix.

It seems that in some distributions, like ours, the corresponding older non-rt 
system calls aren't supported any more and cause a compilation errors.

(4) I think a portable test for MIPS64 may be #if _MIPS_SIM == _ABI64, but I'm 
only going by what I've seen here and there.

(6) I need to investigate this issue.  What I do know is that registers 8-15, 
24 and 25 are unsaved temporaries. (ref: 
http://en.wikipedia.org/wiki/MIPS_architecture#Compiler_register_usage)

(7) I have done a bit of asm but I'm certainly no guru.  Yes, we did rely 
heavily on existing code snippets as a starting point.

----

(8) Responding the the MIPS expert's issues, thanks, this is good feedback.  I 
need to look into some of this further.  What I can say now is:

8A) And explicit "sync" is necessary on some MIPS implementations, but not on 
the R4000.  According to glibc sys/asm.h (ref: 
http://eglibc.sourcearchive.com/documentation/2.12.1/ports_2sysdeps_2mips_2sys_2
asm_8h-source.html):

/* The MIPS archtectures do not have a uniform memory model.  Particular
   platforms may provide additional guarantees - for instance, the R4000
   LL and SC instructions implicitly perform a SYNC, and the 4K promises
   strong ordering.

   However, in the absence of those guarantees, we must assume weak ordering
   and SYNC explicitly where necessary.

   Some obsolete MIPS processors may not support the SYNC instruction.  This
   applies to "true" MIPS I processors; most of the processors which compile
   using MIPS I implement parts of MIPS II.  */

#ifndef MIPS_SYNC
# define MIPS_SYNC      sync
#endif

Original comment by ray.bal...@sycamorenet.com on 30 Aug 2011 at 5:38

GoogleCodeExporter commented 9 years ago
(2) Great, thanks for tracking that down.  Does sys_rt_sigprocmask() have the 
same ABI as sys_sigprocmask()?  If so, the #define should be safe.  But would 
it be better to just provide a syscall shim for rt_sigprocmask like we do for 
sigprocmask, and protect the sigprocmask definitions with #ifdefs against 
__NR_sigprocmask or the like?  Does this make any sense?

(4) Should we replace all the calls to __MIPS64__ with that, then?  
http://gcc.gnu.org/ml/libstdc++/2003-10/msg00151.html talks about #including 
<sgidefs.h> as well; so I added that to atomicops-internals-mips.h.  I figured 
it can't hurt.  Does that make sense?

(6) OK, let me know what you find.

(8) I don't pretend to understand any of this. :-)  Once you guys have a 
complete response to the reviewer's comments that you're happy with, possibly 
with a new version of the patch, let me know and I'll pass it along.

Original comment by csilv...@gmail.com on 30 Aug 2011 at 10:45

GoogleCodeExporter commented 9 years ago
Issue 290 has been merged into this issue.

Original comment by csilv...@gmail.com on 31 Aug 2011 at 1:14

GoogleCodeExporter commented 9 years ago
A quick update:

(1) I think the decision has been made by the powers that be to sign the CLA, 
so this should be out of the way shortly.

(2) I agree it would be cleaner to define a syscall shim for this.  The ABIs 
are different for sys_rt_sigprocmask() and sys_sigprocmask() -- the rt_ version 
takes an third argument "size_t sigsetsize" -- but the shim can take care of 
that.

(4) The discussions on this are a bit confusing to me, and they seem to get 
"religious", if you will.  What I got out of them is that at least one guy is 
recommending to *not* rely on <sgidefs.h> and instead go directly to the 
compiler-supplied _ABI* definitions, which happens to be what I suggested 
earlier.  Here is the clearest expression of this, I think:  
http://www.cygwin.com/ml/libc-alpha/2004-11/msg00054.html

This position actually makes sense to me since <sgidefs.h> is probably an 
SGI-specific file, and would not be required to be defined by other MIPS 
implementations (?).  If anybody has some insight here, please comment, because 
I'm basically guessing at this.

-----
Still need to look into the other items.

Thanks again.

Original comment by ray.bal...@sycamorenet.com on 1 Sep 2011 at 6:33

GoogleCodeExporter commented 9 years ago
OK, thanks.  I'll just sit tight and wait for an updated patch.

Original comment by csilv...@gmail.com on 1 Sep 2011 at 7:36

GoogleCodeExporter commented 9 years ago
btw, reading the thread, it looks like _MIPS_SIM is gcc-specific.  I'm not sure 
if that's a problem for us or not; I don't know what compilers folks use to 
build on mips.  I guess we can just do something that works for you and see if 
anyone else complains.

Original comment by csilv...@gmail.com on 1 Sep 2011 at 7:41

GoogleCodeExporter commented 9 years ago
Good point.  On second thought I'm going to reverse my position regarding 
<sgidefs.h>, since linux_syscall_support.h already relies heavily on those 
definitions, and I'm not in the mood to rewrite all the conditionals.  Instead, 
lets eliminate __MIPS64__ and use _MIPS_SIM == _MIPS_SIM_ABI64/ABI32 
consistently.

If someone has a MIPS system that doesn't have sigdefs.h, then there will be 
problems anyway.

Original comment by ray.bal...@sycamorenet.com on 1 Sep 2011 at 9:39

GoogleCodeExporter commented 9 years ago
BTW, I'm still working on this but it does appear that the change to the 
register usage ($25 vs $8) in linux_syscall_support.h is probably incorrect.  I 
believe that the actual problem is an asm clobber list hard-coded for the O32 
ABI, where only registers $4 through $7 contain function arguments, and 
remaining arguments need to be passed on the stack.  For N32 and N64 ABIs, 
arguments are passed in $4 through $11.  I'll sort this out.

Original comment by ray.bal...@sycamorenet.com on 1 Sep 2011 at 9:48

GoogleCodeExporter commented 9 years ago
Proposed changes for linux_syscall_support.h related to the clobber list 
follow.  Could the syscall author please comment before I pull this in?  Thanks.

$ diff src/base/linux_syscall_support.h.orig src/base/linux_syscall_support.h
1846a1847,1857
> 
>     #if _MIPS_SIM == _MIPS_SIM_ABI32
>     // See http://sources.redhat.com/ml/libc-alpha/2004-10/msg00050.html
>     // or http://www.linux-mips.org/archives/linux-mips/2004-10/msg00142.html
>     #define MIPS_SYSCALL_CLOBBERS "$1", "$3", "$8", "$9", "$10", "$11", 
"$12",\
>                                 "$13", "$14", "$15", "$24", "$25", "memory"
>     #else
>     #define MIPS_SYSCALL_CLOBBERS "$1", "$3", "$10", "$11", "$12", "$13",     
\
>                                 "$14", "$15", "$24", "$25", "memory"
>     #endif
> 
1853,1854c1864
<                                 : "$8", "$9", "$10", "$11", "$12",            
\
<                                   "$13", "$14", "$15", "$24", "memory");      
\
---
>                                 : MIPS_SYSCALL_CLOBBERS);                     
\
1912,1913c1922
<                               : "$8", "$9", "$10", "$11", "$12",              
\
<                                 "$13", "$14", "$15", "$24", "memory");        
\
---
>                               : MIPS_SYSCALL_CLOBBERS);                       
\
1953,1954c1962
<                               : "$8", "$9", "$10", "$11", "$12",              
\
<                                 "$13", "$14", "$15", "$24", "memory");        
\
---
>                               : MIPS_SYSCALL_CLOBBERS);                       
\

Original comment by ray.bal...@sycamorenet.com on 2 Sep 2011 at 8:59

GoogleCodeExporter commented 9 years ago
Thanks!  I'll have the appropriate folks take a look.

Just to be clear, this replaces the earlier change from r8 to r25?

Original comment by csilv...@gmail.com on 8 Sep 2011 at 12:47

GoogleCodeExporter commented 9 years ago
Yes, it does.  I agree that the r8 to r25 change seems incorrect, since these 
are syscall paramters and would be passed in predefined registers.  For 
function calls, the new MIPS N32 and N64 ABIs (_MIPS_SIM_ABI64) define 
additional registers r8-r11 for parameter passing beyond the old O32 ABI 
(_MIPS_SIM_ABI32) r4-r7.  I'm deducing from the referenced code that the N32/64 
syscall uses r4-r9, and O32, r4-r7.

I'm having some issues with this patch since we ratcheted our MIPS port forward 
from 1.7 to 1.8.2 in order to provide this patch to Google. The MIPS 32 bit 
support seems to be broken, although the 64 bit is ok.  The problem is in the 
syscall area, but no version of linux_syscall_support.h we have used seems to 
work, neither the original, nor r25 version, nor the r8 version.

Unfortunately, I will have to set this effort aside for a bit to work on 
something else, so please be patient.  I will get back to it as soon as 
possible, but it may be a few weeks.

Thanks again,

Original comment by ray.bal...@sycamorenet.com on 8 Sep 2011 at 2:46

GoogleCodeExporter commented 9 years ago
Sure, take your time.  I'll hold off on submitting these until all the details 
are worked out.

Original comment by csilv...@gmail.com on 8 Sep 2011 at 9:19

GoogleCodeExporter commented 9 years ago
mips32:
r2,r4,r5,r6,r7

mips64,n32 & n64:
r2,r4,r5,r6,r7,r8,r9

so, the r8 change is incorrect on all mips64 platforms (of which the n32 abi of 
mips that we were using is one). I'll rewrite a patch for Ray to test out (just 
giving an update).

Original comment by aa...@bytheb.org on 7 Oct 2011 at 6:40

GoogleCodeExporter commented 9 years ago
Checking back in on this; I just want to make sure I understand where we stand.

I've divided up the patch into three portions internally.

atomicops-internal-mips.cc: waiting for ray to respond to feedback in comment 6 
(probably with a new patch?)

cycleclock.h: That change is all ok, and just waiting for me to push it out.  
Probably this week.

linux_syscall_support.h: Looks ok to us, but it sounds like folks aren't sure 
of its correctness yet (based on comments 19 and 21).  Next step here is a new 
version of the patch, I think.

The change to malloc_hook_mmap_linux is in, but I haven't done anything with 
the changes to linuxthreads.  I think we still have to figure out the best path 
there.

Original comment by csilv...@gmail.com on 10 Oct 2011 at 10:07

GoogleCodeExporter commented 9 years ago
That sounds good. I'll wait on ray to respond if he's been able to make headway 
on this. I'm currently working in an environment without MIPS board access.

Original comment by aa...@bytheb.org on 18 Nov 2011 at 5:10

GoogleCodeExporter commented 9 years ago
Are we still waiting on Ray?  I'm planning on making a new perftools release in 
the next couple of days; if I don't hear anything soon this will have to slip 
to a future release.

Original comment by csilv...@gmail.com on 21 Dec 2011 at 7:09

GoogleCodeExporter commented 9 years ago
Sorry, but unfortunately, I'm still tied up.  In my opinion, we are definitely 
not at a point where we can declare the MIPS support to be done:

- The last time I tested out our MIPS port on the newest version at that time 
(1.8.2) I had problems with stability.

- Reviewing some of the comment above, we have not converged on a few important 
things like syscall register usage.

- Some of the work identified above that we do agree on has not been done yet.

Original comment by ray.bal...@sycamorenet.com on 22 Dec 2011 at 2:32

GoogleCodeExporter commented 9 years ago
Ok, thanks for the update.  I won't block the release for this then.  Hopefully 
we can get it for the next version!

Original comment by csilv...@gmail.com on 22 Dec 2011 at 2:36

GoogleCodeExporter commented 9 years ago
Next version is slated for the next week or two.  Let me know if you have any 
more word on this.

Original comment by csilv...@gmail.com on 25 Jan 2012 at 11:43

GoogleCodeExporter commented 9 years ago
I just want to resurrect discussion here now that google-perftools is now 
officially gperftools and is no longer affiliated with Google. So we can now 
accept the patch without any CLA. However, all code is still subject to the New 
BSD License.

http://www.opensource.org/licenses/bsd-license.php

Original comment by chapp...@gmail.com on 2 Mar 2012 at 4:33

GoogleCodeExporter commented 9 years ago
I am currently working on a rewrite of the original patch to address some of 
the reported issues, as well as take care of any licensing issues. Ray has 
agreed to test on actual MIPS hardware, but I am stuck using an emulator. I 
will keep you informed of the progress.

I am a bit more informed on the syscall register issues, and those introduce a 
layer of added compile time complexity that I will add to the MIPS read me.

Original comment by aa...@bytheb.org on 2 Mar 2012 at 4:52

GoogleCodeExporter commented 9 years ago
Just pinging for an update here.

Regards,

Dave

Original comment by chapp...@gmail.com on 21 Apr 2012 at 6:05

GoogleCodeExporter commented 9 years ago
Any further update here? After a bit of a break I am now aggressively ploughing 
through a bunch of patch work in preparation for a release at the end of the 
month and would love to have this in.

Original comment by chapp...@gmail.com on 18 Sep 2012 at 2:37

GoogleCodeExporter commented 9 years ago
Still looking for an update here. Given the effort put into this so far it 
would be a shame for it to be lost.

Original comment by chapp...@gmail.com on 28 Oct 2012 at 2:32

GoogleCodeExporter commented 9 years ago
Sorry, I no longer have access to a MIPS board to test - and I haven't
heard anything from Ray.

Original comment by aa...@bytheb.org on 28 Oct 2012 at 2:39

GoogleCodeExporter commented 9 years ago
That is unfortunate. If the status changes please let me know.

Original comment by chapp...@gmail.com on 4 Nov 2012 at 5:23

GoogleCodeExporter commented 9 years ago
I am just porting the gperftool to tilera platform , which is a multi core mips 
processor, this patch is good , although i have not compile it , my question is 
, how to compile the project ?  any one can help me ? thanks 

Original comment by wangweim...@gmail.com on 29 May 2013 at 3:54

GoogleCodeExporter commented 9 years ago
wangweimakefile, hm, my understanding is that tilera is custom architecture not 
mips. I did have shell access to one of their manycore boxes few years ago in 
fact.

In order to have tcmalloc working (in --enable-minimal mode) you'll need to 
port atomicops support and linux syscalls support. I think.

Original comment by alkondratenko on 14 Sep 2013 at 8:18

GoogleCodeExporter commented 9 years ago
Basic mips support was merged via another contribution.

Original comment by alkondratenko on 14 Sep 2013 at 8:19