IBM / ibmichroot

A set of scripts to facilitate the use of chroot-based containers for IBM i
MIT License

fatal: Out of memory? mmap failed: No such device #30

Closed abmusse closed 7 years ago

abmusse commented 8 years ago

Original report by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


I have installed ibmichroot on a customer's machine. I created five custom chroot environments, one for each developer. Each environment is Node.js + Git.

Git worked great for a couple weeks for all developers and then two days ago it started throwing the below error message in each and every chroot environment. Including an environment that hasn't been in use since we originally created it.

fatal: Out of memory? mmap failed: No such device

I've had this issue before with other customers and it was remedied by creating chroot environments (seclusion and selection of exact binaries and libs), so now I am scratching my head on how five separate chroot environments all started getting the same Git error at the same time.

This occurs for all Git commands (e.g. git init in a new folder, git status for an existing repo, etc.) in all the chroot environments. Further, Git is not installed outside of chroot (shouldn't make a difference, but wanted to note).

Google searches have turned up a couple things** but none of them have resolved the issue.

**

Thoughts on how to further debug this?

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


I am not coding in RPG (not a focus area of mine

No problem man. I am an expert in c/C++ and RPG (and RPG free). Also good in javascript, python, php, node, ruby, perl, shell scripts, java, and many more languages you have not likely seen (i am an old dude).

Frankly, I write some things like XMLSERVICE in RPG just so RPG people feel they are not abandoned by IBM Rochester folks. Actually, I like RPG very much, but i am trained as a c/C++ developer into 400 kernel/PASE for many, many, many years.

I do enjoy this enough to make long days long enough to do something

Ok. If you open an issue in the db2sock project, i can notify via append that the SQL400Json stuff is ready to test (new toolkit).

I also want to look at the cross compiler capabilities in gcc

Oh my! I think I saw a Linux cat! (Tweety Bird cartoon).

Optional read ...

I am PASE guy, so i do everything on 400 PASE. However, my laptop(s) both work and home are Linux for over twenty years (can't hardly use Windows at all, just for Turbo Tax).

PASE gcc compile is messy ...

Yeah, yeah, we know, messy to set up 'compile environment' with perzl chroot/pkg scripts. PASE ninja turtles back at Rochester understand (*).

(*) Opinions are my own, do not reflect IBM plans or promises. See Jesse Gorzinski for IBM i Open Source plans. I am just a PASE geek Dude.

abmusse commented 7 years ago

Original comment by Chris Hird (Bitbucket: ChrisHird, GitHub: ChrisHird).


Tony

My main focus at the moment is php. I am in the process of developing a new PHP quotation system for a client so time is a little constrained, but willing to eke out as much as I can to help. I am not coding in RPG (not a focus area of mine, I can do limited RPG programming, but I am no expert!). I favor C for my development on IBM i (ILE or PASE) because most of the time I am not working on applications where RPG would give me the benefits (C is very good at the low level stuff, especially the APIs and pointer stuff).

I am OK with hacking the compiles as long as I know the parms etc. But if you have a script that would be very helpful (that's where I would go eventually).

I can start as soon as you need me to.. I may not be able to spend days at a time but I do enjoy this enough to make long days long enough to do something. I also want to look at the cross compiler capabilities in gcc so I have lots of things to squeeze into my long days :-)

I have a number of LPARs on my system with one dedicated to Open Source projects, chroot is there and all of the other items I need. It's kept up to date so should be a good testing ground.

Chris...

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Aaron, I populated new project itoolkit-generator with the Javascript mapping RPG/Free/Cobol to PHP toolkit.

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Chris, How much time do you have to play around with db2sock? Also when would you like to start?

The db2sock project ... at the moment i have only minimum JSON working as DB2 calls to Apache w/Basic Authentication (see php tests). This will work for any of your favourite languages: Node, python, php, ruby, java, even just curl. I write code in any of these languages, is php ok?

Also, to be clear the JSON will transport directly on a DB2 driver as well, but we have to modify the db2 driver for the language to call the new DB2 CLI 'semi-architecture' API SQL400Json. Aka, REST is slower, Apache and all, BUT a direct driver call through SQL400Json is fast (obviously). However, again, we would need to modify one of the drivers like php ibm_db2 to make the JSON calls directly. Again php ok (ibm_db2)?

I changed the makefile to build both ILE (RPG) and PASE c code from same make file in a chroot. I also put a copy of the pase driver on yips (link in project). However, may be difficult to compile the ILE parts without all the gmake stuff. Do you need a little script to compile this part without make???

Last, IF (big if), you have time soon ... I can switch back to finishing the SQL400Json calls to new toolkit. The new toolkit will look nothing like the old xmlservice. In fact 90% PASE c code with just a slim(ish) RPG stored procedure as target of SQL400Json API. However, i do not have this in the project yet (just on my pc and 400). SO ... if you have time ... i want to know when, so i can put all that stuff up into the project. Again, when?

abmusse commented 7 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Please put your standard LICENSE text file into the project only.

Done.

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Thanks Aaron. Please put your standard LICENSE text file into the project only. I will then clone and add my Javascript code and the little index.html (only two parts, maybe split someday).

abmusse commented 7 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Aaron, if you want to open a project in litmis, we could publish the Javascript code.

New itoolkit-generator repo. Tony, you have admin rights. Let me know who else needs them or add people yourself.

Did you want me to put the code in there or did you want to? I am willing, just didn't know if there were additional things to be aware of.

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Hey, another side note. I built a Javascript tool to help generate php toolkit calls from RPG source. Maybe take a look and see what you think. Yes, this is only PHP, but I am thinking of Open Sourcing the code to help build other toolkits (including the new json based db2sock when finished).

Both versions are anchored at XMLSERVICE -> PHP Toolkit

Old version rpg D spec only

New version cobol, rpg D spec, RPG free 2 php toolkit -- added cobol and RPG free

Aaron, if you want to open a project in litmis, we could publish the Javascript code.

Security note: Everything runs in the browser (Javascript). The customer RPG cut/paste never goes through the server. Check the new code in your browser source/debug Javascript window, you will see the html form never leaves the browser.

abmusse commented 7 years ago

Original comment by Chris Hird (Bitbucket: ChrisHird, GitHub: ChrisHird).


Sounds Good! Will attempt to get started ASAP.. Reviews will always be sensitive and considerate of the audience/possible audience. Making things better is the only goal I have :-)

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Outstanding! Your help testing, especially a healthy dose of checking whether it meets performance goals (yours), would be fantastic. Please open an issue on db2sock, you can say something like checking new functions for performance and ease of use, whatever. We can carry on an open conversation about what you liked or did not like, to everyone's benefit.

Caution: The only thing I am not allowed to do is compare this open source project to another commercial product. All fine, if you wish to do some of that compare, even write anything you like positive or negative, but we must never put good vendors or hopefully an innocent open source project into VP level war. I am just a geek, have skills, think i can help make a better world with IBM i. I am not a VP. Ok, said my piece brother IBM i guy, let's make PASE something better.

abmusse commented 7 years ago

Original comment by Chris Hird (Bitbucket: ChrisHird, GitHub: ChrisHird).


Tony

I would like to test and help where I can. Deleted last post after I found embedded link with information. Let me know where I can help.

Chris...

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Thanks Aaron.

I see Kevin already PTF'd rsync. Kevin Adler is a great new face for IBM Open Source and PASE. Incredibly talented.

Also, Jesse Gorzinski, IBM Open Source architect is doing a fantastic job trying to work a good chroot friendly path to Open Source packaging based on our yum/rpm work last year.

I hope you had time to meet with both of these guys at Common.

Unapologetic plug ... i am working on a new PASE DB2 super driver (another libdb400.a). I hope to include current db2 support (slip under), and possibly all toolkit functions JSON based. If it works (expect yes), we will have a very fast alternative to XMLSERVICE (*). I am doing in Open db2sock, so there will be no more mysteries about DB2 and PASE interactions.

(*) BTW -- I originally did XMLSERVICE as a fun RPG refresher project in my spare time. XMLSERVICE has grown way beyond original intent. I will maintain the old XMLSERVICE. However, about time to replace it with better technology anyway.

abmusse commented 7 years ago

Original comment by Chris Hird (Bitbucket: ChrisHird, GitHub: ChrisHird).


There is another option for those who still have SNA (need Object Connect installed as well) use the SAVRST command which does a save and restore in one command. I found it to be very slow so we built our own internal product to do the same things and it uses TCP/IP (So much faster).

abmusse commented 7 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


2) We tried out rsync in past, works just fine. You can find one on perzl, and, IBM will likely PTF one someday.

For the archives...

The rsync command arrived earlier this spring. Learn more here

Here's how I implemented it for a customer that had the mmap issue because of IFS Journaling:

$ cat /REPLICATE/replicate.sh
#!/QOpenSys/usr/bin/sh

SECONDS=0
mkdir -p /REPLICATE
echo "Replicating /QOpenSys/ibmichroot_spaces/git-server/repos"
rsync -a --delete /QOpenSys/ibmichroot_spaces/git-server/repos /REPLICATE

echo "Replicating /home"
rsync -a --delete /home /REPLICATE

echo "Replicating /www"
rsync -a --delete /www /REPLICATE
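
The same replication can be previewed before it deletes anything on the target. Below is a minimal Python sketch that builds the rsync commands from the script above and optionally adds rsync's --dry-run flag. The names rsync_command and replicate, and the idea of wrapping this in Python at all, are assumptions for illustration and not what runs at the customer.

```python
import subprocess

# Paths copied from replicate.sh above.
SOURCES = [
    "/QOpenSys/ibmichroot_spaces/git-server/repos",
    "/home",
    "/www",
]
TARGET = "/REPLICATE"


def rsync_command(source, target=TARGET, dry_run=False):
    """Build the argument list for one rsync run (archive mode, delete extras)."""
    command = ["rsync", "-a", "--delete"]
    if dry_run:
        # --dry-run reports what would change without copying or deleting.
        command.append("--dry-run")
    return command + [source, target]


def replicate(dry_run=True):
    """Run rsync once per source directory; defaults to a dry run."""
    for source in SOURCES:
        print("Replicating " + source)
        subprocess.check_call(rsync_command(source, dry_run=dry_run))
```

Calling replicate() with the default only reports what would change; replicate(dry_run=False) would do the real copy.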

abmusse commented 7 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Hi fellows ... few adds from a evil PASE guy.

1) The IFS journal issue is a problem for API mmap (above), and, also the AIX API shmat.

2) We tried out rsync in past, works just fine. You can find one on perzl, and, IBM will likely PTF one someday.

3) Speaking to "depth" of problem. Unfortunately, as Open Source products strive for "performance" related to files (IFS), they inevitably use memory mapped files. Worse, product may change implementation in a heart beat minor version. So, yes, i suspect journal IFS issue will 'pop up' in open source products 'randomly'.

abmusse commented 7 years ago

Original comment by Chris Hird (Bitbucket: ChrisHird, GitHub: ChrisHird).


Aaron

OK just read through the entire entry and see that journalling is going to be a problem for any PASE based application using mmap(). Not sure it's a game breaker for most as I would hope that the majority are not going to use mmap functions against files/objects that you would want to replicate. (You should not be replicating everything, that's just bad practice).

Anyhow there are a few ways around the issue that I know of and we have used at some clients with our HA4i product.

  1. Drop back down to object level replication, which is triggered by the auditing flag (this is not journalling as we know it for logical replication, and if it does affect mmap you have a bigger problem, as auditing is something most auditors are going to require). The problem with this approach is file locking; again there are alternatives to ensure files do eventually get replicated, but when the change notification is fired into the audit journal the file could still be locked by some process.
  2. Build a process around the CPY API which could copy the object to a temporary object (seems to ignore locks in IFS) and then reverse that on the target.
  3. Investigate using rsync??

So we can replicate IFS without journalling, the choices available are more than sufficient to make it a non issue in my mind.

Chris...

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


go ask your HA product expert about exact technology implementations.

Will do.

Thanks for going deep on this one. I learned a boat-load. I plan on documenting a tutorial on what we accomplished so others can learn from it.

I am marking this issue as resolved.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


I am a PASE guy (and open source), go ask your HA product expert about exact technology implementations.

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Every HA application potentially starts journal in IFS directories, which, may kill a PASE application due to mmap not allowed with IFS journal files.

My understanding is that PowerHA replicates at the iASP level without the need for journals (though journals could still exist for SYSBAS)*. I am belaboring the point because we (KrengelTech) are noticing a lot more HA/DR usage over the years (we do more than open source) and given Git uses mmap, and given Git is a near necessity in development and deployment of PASE, well, this obviously is an issue if there are zero ways to do HA.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


The mmap() function will fail with ENOTSUP if the file is journaled.

PASE gives ENODEV.

HA

Every HA application potentially starts journal in IFS directories, which, may kill a PASE application due to mmap not allowed with IFS journal files. As far as i know, the memory mapped file APIs (mmap, shmat, etc.) are the only prominent API failures in journal-my-IFS-world. IFS people clearly understand this issue, and the badness that is occurring in PASE kingdom (whining will not help).

These IBM i HA applications should come with a label like tobacco "use of this product on IFS files may kill your PASE application".

Welcome to IBM i, where administrators are king, everyone else is not. March, 8-9, 2016. The days Aaron became aware IFS journal files and mmap do not mix. Personally, i never remember, spend hours debugging (weeks for you), then remember to ask the client if they journal IFS files.
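
A quick way to check whether a particular file can be memory mapped, without dbx, is to attempt the same kind of mapping seen in the dbx trace in this thread (read-only, private). This is only a sketch: the function name try_mmap is hypothetical, and it assumes a Python with the standard mmap module is available in the chroot.

```python
import errno
import mmap
import os


def try_mmap(path):
    """Try a read-only, private mapping of path.

    Returns None when the mapping works (or the file is empty and is skipped),
    otherwise the errno name reported by the failed mmap, e.g. 'ENODEV'.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            # Nothing to map; Python's mmap rejects empty files anyway.
            return None
        try:
            mapping = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                                prot=mmap.PROT_READ)
        except OSError as err:
            return errno.errorcode.get(err.errno, str(err.errno))
        mapping.close()
        return None
    finally:
        os.close(fd)
```

Going by the notes above, a journaled IFS file would be expected to come back as ENODEV under PASE, and a file that is not journaled as None.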

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Further to my last question, I see mmap does document it doesn't work with journaling, so now I am wondering if there's a list of other APIs that have the same issue.

Also, it appears we should have been given the ENOTSUP error given the mmap docs, snippet below.

The mmap() function will fail with ENOTSUP if the file is journaled.

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Does PowerHA for IBM i have this problem? I've looked at the documentation and it appears PowerHA does replication at a lower level than journaling, though it doesn't directly call it out from what I've seen.

Is there a list of procedures, like mmap, that don't work with journaling? I reviewed a couple redbooks and sites but am coming up empty handed.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Yep-R-doodle ... you can NOT mmap a journal file. Hilarious, we just spent a week-or-so chasing a retentive IBM i administrator. These IBM i HA applications should come with a label like tobacco "use of this product on IFS files may kill your PASE application".

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Customer stopped replicating and now git init works as expected.

I am now asking customer about how the vendor(n1) does replication so we learn whether it is something to do with vendor's approach or the IBM i journal feature.

n1 - I will withhold the name to protect the (currently) innocent.

FWIW, I tried the below (not sure if I setup correctly) and was not able to reproduce error.

CRTJRNRCV JRNRCV(LIB1/DBX_JRN)

CRTJRN JRN(LIB1/DBX_JRN) JRNRCV(LIB1/DBX_JRN)

STRJRN OBJ(('/home/aaron/dbx_jrn' *INCLUDE)) JRN('/QSYS.LIB/lib1.lib/dbx_jrn.jrn') SUBTREE(*ALL)   

ENDJRN OBJ(('/home/aaron/dbx_jrn')) SUBTREE(*ALL)

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


May have found the culprit.... journaling.

Creation date/time . . . . . . . . . . :   08/03/16  16:16:44       
Last access date/time  . . . . . . . . :   08/03/16  16:16:44       
Data change date/time  . . . . . . . . :   08/03/16  16:16:44       
Attribute change date/time . . . . . . :   08/03/16  16:16:44  <-----

. . .

Auditing value . . . . . . . . . . . . :   *CHANGE    <----

 . . .

Object is currently journaled  . . . . :   Yes                   
  Current or last journal  . . . . . . :   A1IJRA                
    Library  . . . . . . . . . . . . . :   MYLIB             
  Journal images . . . . . . . . . . . :   *AFTER                
  Journal entries to be omitted  . . . :   *OPNCLOSYN       
  Last journal start date/time . . . . :   08/03/16  16:16:44     <---- same time as 'Attribute change date/time'
  Partial Transactions:                                          
    Apply journaled changes required . :   No                    
    Rollback was ended . . . . . . . . :   No                    
  Starting journal receiver for apply  :                         
    Library  . . . . . . . . . . . . . :                         
    ASP Device . . . . . . . . . . . . : 

I am going to run some tests on my system to see if that is the case.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Mmm ... i dunno ... may look at the IBM i side of the file ...

#!shell

WRKLNK OBJ('/QOpenSys/ranger/home/RANGER/dbxme/.git/config')

8     config                 STMF

                               Display Attributes

 Object . . . . . . :   /QOpenSys/ranger/home/RANGER/dbxme/.git/config

 Creation date/time . . . . . . . . . . :   03/08/16  14:48:11
 Last access date/time  . . . . . . . . :   03/08/16  14:48:11
 Data change date/time  . . . . . . . . :   03/08/16  14:48:11
 Attribute change date/time . . . . . . :   03/08/16  14:48:11

 Size of object data in bytes . . . . . :   92
 Allocated size of object . . . . . . . :   8192
 File format  . . . . . . . . . . . . . :   *TYPE2
 Size of extended attributes  . . . . . :   0
 Storage freed  . . . . . . . . . . . . :   No
 Temporary object . . . . . . . . . . . :   No
 Disk storage option  . . . . . . . . . :   *NORMAL
 Main storage option  . . . . . . . . . :   *NORMAL

 Auditing value . . . . . . . . . . . . :   *NONE

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


New output with sbrk mods to dbxme.py.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Nuts!!! Well, training you to debug c code. We forgot a few things in dbxme.py (below). I added sbrk, as this will include heap memory to go with our mmap memory (out of memory, could be heap or map).

#!python

import subprocess

# Drive dbx against git and dump what open, sbrk and mmap64 are doing.
# Python 2 syntax (print statement).
command_line = "/QOpenSys/usr/bin/dbx -d 100 /QOpenSys/usr/bin/git"
args = command_line.split()
process = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
  result = process.stdout.readline()
  if not result:
    # EOF: dbx went away without "execution completed", so do not loop forever
    break
  if result.strip():
    print result.rstrip()
  if "reading symbolic" in result:
    # dbx is ready: set instruction-level breakpoints, then run "git init"
    process.stdin.write("stopi in mmap64\n")
    process.stdin.write("stopi in open\n")
    process.stdin.write("stopi in sbrk\n")
    process.stdin.write("run init\n")
  elif "stopped in open" in result:
    # r3 holds the name of the file being opened
    process.stdin.write("print (char *)$r3\n")
    process.stdin.write("return\n")
  elif "stopped in glink.sbrk" in result:
    # r3 holds the heap increment being requested
    process.stdin.write("print $r3\n")
    process.stdin.write("return\n")
  elif "stopped in mmap" in result:
    # dump all registers so every mmap64 parameter is visible
    process.stdin.write("registers\n")
    process.stdin.write("return\n")
  elif "stopped in" in result:
    # back in the caller: r3 is the return value, then dump errno
    process.stdin.write("print $r3\n")
    process.stdin.write("0x2ff22ff8 / 4X\n")
    process.stdin.write("cont\n")
  elif "execution completed" in result:
    process.stdin.write("quit\n")
    break

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Here are the results from dbxme.py run on the machine with the Git issue.

Commentary

Also, here's the ls of the .git directory after running dbxme.py:

$ ls -all ~/gittest/.git
total 200
drwxr-sr-x    6 myuser  0             12288 Mar  8 11:58 .
drwxr-sr-x    3 myuser  0             12288 Mar  8 11:58 ..
-rw-r--r--    1 myuser  0                23 Mar  8 11:58 HEAD
drwxr-sr-x    2 myuser  0             12288 Mar  8 11:58 branches
-rw-r--r--    1 myuser  0                36 Mar  8 11:58 config
-rw-r--r--    1 myuser  0                73 Mar  8 11:58 description
drwxr-sr-x    2 myuser  0             16384 Mar  8 11:58 hooks
drwxr-sr-x    2 myuser  0             12288 Mar  8 11:58 info
drwxr-sr-x    4 myuser  0             12288 Mar  8 11:58 refs

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


We should really try dbxme.py, because the previous dump is missing the full registers at the mmap64 failure (not all parms known). In fact, I was only "guessing" the mmap file descriptor was 0xf ($r7), based on your old posts (maybe config is innocent).

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Here's the entirety of the .git directory:

$ ls -all .git/
total 208
drwxr-sr-x    6 myuser  0             12288 Mar  4 12:26 .
drwxr-sr-x    3 myuser  0             12288 Mar  4 11:56 ..
-rw-r--r--    1 myuser  0                23 Mar  4 11:56 HEAD
drwxr-sr-x    2 myuser  0             12288 Mar  4 11:56 branches
-rw-r--r--    1 myuser  0                36 Mar  4 11:56 config
-rw-r--r--    1 myuser  0                 0 Mar  4 12:26 config.lock
-rw-r--r--    1 myuser  0                73 Mar  4 11:56 description
drwxr-sr-x    2 myuser  0             16384 Mar  4 11:56 hooks
drwxr-sr-x    2 myuser  0             12288 Mar  4 11:56 info
drwxr-sr-x    4 myuser  0             12288 Mar  4 11:56 refs

We are geeks grasshopper, means, we script things when we do not want to type (make dbx a slave)

I was about to try the same with an expect script (which I am not well versed in) so the python approach is much better. I will use that from now on.

I read through all of your last post and I believe the only question you had was the results of ls, but let me know if you want more or for me to run it again with the dbxme.py.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


We are geeks grasshopper, means, we script things when we do not want to type (make dbx a slave). I am using python 2.75, so, i think 3.4 needs parentheses around print.

#!python

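import subprocess

# Drive dbx against git so nobody has to type the commands by hand.
# Python 2 syntax (print statement).
command_line = "/QOpenSys/usr/bin/dbx -d 100 /QOpenSys/usr/bin/git"
args = command_line.split()
process = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
  result = process.stdout.readline()
  if not result:
    # EOF: dbx went away without "execution completed", so do not loop forever
    break
  if result.strip():
    print result.rstrip()
  if "reading symbolic" in result:
    # dbx is ready: set instruction-level breakpoints, then run "git init"
    process.stdin.write("stopi in mmap64\n")
    process.stdin.write("stopi in open\n")
    process.stdin.write("run init\n")
  elif "stopped in open" in result:
    # r3 holds the name of the file being opened
    process.stdin.write("print (char *)$r3\n")
    process.stdin.write("return\n")
  elif "stopped in mmap" in result:
    # dump all registers so every mmap64 parameter is visible
    process.stdin.write("registers\n")
    process.stdin.write("cont\n")
  elif "stopped in" in result:
    # back in the caller: r3 is the return value
    process.stdin.write("print $r3\n")
    process.stdin.write("cont\n")
  elif "execution completed" in result:
    process.stdin.write("quit\n")
    break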

> python dbxme.py
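
For anyone on Python 3, parentheses around print are not the only change: the pipes carry bytes by default and stdin is buffered, so the script also needs text mode and an explicit flush. Here is a minimal sketch of the same driver, mirroring the markers and commands of the script above. The names DBX_COMMAND, RULES, respond and drive are made up here, and this sketch has not been run against dbx on IBM i.

```python
import subprocess

# Same dbx invocation as the script above.
DBX_COMMAND = "/QOpenSys/usr/bin/dbx -d 100 /QOpenSys/usr/bin/git"

# (marker in dbx output, commands to send); the first matching marker wins,
# in the same order as the if/elif chain in the original script.
RULES = [
    ("reading symbolic", ["stopi in mmap64", "stopi in open", "run init"]),
    ("stopped in open", ["print (char *)$r3", "return"]),
    ("stopped in mmap", ["registers", "cont"]),
    ("stopped in", ["print $r3", "cont"]),
    ("execution completed", ["quit"]),
]


def respond(line, rules=RULES):
    """Return the dbx commands to send for one line of dbx output."""
    for marker, commands in rules:
        if marker in line:
            return commands
    return []


def drive(args, rules=RULES):
    """Run the debugger and answer its output until it is told to quit."""
    process = subprocess.Popen(args, stdin=subprocess.PIPE,
                               stdout=subprocess.PIPE,
                               universal_newlines=True)  # text-mode pipes
    # iter(..., "") stops at EOF, so a dead debugger cannot cause a busy loop.
    for line in iter(process.stdout.readline, ""):
        if line.strip():
            print(line.rstrip())
        commands = respond(line, rules)
        for command in commands:
            process.stdin.write(command + "\n")
        process.stdin.flush()  # push the commands through the buffered pipe
        if "quit" in commands:
            break
    process.stdin.close()
    process.wait()
```

Usage would be drive(DBX_COMMAND.split()). Keeping the marker table separate from the loop makes it easy to add another breakpoint without touching the plumbing.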

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Mmm ... this is not out of memory error. Instead, is ENODEV, see below. Maybe you needed to take one more dbx>cont??? But in any event, what is up with /home/MYUSER/git_dbx/.git/config ???

#!shell

What does this look like???

ls -l /home/MYUSER/git_dbx/.git/config
#!shell

(dbx) print (char *)$r3
/home/MYUSER/git_dbx/.git/config 
(dbx) return
stopped in open64 at 0xde04137c ($t1)
0xde04137c (open64+0x3c) 60000000         ori   r0,r0,0x0
(dbx) print $r3
0x0000000f <-------------- good open (not -1) file descriptor is 15/0xf, forget errno below
(dbx) 0x2ff22ff8 / X
0x2ff22ff8:  00000016 <-- useless, no error above
(dbx) cont
[2] stopped in open at 0xde041960 ($t1)
0xde041960 (open)    7c0802a6        mflr   r0
(dbx) print (char *)$r3
/home/MYUSER/git_dbx/.git/config 
(dbx) return
stopped in open64 at 0xde04137c ($t1)
0xde04137c (open64+0x3c) 60000000         ori   r0,r0,0x0
(dbx) print $r3
0x00000010 <-------------- good open file descriptor is 16/0x10 (not -1), forget errno below
(dbx) 0x2ff22ff8 / X
0x2ff22ff8:  00000016 <-- useless, no error above
(dbx) cont
[1] stopped in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) print (char *)$r3
(nil) <--- we needed all registers here, but "guessing" from previous post...
  $r0:0xde2555e0  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0x00000000  
  $r4:0x00000024    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f  
 $r8:0x00000000    $r9:0x00000000   $r10:0x04131000   $r11:0x04131f30  
void *mmap64 (
($r3) addr = 0x0 (print $r3 = nil), 
($r4) len = 0x24, (36 bytes, the size of .git/config)
($r5) prot = 1, (PROT_READ)
($r6) flags = 2, (MAP_PRIVATE)
($r7) fildes = 0xf, <-- file descriptor 0xf ... /home/MYUSER/git_dbx/.git/config (above)
($r8) off = 0)
(dbx) return
stopped in . at 0x10045554 ($t1)
0x10045554 (???) 80410014         lwz   r2,0x14(r1)
(dbx) print $r3
0xffffffff <--- mmap failed  ...  
(dbx) 0x2ff22ff8 / X
0x2ff22ff8:  00000013 <-- yes error above ... ENODEV 19 - No such device
(dbx)

bash-4.3$ grep 19 /usr/include/errno.h 
#define ENODEV  19      /* No such device                       */
ENODEV The fildes parameter refers to an object that cannot be mapped, such as a terminal.
bash-4.3$ grep MAP_ /usr/include/sys/mman.h      
#define MAP_SHARED      0x1             /* share changes */
#define MAP_PRIVATE     0x2             /* changes are private */
#define MAP_FIXED       0x100           /* map addr must be exactly as specified */
#define MAP_VARIABLE    0x00            /* system can place new region */
#define MAP_FAILED      ((void *)-1)
#define MAP_FILE        0x00            /* map from a file */
#define MAP_ANONYMOUS   0x10            /* map an unnamed region */
#define MAP_ANON        0x10            /* map an unnamed region */
#define MAP_TYPE        0xf0            /* the type of the region */
bash-4.3$ grep PROT_ /usr/include/sys/mman.h
#define PROT_NONE       0               /* no access to these pages */
#define PROT_READ       0x1             /* pages can be read */
#define PROT_WRITE      0x2             /* pages can be written */
#define PROT_EXEC       0x4             /* pages can be executed */
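
The errno words in these dumps are hex, so they need converting before looking them up in errno.h. Here is a minimal Python sketch of that lookup, assuming the standard errno numbering. The helper name decode_errno_word is made up for illustration, and the message text comes from whatever system runs it, not necessarily PASE.

```python
import errno
import os


def decode_errno_word(hex_word):
    """Turn a hex word from a dbx dump (e.g. '00000013') into errno details."""
    number = int(hex_word, 16)
    # errno.errorcode maps numbers to symbolic names on the machine running this.
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


if __name__ == "__main__":
    # Values copied from the dbx output above.
    for word in ("00000016", "00000013"):
        number, name, text = decode_errno_word(word)
        print("0x%s -> %d %s (%s)" % (word, number, name, text))
```

Reading the word as decimal instead of hex is an easy slip: 13 decimal is a different error from 0x13.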

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


I believe it is right here. Specifically, /home/MYUSER/git_dbx/.git/config.

What am I missing?

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


errno after 'return' from mmap64, though I assume it relates to previous open attempt?

You did not include the file opened information in your cut/paste. I am assuming the following sequence is needed before we can speculate ...

#!shell

dbx> cont
stop in open
dbx> print $r3
/this/file/caused/issue/with/mmap64 <--- missing from your cut/paste
dbx> cont
stop in mmap64
dbx> return
dbx> print $r3
-1 or 0xffffffff

abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Ok, I ran it again and did the addtl print (char *)$r3 etc stuff. The log was big so I created a snippet.

The final errno is 13 (permission denied), though leading up to it we have the following...

0x2ff22ff8:  00000002  <-- "not found" (expected)
0x2ff22ff8:  00000011  <-- "try again" (not sure if expected, may be beginning of issue)
0x2ff22ff8:  00000016  <-- "resource busy" (happens on .git/config and .git/config.lock
0x2ff22ff8:  00000013  <-- "permission denied" (errno after 'return' from mmap64, though I assume it relates to previous open attempt?)

Going down the 'permission denied' hole, here's the current state of the .git directory.

bash-4.3$ ls -all .git/
total 208
drwxr-sr-x    6 myuser  0             12288 Mar  4 12:26 .
drwxr-sr-x    3 myuser  0             12288 Mar  4 11:56 ..
-rw-r--r--    1 myuser  0                23 Mar  4 11:56 HEAD
drwxr-sr-x    2 myuser  0             12288 Mar  4 11:56 branches
-rw-r--r--    1 myuser  0                36 Mar  4 11:56 config
-rw-r--r--    1 myuser  0                 0 Mar  4 12:26 config.lock
-rw-r--r--    1 myuser  0                73 Mar  4 11:56 description
drwxr-sr-x    2 myuser  0             16384 Mar  4 11:56 hooks
drwxr-sr-x    2 myuser  0             12288 Mar  4 11:56 info
drwxr-sr-x    4 myuser  0             12288 Mar  4 11:56 refs

Both config and config.lock are rw- for the owner. Same permissions exist on a machine where Git works so I wonder if "permission denied" is a misnomer.

Guess: The "resource busy" is because one Git "thread" created config.lock and a subsequent "thread" is trying to gain access to it? Or in short, a race condition?
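
On the lock-file guess: as far as I know, Git takes config.lock by creating it exclusively, so a leftover config.lock makes the next attempt fail straight away with EEXIST rather than wait. Read as hex, 0x11 is 17, which is EEXIST on the usual numbering. Here is a small Python sketch of exclusive creation; it illustrates the pattern and is not Git's code, and take_lock is a hypothetical name.

```python
import errno
import os


def take_lock(lock_path):
    """Create lock_path exclusively; return None on success or the errno name."""
    try:
        # O_EXCL together with O_CREAT fails if the file already exists.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except OSError as err:
        return errno.errorcode.get(err.errno, str(err.errno))
    os.close(fd)
    return None
```

With this pattern a second caller does not block; it gets EEXIST immediately, so a stale lock file left behind by a failed run keeps producing the same error until it is removed.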

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


As long as we are re-doing, let's gather some more information about what is going on, find out the name of the file causing the problem.

#!shell

bash-4.3$ dbx -d 100 /QOpenSys/usr/bin/git
Type 'help' for help.
reading symbolic information ...
(dbx) stopi in open  <--- add a stop in open
[1] stopi in open
(dbx) stopi in mmap64 <-- also our failure location 
[2] stopi in mmap64
(dbx) run init
[1] stopped in open at 0xd52fe6e0 ($t1)
0xd52fe6e0 (open)    7c0802a6        mflr   r0
(dbx) print (char *)$r3  <--- register 3 holds the name of the file to be opened
/unix 
(dbx) return
stopped in open64 at 0xd52fe0fc ($t1)
0xd52fe0fc (open64+0x3c) 60000000         ori   r0,r0,0x0
(dbx) print $r3 <--- register 3 has the return code -1 (0xffffffff), failed, but we are not done
0xffffffff 
(dbx) 0x2ff22ff8 / X <-- display errno in hex (convert to decimal, then look it up in /usr/include/errno.h)
0x2ff22ff8:  00000002
(dbx) cont <--- next open file, aka, we did not fail mmap64, so keep going ... and going ... until mmap64 fails
[1] stopped in open at 0xd52fe6e0 ($t1)
0xd52fe6e0 (open)    7c0802a6        mflr   r0
(dbx) print (char *)$r3 <--- name of the file to be open ... so on
/dev/null 
(dbx) return
stopped in open64 at 0xd52fe0fc ($t1)
0xd52fe0fc (open64+0x3c) 60000000         ori   r0,r0,0x0
(dbx) print $r3        
0x0000000e 
(dbx) cont
[1] stopped in open at 0xd52fe6e0 ($t1)
0xd52fe6e0 (open)    7c0802a6        mflr   r0
(dbx) print (char *)$r3
/opt/freeware/lib/charset.alias 
(dbx) return
stopped in localcharset.get_charset_aliases [/opt/freeware/lib/libiconv.a] at 0xd5809a60 ($t1)
0xd5809a60 (get_charset_aliases+0x108) 80410014         lwz   r2,0x14(r1)
(dbx) print $r3                
0xffffffff 
(dbx) 0x2ff22ff8 / X
0x2ff22ff8:  00000002
(dbx)
abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


You failed with 0xffffffff, but you did not dump the errno.

#!shell

(dbx) return
stopped in . at 0x10045554 ($t1)
0x10045554 (???) 80410014         lwz   r2,0x14(r1)
(dbx) registers
  $r0:0x00003608  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0xffffffff    <----   -1 right out of the gates, though no "Out of memory" error (yet)
  $r4:0x00000024    $r5:0x00000017    $r6:0x3003d5f8    $r7:0x073939d8  
  $r8:0x80558000    $r9:0x86203089   $r10:0x09a5e000   $r11:0x09a5ef30  
 $r12:0x10045554   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x3003d1f0   $r27:0x00000000  
 $r28:0x101a8f78   $r29:0x3003d1f0   $r30:0x30009be8   $r31:0x00000024  
 $iar:0x10045554   $msr:0x0002f032    $cr:0x86203089  $link:0x10045554  
 $ctr:0xffffffff   $xer:0xf4ffffff    $mq:0x00000000  
          Condition status = 0:l 1:ge 2:e 4:eo 6:l 7:lo 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in . at 0x10045554 ($t1)
0x10045554 (???) 80410014         lwz   r2,0x14(r1)

=====
like this
=====

(dbx) 0x2ff22ff8 / X
xxxxxxxxx <- errno in hex 
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Ok, I've updated the previous comment to include return in the instructions.

For others following along, you can read this Stack Overflow post to understand 0xffffffff.

Here's the new log that includes invoking return:

$ dbx -d 100 /opt/freeware/bin/git
Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

(dbx) stopi in mmap64
[1] stopi in mmap64
(dbx) run init
[1] stopped in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) return
stopped in . at 0x10045554 ($t1)
0x10045554 (???) 80410014         lwz   r2,0x14(r1)
(dbx) registers
  $r0:0x00003608  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0xffffffff    <----   -1 right out of the gates, though no "Out of memory" error (yet)
  $r4:0x00000024    $r5:0x00000017    $r6:0x3003d5f8    $r7:0x073939d8  
  $r8:0x80558000    $r9:0x86203089   $r10:0x09a5e000   $r11:0x09a5ef30  
 $r12:0x10045554   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x3003d1f0   $r27:0x00000000  
 $r28:0x101a8f78   $r29:0x3003d1f0   $r30:0x30009be8   $r31:0x00000024  
 $iar:0x10045554   $msr:0x0002f032    $cr:0x86203089  $link:0x10045554  
 $ctr:0xffffffff   $xer:0xf4ffffff    $mq:0x00000000  
          Condition status = 0:l 1:ge 2:e 4:eo 6:l 7:lo 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in . at 0x10045554 ($t1)
0x10045554 (???) 80410014         lwz   r2,0x14(r1)
(dbx) cont
[1] stopped in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) return
stopped in . at 0x100455a0 ($t1)
0x100455a0 (???) 80410014         lwz   r2,0x14(r1)
(dbx) registers
  $r0:0x00003608  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0xffffffff  
  $r4:0x00000024    $r5:0x00000017    $r6:0x3003d5f8    $r7:0x073992f0  
  $r8:0x80556000    $r9:0x33203089   $r10:0x09a5e000   $r11:0x09a5ef30  
 $r12:0x100455a0   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x3003d1f0   $r27:0x00000000  
 $r28:0x101a8f78   $r29:0xffffffff   $r30:0xffffffff   $r31:0x00000024  
 $iar:0x100455a0   $msr:0x0002f032    $cr:0x33203089  $link:0x100455a0  
 $ctr:0xffffffff   $xer:0xf4ffffff    $mq:0x00000000  
          Condition status = 0:eo 1:eo 2:e 4:eo 6:l 7:lo 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in . at 0x100455a0 ($t1)
0x100455a0 (???) 80410014         lwz   r2,0x14(r1)
(dbx) cont
fatal: Out of memory? mmap failed: No such device

execution completed (exit code 128)
(dbx) return
cannot continue execution
(dbx) registers
  $r0:0x00000000  $stkp:0x2ff21f10   $toc:0xf193ea50    $r3:0x00000080  
  $r4:0x300356e8    $r5:0x00000008    $r6:0xffff8006    $r7:0x00000000  
  $r8:0x10546033    $r9:0x10546033   $r10:0x09a5e000   $r11:0x00000000  
 $r12:0xde08783c   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x00000000   $r27:0x300064fc  
 $r28:0x00000000   $r29:0xf18d8bc8   $r30:0xf19609a8   $r31:0xffffffff  
 $iar:0xde090750   $msr:0x0002f032    $cr:0x28203086  $link:0xde087848  
 $ctr:0xd62c2f00   $xer:0x04000002    $mq:0x00000000  
          Condition status = 0:e 1:l 2:e 4:eo 6:l 7:ge 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in . at 0xde090750 ($t1)
0xde090750 (_exit)    81820b4c         lwz   r12,0xb4c(r2)
(dbx)
abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Almost c-code student, but, you are stopping at entrance to mmap64, aka, nothing interesting in registers until exit of mmap64 (after it runs). You need to add the dbx command return (see below). BTW -- my stepi instructions until back to caller also work, but are much less elegant than dbx return.

#!shell
[adc@oc7083008330 ~]$ ssh -X ranger@lp0364d
Welcome to LP0364D.rchland.ibm.com
$ bash
bash-4.3$ which git
/QOpenSys/usr/bin/git
bash-4.3$ dbx -d 100 /QOpenSys/usr/bin/git
Type 'help' for help.
reading symbolic information ...
(dbx) stopi in mmap64
[1] stopi in mmap64
(dbx) run init
[1] stopped in mmap64 at 0xd5512360 ($t1)
0xd5512360 (mmap64)    7c0802a6        mflr   r0
(dbx) return <--- need to actually run mmap64, then stop back in caller
stopped in . at 0x10035d40 ($t1)
0x10035d40 (???) 80410014         lwz   r2,0x14(r1)
(dbx) registers
  $r0:0x00003608  $stkp:0x2ff22560   $toc:0xf11ff7d0    $r3:0xb0000000  <-- mmap64 address
  $r4:0x00000024    $r5:0x00000000    $r6:0x00000000    $r7:0x00000008  
  $r8:0x80557000    $r9:0x22203089   $r10:0x08ef4000   $r11:0x08ef4f30  
 $r12:0x10035d40   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d48  
 $r16:0x30043c10   $r17:0x10187dec   $r18:0x00000005   $r19:0x10187e64  
 $r20:0x10187e5c   $r21:0x2ff229a0   $r22:0x10187e3c   $r23:0x10187e48  
 $r24:0x00000000   $r25:0x00000024   $r26:0x300434f0   $r27:0x00000000  
 $r28:0x10187ce8   $r29:0x300434f0   $r30:0x3000cbd0   $r31:0x00000024  
 $iar:0x10035d40   $msr:0x0002f032    $cr:0x22203089  $link:0x10035d40  
 $ctr:0xb0000000   $xer:0xb4000000    $mq:0x00000000  
          Condition status = 0:e 1:e 2:e 4:eo 6:l 7:lo 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in . at 0x10035d40 ($t1)
0x10035d40 (???) 80410014         lwz   r2,0x14(r1)
(dbx)
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Below is a dbx session of the failed git attempt.

Above you focused on register $r3 but it appears $r29 and $r30 are where it falls apart for me. How does one know which register is the return code for mmap64?

Also, do you have a good resource for me to read that would give me more insight into why I am looking for particular things? I am reading the AIX docs, but those tell me how to use the commands rather than what to watch for when diagnosing issues. A lot of the other dbx docs are Oracle based, and I am hesitant to pursue those because I don't know whether they are in the same vein as AIX or not.

$ mkdir git_dbx
$ cd git_dbx/
$ export PATH=/opt/freeware/bin:$PATH
$ export LIBPATH=/opt/freeware/lib
$ dbx -d 100 /opt/freeware/bin/git
Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

(dbx) stopi in mmap64
[1] stopi in mmap64
(dbx) run init
[1] stopped in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) registers
  $r0:0xde2555e0  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0x00000000  
  $r4:0x00000024    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f  
  $r8:0x00000000    $r9:0x00000000   $r10:0x04131000   $r11:0x04131f30  
 $r12:0xf193aecc   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x3003d1f0   $r27:0x00000000  
 $r28:0x101a8f78   $r29:0x3003d1f0   $r30:0x30009be8   $r31:0x00000024  
 $iar:0xde2555e0   $msr:0x0002f032    $cr:0x84203089  $link:0x10045554  
 $ctr:0xde2555e0   $xer:0x04000000    $mq:0x00000000  
          Condition status = 0:l 1:g 2:e 4:eo 6:l 7:lo 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) cont
[1] stopped in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) registers
  $r0:0xde2555e0  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0x00000000  
  $r4:0x00000024    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f  
  $r8:0x00000000    $r9:0x00000000   $r10:0x04131000   $r11:0x04131f30  
 $r12:0xf193aecc   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x3003d1f0   $r27:0x00000000  
 $r28:0x101a8f78   $r29:0xffffffff   $r30:0xffffffff   $r31:0x00000024              <---------- Falls apart?
 $iar:0xde2555e0   $msr:0x0002f032    $cr:0x33203089  $link:0x100455a0  
 $ctr:0xde2555e0   $xer:0xf4ffffff    $mq:0x00000000  
          Condition status = 0:eo 1:eo 2:e 4:eo 6:l 7:lo 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0xde2555e0 ($t1)
0xde2555e0 (mmap64)    7c0802a6        mflr   r0
(dbx) cont
fatal: Out of memory? mmap failed: No such device

execution completed (exit code 128)
(dbx) registers
  $r0:0x00000000  $stkp:0x2ff21f10   $toc:0xf193ea50    $r3:0x00000080  
  $r4:0x300356e8    $r5:0x00000008    $r6:0xffff8006    $r7:0x00000000  
  $r8:0x105aa04f    $r9:0x105aa04f   $r10:0x04131000   $r11:0x00000000  
 $r12:0xde08783c   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d18  
 $r16:0x3003d230   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4  
 $r20:0x101a90ec   $r21:0x2ff22960   $r22:0x101a90cc   $r23:0x101a90d8  
 $r24:0x00000000   $r25:0x00000024   $r26:0x00000000   $r27:0x300064fc  
 $r28:0x00000000   $r29:0xf18d8bc8   $r30:0xf19609a8   $r31:0xffffffff  
 $iar:0xde090750   $msr:0x0002f032    $cr:0x28203086  $link:0xde087848  
 $ctr:0xd62c2f00   $xer:0x04000002    $mq:0x00000000  
          Condition status = 0:e 1:l 2:e 4:eo 6:l 7:ge 
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0xde090750 ($t1)
0xde090750 (_exit)    81820b4c         lwz   r12,0xb4c(r2)
(dbx) 
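
On the question of which register to read: under the AIX PowerPC calling convention the first integer arguments arrive in r3 through r10, and the return value comes back in r3. A dump taken at the entrance to mmap64 (as above) therefore shows the arguments, and only a dump taken after `return` shows the result. Below is a minimal Python sketch that decodes the entry dump. `parse_registers` and `mmap64_args` are hypothetical helper names; the prot/flags meanings are assumed to match sys/mman.h, and the 64-bit offset is assumed to be passed as r8 (high word) and r9 (low word) in a 32-bit process.

```python
import re

# Matches tokens such as '$r3:0x00000000' or '$stkp:0x2ff22520'.
REG = re.compile(r"\$(\w+):0x([0-9a-fA-F]+)")

# First three lines of the entry dump quoted in this thread.
SAMPLE = """
  $r0:0xde2555e0  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0x00000000
  $r4:0x00000024    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f
  $r8:0x00000000    $r9:0x00000000   $r10:0x04131000   $r11:0x04131f30
"""


def parse_registers(dump):
    """Map register name (without '$') to its integer value."""
    return {name: int(value, 16) for name, value in REG.findall(dump)}


def mmap64_args(regs):
    """Label r3..r9 as mmap(addr, len, prot, flags, fd, off) arguments."""
    return {
        "addr": regs["r3"],
        "len": regs["r4"],
        "prot": regs["r5"],    # ASSUMPTION: 1 is PROT_READ
        "flags": regs["r6"],   # ASSUMPTION: 2 is MAP_PRIVATE
        "fd": regs["r7"],
        # ASSUMPTION: off64_t is split across r8 (high) and r9 (low).
        "offset": (regs["r8"] << 32) | regs["r9"],
    }


if __name__ == "__main__":
    print(mmap64_args(parse_registers(SAMPLE)))
```

Read this way, the failing call asks to map 0x24 (36) bytes of file descriptor 15 at offset 0, which is consistent with Git mapping the 36-byte .git/config shown in the directory listing.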
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Sounds good. I've updated the previous post.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


You need registers at every stop, so we can see where/when it starts to go wrong. No shortcuts, grasshopper, and remember to write down the addresses returned from mmap until 0xffffffff is returned (-1).

#!shell

dbx>cont
dbx>registers
dbx>cont
dbx>registers
... so on ...
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


I've updated my previous post to include registers at the end.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Reminder -- we are looking for why git throws the error 'out of memory', so remember to record $r3 mmap64 locations. We want to see if git chews up all the memory ...

#!shell

(dbx) registers
  $r0:0x00003608  $stkp:0x2ff22b80   $toc:0x00415e7d    $r3:0x30000000  <- $r3 is where the file was mapped; 0xffffffff means failure
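
Recording $r3 at every stop by hand is error-prone, so the bookkeeping can be sketched in Python. This is a minimal sketch, assuming the whole dbx session has been pasted into one string and that every `registers` dump in it was taken after `return` (a dump taken at the entrance to mmap64 shows the first argument in $r3, not the result). `mmap_returns` and `first_failure` are hypothetical helper names.

```python
import re

# '$r3:' followed by a hex word; '$r30:' and '$r31:' do not match because the
# character after '$r3' must be a colon.
R3 = re.compile(r"\$r3:0x([0-9a-fA-F]+)")

MAP_FAILED = 0xFFFFFFFF  # (void *)-1 as seen in a 32-bit process


def mmap_returns(transcript):
    """Collect the $r3 value from every register dump, in order."""
    return [int(word, 16) for word in R3.findall(transcript)]


def first_failure(values):
    """Index of the first mmap64 call that returned -1, or None."""
    for index, value in enumerate(values):
        if value == MAP_FAILED:
            return index
    return None


if __name__ == "__main__":
    session = """
      $r0:0x00003608  $stkp:0x2ff22560   $toc:0xf11ff7d0    $r3:0xb0000000
      $r0:0x00003608  $stkp:0x2ff22520   $toc:0xf193ea50    $r3:0xffffffff
    """
    values = mmap_returns(session)
    print([hex(v) for v in values], "first failure at stop", first_failure(values))
```

If the very first stop already reports 0xffffffff, Git has not had a chance to use up its address space, and the errno at that stop is the more useful clue.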
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Now we're cooking with oil. Below are the commands I run.

export PATH=/opt/freeware/bin:$PATH
export LIBPATH=/opt/freeware/lib
dbx -d 100 /opt/freeware/bin/git
stopi in mmap64
run init
return     <------ Loop through return, registers, cont commands until error occurs.  
registers
cont
print &errno        <----- now get errno and use the resulting value on next line
0x2ff22ff8 / 10X      <-- dump errno (the first four hex bytes are the error number; see /usr/include/errno.h)

Here's what the full session looks like.

$ ssh -o ServerAliveInterval=5 aaron@ibmi
% export PATH=/opt/freeware/bin:$PATH
export LIBPATH=/opt/freeware/lib
dbx -d 100 /opt/freeware/bin/git
Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

(dbx) stopi in mmap64
[1] stopi in mmap64
(dbx) run init
[1] stopped in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) registers
  $r0:0x20280400  $stkp:0x2ff22530   $toc:0x207aba88    $r3:0x00000000
  $r4:0x00000024    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f
  $r8:0x00000000    $r9:0x00000000   $r10:0x078f8000   $r11:0x078f8f30
 $r12:0x207a7a3c   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d20
 $r16:0x3003c030   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4
 $r20:0x101a90ec   $r21:0x2ff22970   $r22:0x101a90cc   $r23:0x101a90d8
 $r24:0x00000000   $r25:0x00000024   $r26:0x3003bff0   $r27:0x00000000
 $r28:0x101a8f78   $r29:0x3003bff0   $r30:0x30009be8   $r31:0x00000024
 $iar:0x20280400   $msr:0x0002f032    $cr:0x84203088  $link:0x10045554
 $ctr:0x20280400   $xer:0x04000000
          Condition status = 0:l 1:g 2:e 4:eo 6:l 7:l
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) cont
[1] stopped in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) registers
  $r0:0x20280400  $stkp:0x2ff22530   $toc:0x207aba88    $r3:0x00000000
  $r4:0x00000035    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f
  $r8:0x00000000    $r9:0x00000000   $r10:0x078f8000   $r11:0x078f8f30
 $r12:0x207a7a3c   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d20
 $r16:0x3003c1d0   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4
 $r20:0x101a90ec   $r21:0x2ff22970   $r22:0x101a90cc   $r23:0x101a90d8
 $r24:0x00000000   $r25:0x00000035   $r26:0x3003bff0   $r27:0x00000000
 $r28:0x101a8f80   $r29:0x3003bff0   $r30:0x30009be8   $r31:0x00000035
 $iar:0x20280400   $msr:0x0002f032    $cr:0x86203088  $link:0x10045554
 $ctr:0x20280400   $xer:0x04000000
          Condition status = 0:l 1:ge 2:e 4:eo 6:l 7:l
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) cont
[1] stopped in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) registers
  $r0:0x20280400  $stkp:0x2ff22530   $toc:0x207aba88    $r3:0x00000000
  $r4:0x00000043    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f
  $r8:0x00000000    $r9:0x00000000   $r10:0x078f8000   $r11:0x078f8f30
 $r12:0x207a7a3c   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d20
 $r16:0x3003c650   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4
 $r20:0x101a90ec   $r21:0x2ff22970   $r22:0x101a90cc   $r23:0x101a90d8
 $r24:0x00000000   $r25:0x00000043   $r26:0x3003c610   $r27:0x00000000
 $r28:0x101a8f78   $r29:0x3003c610   $r30:0x30009be8   $r31:0x00000043
 $iar:0x20280400   $msr:0x0002f032    $cr:0x86203088  $link:0x10045554
 $ctr:0x20280400   $xer:0x04000000
          Condition status = 0:l 1:ge 2:e 4:eo 6:l 7:l
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) cont
[1] stopped in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) registers
  $r0:0x20280400  $stkp:0x2ff22530   $toc:0x207aba88    $r3:0x00000000
  $r4:0x0000005c    $r5:0x00000001    $r6:0x00000002    $r7:0x0000000f
  $r8:0x00000000    $r9:0x00000000   $r10:0x078f8000   $r11:0x078f8f30
 $r12:0x207a7a3c   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d20
 $r16:0x3003c610   $r17:0x101a907c   $r18:0x00000005   $r19:0x101a90f4
 $r20:0x101a90ec   $r21:0x2ff22970   $r22:0x101a90cc   $r23:0x101a90d8
 $r24:0x00000000   $r25:0x0000005c   $r26:0x3003bff0   $r27:0x00000000
 $r28:0x101a8f78   $r29:0x3003bff0   $r30:0x30009be8   $r31:0x0000005c
 $iar:0x20280400   $msr:0x0002f032    $cr:0x86203088  $link:0x10045554
 $ctr:0x20280400   $xer:0x04000000
          Condition status = 0:l 1:ge 2:e 4:eo 6:l 7:l
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0x20280400 ($t1)
0x20280400 (mmap64)    7c0802a6        mflr   r0
(dbx) cont
Initialized empty Git repository in /home/aaron/git_dbx/.git/

execution completed
(dbx) registers
  $r0:0x00000000  $stkp:0x2ff22b00   $toc:0x207aba88    $r3:0x00000000
  $r4:0x30034384    $r5:0x00000008    $r6:0x00000006    $r7:0x00000000
  $r8:0x100ce013    $r9:0x100ce013   $r10:0x078f8000   $r11:0x00000000
 $r12:0x2007c3bc   $r13:0xdeadbeef   $r14:0x00000002   $r15:0x2ff22d20
 $r16:0x2ff22d2c   $r17:0x00000000   $r18:0xdeadbeef   $r19:0xdeadbeef
 $r20:0xdeadbeef   $r21:0xdeadbeef   $r22:0xdeadbeef   $r23:0xdeadbeef
 $r24:0xdeadbeef   $r25:0x2073a1c0   $r26:0x20739f20   $r27:0x00000000
 $r28:0x00000000   $r29:0x300064fc   $r30:0x207ccf54   $r31:0xffffffff
 $iar:0x200828a8   $msr:0x0002f032    $cr:0x28200086  $link:0x2007c3c8
 $ctr:0x20425d00   $xer:0x04000002
          Condition status = 0:e 1:l 2:e 6:l 7:ge
        [unset $noflregs to view floating point registers]
        [unset $novregs to view vector registers]
in mmap64 at 0x200828a8 ($t1)
0x200828a8 (_exit)    81820948         lwz   r12,0x948(r2)

I will now have this run on the machine where Git isn't working. Stay tuned.

abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


Uf Da!!!! Silly AIX OFF_MAX large file workarounds bite average user (again), try stopi in mmap64 (below). Yes, misleading, because your git program is still 32 bit ... argh ... AIX OFF_MAX monkey business.

#!shell

bash-4.3$ dbx /QOpenSys/usr/bin/git      
Type 'help' for help.
reading symbolic information ...
(dbx) stopi in mmap64
[1] stopi in mmap64
(dbx) where
__start() at 0x10000128
(dbx) 
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


See the line below marked with "<-------".

% echo $PATH
/opt/freeware/bin:/QOpenSys/usr/bin:/usr/ccs/bin:/QOpenSys/usr/bin/X11:/usr/sbin:.:/usr/bin:/home/AARON/bin
% echo $LIBPATH
/opt/freeware/lib
% dbx -d 100 /opt/freeware/bin/git
Type 'help' for help.
reading symbolic information ...warning: no source compiled with -g

(dbx) stopi in mmap
"mmap" is not a subprogram   <------- Guessing this is a problem.  Have I set my LIBPATH correctly?
(dbx) run init
Initialized empty Git repository in /home/aaron/git_dbx/.git/

execution completed
(dbx)
abmusse commented 8 years ago

Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).


No, no, no, bad grasshopper, you give up too soon.

Further to previous post, it appears dbx needs git to be compiled with the -g option

Forget -g, is ONLY source level debug, aka, wimps that need source code debugging. We are teaching 'real man' skills here, no bloody source c code, only binary level assembler debugging.

cannot read git

You need the full path to git (relative will not work), again, manly debugging grasshopper. Also make very sure the LIBPATH is set correctly, because these objects actually are relative.

#!shell

dbx -d 100 /opt/freeware/bin/git
... so on ...
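
The two preconditions above (a full path to the binary and a LIBPATH that is actually set) can be checked before dbx is started. Below is a minimal Python sketch; `dbx_preflight` is a hypothetical helper, and it only checks that LIBPATH is non-empty, not that it points at the right libraries.

```python
import posixpath  # PASE paths are Unix-style regardless of where this runs


def dbx_preflight(target, env):
    """Return a list of problems that would make the dbx session misleading."""
    problems = []
    if not posixpath.isabs(target):
        problems.append("target is a relative path; pass the full path to git")
    if not env.get("LIBPATH"):
        problems.append("LIBPATH is not set; shared objects may resolve elsewhere")
    return problems


if __name__ == "__main__":
    print(dbx_preflight("git", {}))
    print(dbx_preflight("/opt/freeware/bin/git", {"LIBPATH": "/opt/freeware/lib"}))
```

In a real session `env` would be `os.environ`; a plain dict is used here so the check can be tried on any machine.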
abmusse commented 8 years ago

Original comment by Aaron Bartell (Bitbucket: aaronbartell, GitHub: aaronbartell).


Further to previous post, it appears dbx needs git to be compiled with the -g option. I reviewed Perzl's build docs and it doesn't appear he uses it.