Closed sarveshtamba closed 4 years ago
@headius any inputs on this one?
It looks to me like you'll need to update LinuxPOSIX's section on syscalls to handle the PPC64LE ABI in order to handle the ioprio_set case. As for the file tests, it might be that jnr-constants has different values for Linux than PPC64LE uses. I ran into this once with SPARCv9. That PR was unfortunately never merged due to staleness, but you could use it as a template for introducing a new platform to jnr-constants (if needed).
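For context on why a per-ABI entry is needed: glibc ships no wrapper for the ioprio calls, so callers go through the raw syscall interface, and the syscall number comes from the architecture's own headers. The following is a minimal C sketch, not jnr-posix code; the helper names ioprio_set_number and current_ioprio are made up for illustration, and IOPRIO_WHO_PROCESS is copied from the kernel's linux/ioprio.h.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* From the kernel's linux/ioprio.h; copied here because glibc does not
   provide a header or wrapper for the ioprio calls. */
#define IOPRIO_WHO_PROCESS 1

/* Hypothetical helper: the syscall number that this architecture's
   headers assign to ioprio_set. It is not the same on every ABI. */
long ioprio_set_number(void) {
    return (long) SYS_ioprio_set;
}

/* Hypothetical helper: query the calling thread's I/O priority through
   the raw syscall. Returns the packed priority value, or -1 with errno
   set if the call fails. */
long current_ioprio(void) {
    return syscall(SYS_ioprio_get, IOPRIO_WHO_PROCESS, 0);
}
```

Printing ioprio_set_number() on an x86_64 machine and on a ppc64le machine should show two different values, which is the kind of table a binding that hard-codes syscall numbers has to carry for each platform.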
As with https://github.com/jnr/jnr-ffi/issues/200 we do not have access to a ppc64le test environment, so I've applied to get free access through a university.
Do look into helping us generate updated constants as @nirvdrum mentioned.
I have attempted to run tests on a Power8 environment. My results are a little different from yours:
Failed tests:
LinuxPOSIXTest.testMessageHdrMultipleControl:140 null
Tests in error:
FileTest.fcntlDupfdTest:298 » IO Stream Closed
FileTest.fcntlDupfdWithArgTest:320 » IO Stream Closed
LinuxPOSIXTest.ioprioThreadedTest:50 » IllegalState ioprio_set is not implemen...
I suspect the access test failure (edit: the one in your results that's not in mine) may be due to an OS-level difference between our environments so I'm not going to dig into that one at the moment (perhaps you can do so).
Ok, so I attempted to regenerate constants, and while there are a few differences none of them appear to be related to these failures.
So I have proceeded to create #145 to track fixes.
The ioprio failure is fixed there already.
The testMessageHdrMultipleControl test is failing because the receiving side only receives one control message, not two. This could be an environmental thing, but I am really unfamiliar with the behavior of sendmsg and recvmsg at this level.
The remaining two fcntl issues indicate that either the file descriptor is not getting dup'ed properly (resulting fd appears to be closed) or the resulting dup'ed fd is not getting into a FileInputStream successfully.
Yeah, the fcntl tests are failing to dup; the return value is -1. Hmmm.
It appears that the behavior of F_DUPFD is somewhat undefined when passing no third argument, and that's the cause of the fcntl failures.
These tests pass on Linux and Darwin, but a similar piece of C code revealed some surprising behavior differences:
F_DUPFD with two args will only work properly for non-stdio file descriptors (or at least, it did not work for any of the stdio descriptors but did work for a newly-opened file).
All documentation I can find online indicates that F_DUPFD will use that third argument, but most docs don't say explicitly that it's required, nor what happens if it is not passed.
Then I found this doc for "fcntl64" that makes it more explicit: http://www.cbs.dtu.dk/cgi-bin/nph-runsafe?man=fcntl64
fcntl() can take an optional third argument. Whether or not this argument is required is determined by cmd. The required argument type is indicated in parentheses after each cmd name (in most cases, the required type is long, and we identify the argument using the name arg), or void is specified if the argument is not required.
...
Duplicating a file descriptor
F_DUPFD (long)
Find the lowest numbered available file descriptor greater than or equal to arg and make it be a copy of fd. This is different from dup2(2), which uses exactly the descriptor specified. On success, the new descriptor is returned. See dup(2) for further details.
F_DUPFD_CLOEXEC (long; since Linux 2.6.24)
As for F_DUPFD, but additionally set the close-on-exec flag for the duplicate descriptor. Specifying this flag permits a program to avoid an additional fcntl() F_SETFD operation to set the FD_CLOEXEC flag. For an explanation of why this flag is useful, see the description of O_CLOEXEC in open(2).
I think the smartest thing for these tests would be to modify them to properly use the three-arg forms of fcntl and fix the deprecated form to also do the right thing.
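In C terms, the three-arg form looks like the sketch below. This is only an illustration of the fcntl call itself, not the jnr-posix fix; dup_at_least is a made-up helper name. Because fcntl is variadic, leaving the argument out means the callee reads a value that was never passed, which would fit the platform-dependent results described above.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate fd onto the lowest free descriptor that
   is greater than or equal to min_fd. The third argument is always passed
   explicitly, as the man page excerpt says F_DUPFD requires.
   Returns the new descriptor, or -1 with errno set. */
int dup_at_least(int fd, int min_fd) {
    return fcntl(fd, F_DUPFD, min_fd);
}
```

Passing 0 as min_fd gives the same "lowest available descriptor" behavior as dup(2), so a two-arg convenience form can forward to the three-arg one with an explicit 0.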
With additional patches in #145 there's only the recvmsg failure remaining. I've at least improved the error output so we can see we're only getting one control message back.
testMessageHdrMultipleControl(jnr.posix.LinuxPOSIXTest) Time elapsed: 0.073 sec <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at jnr.posix.LinuxPOSIXTest.testMessageHdrMultipleControl(LinuxPOSIXTest.java:140)
Aha, a bit of spelunking and I found that this last failing test was added by @rm5248 in #140. It was subsequently quarantined to run on Linux only (it was not passing on Darwin, but I don't recall how it failed).
Perhaps @rm5248 has some thoughts on why this would fail on Linux PPC64LE? And perhaps you can explain to me how this test manages to receive two control message headers in the first place, because I'm a bit confused about that. 😀
Ok, a couple of mysteries solved:
The test sets SO_PASSCRED. This causes it to include the caller's pid, uid, and gid as a second control message.
Darwin does not support SO_PASSCRED. This is why I had to isolate this test to run only on Linux for #140.
But as far as I can tell, this test should work the same way on Linux PPC64LE, so the cause of our failure is still unknown.
Correction: SO_PASSCRED gets set on the read side.
And the final mystery is solved. Going back to those few jnr-constants changes, this is among them:
@@ -37,8 +37,8 @@ SO_ATTACH_FILTER(0x1aL),
SO_BINDTODEVICE(0x19L),
SO_DETACH_FILTER(0x1bL),
SO_NO_CHECK(0xbL),
-SO_PASSCRED(0x10L),
-SO_PEERCRED(0x11L),
+SO_PASSCRED(0x14L),
+SO_PEERCRED(0x15L),
SO_PEERNAME(0x1cL),
SO_PRIORITY(0xcL),
SO_SECURITY_AUTHENTICATION(0x16L),
The problem here is that these values (and possibly the others that changed) differ only on PPC, but jnr-constants does not currently have the ability to separate constants by architecture.
Regenerating the constants and using the updated jnr-constants in jnr-posix allows this final test to pass.
For reference, the relevant section of the asm-generic/socket.h headers on PPC Linux:
...
#define SO_REUSEPORT 15
#ifndef SO_PASSCRED /* powerpc only differs in these */
#define SO_PASSCRED 16
#define SO_PEERCRED 17
...
And the non-generic asm/socket.h:
#define SO_RCVLOWAT 16
#define SO_SNDLOWAT 17
#define SO_RCVTIMEO_OLD 18
#define SO_SNDTIMEO_OLD 19
#define SO_PASSCRED 20
#define SO_PEERCRED 21
Well it's been an adventure, but we have a green build on PPC64LE. I have merged #145.
The remaining issue with SO_PASSCRED will require fixing jnr/jnr-constants#67 and jnr/jnr-constants#68.
@sarveshtamba Please verify in your environment! Once you can confirm both jnr-ffi and jnr-posix pass tests for you I'll look at spinning some releases.
Since I was able to get a green build myself on Power8 Linux, I'm going ahead with the release of 3.0.55.
@headius thanks for looking into this quickly. I tried building v3.0.55 and the master branch; however, I still see the errors below:
readlinkPointerTest(jnr.posix.FileTest) Time elapsed: 0.013 sec <<< FAILURE!
org.junit.ComparisonFailure: expected:</tmp/jnr-p[?six-r??dl?nk-t?]st470986432865872718...> but was:</tmp/jnr-p[?six-r??dl?nk-t?]st470986432865872718...>
at org.junit.Assert.assertEquals(Assert.java:115)
at org.junit.Assert.assertEquals(Assert.java:144)
at jnr.posix.FileTest.readlinkPointerTest(FileTest.java:580)
accessTest(jnr.posix.FileTest) Time elapsed: 0.003 sec <<< FAILURE!
java.lang.AssertionError: access: expected:<-1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at jnr.posix.FileTest.accessTest(FileTest.java:508)
Results :
Failed tests:
FileTest.accessTest:508 access: expected:<-1> but was:<0>
FileTest.readlinkPointerTest:580 expected:</tmp/jnr-p[?six-r??dl?nk-t?]st470986432865872718...> but was:</tmp/jnr-p[?six-r??dl?nk-t?]st470986432865872718...>
Tests run: 93, Failures: 2, Errors: 0, Skipped: 1
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 26.117 s
[INFO] Finished at: 2020-04-22T12:23:46Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project jnr-posix: There are test failures.
[ERROR]
[ERROR] Please refer to /root/jnr-posix-master/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
@sarveshtamba I have opened #146 for these additional failures. I did not see them on Fedora 29 on Power8 so I will need your help to investigate.
Trying to build jnr-posix v3.0.44 and v3.0.54 (REPOSITORY="https://github.com/jnr/jnr-posix.git") on the ppc64le platform, but I am facing the following errors:
Any inputs will be highly appreciated.