whitwham / tears

Stream files to and from iRODS
GNU General Public License v3.0

Possible regression vs iRODS 4.1.12 #10

Closed: kript closed this issue 5 years ago

kript commented 6 years ago

The test harness passes in iRODS 4.1.10 but fails in 4.1.11 (I think) and 4.1.12.

#expanded from the test harness for readability
jc18@farm3-head4:/tmp$ SMALL_FILE=$(mktemp irods_test_small_XXX)
jc18@farm3-head4:/tmp$ dd if=/dev/zero of="${SMALL_FILE}" bs=1k count=1
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 4.4354e-05 s, 23.1 MB/s
jc18@farm3-head4:/tmp$ ls -lah "${SMALL_FILE}"
-rw------- 1 jc18 team94 1.0K Jul 20 13:35 irods_test_small_jra
jc18@farm3-head4:/tmp$ irods_cwd=$(ipwd)
jc18@farm3-head4:/tmp$ echo $irods_cwd
/seq-dev/home/jc18#Sanger1-dev
jc18@farm3-head4:/tmp$ /software/npg/20180702/bin/tears -f -w  "${irods_cwd}/${SMALL_FILE}_tears" < "${SMALL_FILE}"
jc18@farm3-head4:/tmp$ echo $?
0
jc18@farm3-head4:/tmp$ ils 
/seq-dev/home/jc18#Sanger1-dev:
  1FBLYX_0.json
  24174_5#888.cram
  irods_test_large_QZI
  irods_test_small_jra_tears
jc18@farm3-head4:/tmp$ ils -l irods_test_small_jra_tears
  jc18              0 root;replicate;red;red3;irods-seq-i16-fg         1024 2018-07-20.13:36 & irods_test_small_jra_tears
  jc18              1 root;replicate;green;green2;irods-seq-sr02-ddn-ra08-9-10-11         1024 2018-07-20.13:36 & irods_test_small_jra_tears
jc18@farm3-head4:/tmp$ /software/npg/20180702/bin/tears -f -r  "${irods_cwd}/${SMALL_FILE}_tears" > $(mktemp --tmpdir irods_test_XXX)
Error: rcGetHostForGet failed with status -1803000:HIERARCHY_ERROR
jc18@farm3-head4:/tmp$ /software/npg/20180702/bin/tears -f -r  "${irods_cwd}/${SMALL_FILE}_tears" > kabloom
Error: rcGetHostForGet failed with status -1803000:HIERARCHY_ERROR
jc18@farm3-head4:/tmp$ /software/npg/20180702/bin/tears -f -r  /seq-dev/home/jc18#Sanger1-dev/irods_test_small_jra_tears > kabloom
Error: rcGetHostForGet failed with status -1803000:HIERARCHY_ERROR
jc18@farm3-head4:/tmp$ /software/npg/20180702/bin/tears -f -r  /seq-dev/home/jc18#Sanger1-dev/irods_test_small_jra_tears 
Error: rcGetHostForGet failed with status -1803000:HIERARCHY_ERROR

Also interesting: the HIERARCHY_ERROR doesn't appear in any of the IES or IRES logs.
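The manual steps above can be condensed into one function that writes a small file with tears, reads it back, and compares the two. This is only a rough sketch, not the actual test harness: `tears_round_trip` and the `TEARS_BIN` override are hypothetical names, and it assumes tears exits non-zero when it prints an error.

```shell
#!/usr/bin/env bash
# Hypothetical round-trip check, condensed from the transcript above.
# TEARS_BIN is an assumed override hook for this sketch; tears does not read it.
TEARS_BIN="${TEARS_BIN:-tears}"

tears_round_trip() {
    local collection="$1"   # iRODS collection to write into, e.g. "$(ipwd)"
    local src back target rc=0

    src=$(mktemp) || return 1
    back=$(mktemp) || return 1

    # Same 1 KiB zero-filled file as the harness uses.
    dd if=/dev/zero of="${src}" bs=1k count=1 2>/dev/null

    target="${collection}/$(basename "${src}")_tears"

    # Write, read back, then compare byte for byte.
    # Assumption: tears returns a non-zero exit status on failure.
    if "${TEARS_BIN}" -f -w "${target}" < "${src}" &&
       "${TEARS_BIN}" -f -r "${target}" > "${back}" &&
       cmp -s "${src}" "${back}"; then
        echo "round trip OK: ${target}"
    else
        echo "round trip FAILED: ${target}" >&2
        rc=1
    fi

    rm -f "${src}" "${back}"
    return "${rc}"
}
```

Calling `tears_round_trip "$(ipwd)"` would mirror the session above. Note that the sketch leaves the test object behind in iRODS.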

whitwham commented 6 years ago

Interesting, could you try again with the -d option and see if it works with the default host?

kript commented 6 years ago

Using the -d option it works (first run without -d, second run with it):

$ /software/npg/20180702/bin/tears -v  -f -r /seq-dev/home/jc18#Sanger1-dev/irods_test_small_NFo < irods_test_small_NFo
Setting client name to: tears:1.2.4
No iRODS URI, using default settings.
host irods-sanger1-dev.internal.sanger.ac.uk
zone Sanger1-dev
user jc18
port 1247
Extra error message: 
Error: rcGetHostForGet failed with status -1803000:HIERARCHY_ERROR
$ /software/npg/20180702/bin/tears -v -d  -f -r /seq-dev/home/jc18#Sanger1-dev/irods_test_small_NFo < irods_test_small_NFo
Setting client name to: tears:1.2.4
No iRODS URI, using default settings.
host irods-sanger1-dev.internal.sanger.ac.uk
zone Sanger1-dev
user jc18
port 1247
1024 bytes read
1024 bytes written
0 bytes read
Total bytes written 1024

whitwham commented 6 years ago

So going through the default server (zone?) works. The API call rcGetHostForGet, which tears uses to find the right host for reading files, is coming back with an error. This sounds like a server-side problem, though I don't know how we would go about testing that. I wonder if there is another client that uses this API call.
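Until the server side is sorted out, a caller could retry a failed read through the default host. The following is a sketch of that idea only: `tears_read_with_fallback` and `TEARS_BIN` are hypothetical names, and it assumes tears exits non-zero when the host lookup fails.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: read an iRODS object with tears and, if that fails,
# retry with -d so the default host is used.
# TEARS_BIN is an assumed override hook for this sketch; tears does not read it.
TEARS_BIN="${TEARS_BIN:-tears}"

tears_read_with_fallback() {
    local irods_path="$1"   # full path of the data object
    local out_file="$2"     # local file to write the data to

    # First attempt: let tears ask iRODS for the best host.
    # Assumption: a failed lookup gives a non-zero exit status.
    if "${TEARS_BIN}" -f -r "${irods_path}" > "${out_file}"; then
        return 0
    fi

    # Second attempt: -d, which in this thread read via the default host.
    echo "read failed, retrying with -d" >&2
    "${TEARS_BIN}" -d -f -r "${irods_path}" > "${out_file}"
}
```

For example, `tears_read_with_fallback "$(ipwd)/irods_test_small_jra_tears" kabloom` would cover both cases shown above. The retry hides the lookup error rather than fixing it, so it is a stopgap at best.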

whitwham commented 6 years ago

I can replicate the error.

If objPath points to a file, rcGetHostForGet returns an error code. So if objPath is /seq-dev/home/jc18#Sanger1-dev/test.txt, the returned error code will be -1803000 (HIERARCHY_ERROR).

If the objPath points instead to a collection then there is no error. So /seq-dev/home/jc18#Sanger1-dev will return an answer.

In the current source (and as far back as 2016, when the cpp version of rcGetHostForGet was first put in the irods github), the documentation in the header says this:

param[in] dataObjInp - generic dataObj input. Relevant items are:
    objPath - the path of the target collection.

That explicitly calls for objPath to be the path of a collection.

However, tears predates this version of irods and the irods-legacy documentation says this:

\usage

Get the address of the best server to download the data object /myZone/home/john/myfile:

    dataObjInp_t dataObjInp;
    char *outHost = NULL;
    bzero (&dataObjInp, sizeof (dataObjInp));
    rstrcpy (dataObjInp.objPath, "/myZone/home/john/myfile", MAX_NAME_LEN);
    status = rcGetHostForGet (conn, &dataObjInp, &outHost);
    if (status < 0) {
        .... handle the error
    }

That example takes the path of an actual file, which is what I originally programmed against.

At some point the documentation changed but the behaviour of irods did not. It looks like it now has.

whitwham commented 5 years ago

RENCI have dealt with the issue here.