irods / irods_client_globus_connector

The iRODS Globus Connector

Transfer of a file with an apostrophe in the file name fails with a HIERARCHY_ERROR error message #101

Open ingridbr opened 3 weeks ago

ingridbr commented 3 weeks ago

Bug Report

iRODS and Globus, OS and Version

iRODS server: 4.3.1 (almalinux9)
Globus Connect Server: globus-connect-server, package 5.4.67, cli 1.0.46
Globus iRODS connector: irods-gridftp-client-4.3.1.0-16.x86_64

What did you try to do?

I tried to transfer a file with an apostrophe in its name from a Globus Connect Personal endpoint to an iRODS Globus endpoint using the Globus graphical interface.

Expected behavior

The file gets uploaded without errors

Observed behavior (including steps to reproduce, if applicable)

When transferring a file named "Ingrid's test file.txt", the transfer fails with the following error:

Error (transfer)
Endpoint: VSC iRODS gbiomed.irods.icts.kuleuven.be (e1074609-2279-4c5c-95a2-8fcfbb44ae5e)
Server: 134.58.8.4:443
File: /gbiomed/home/u0089722/test/Ingrid%27s%20test%20file.txt
Command: STOR /gbiomed/home/u0089722/test/Ingrid's test file.txt
Message: Fatal FTP response

Details: 500-Command failed. : iRODS: Error: rcDataObjOpen failed opening '/gbiomed/home/u0089722/test/Ingrid's test file.txt'\r\n500-. HIERARCHY_ERROR: , status: -1803000.\r\n500-\r\n500 End.\r\n

The same file can be transferred without errors to a Globus endpoint on a POSIX system. An iput of the same file with the name quoted or with the apostrophe escaped also works and the file is uploaded to irods:

$ iput "Ingrid's test file.txt"
$ ils
/zone/home/user/test:
  Ingrid's test file.txt

alanking commented 3 weeks ago

Possibly related to https://github.com/irods/irods/issues/3902

JustinKyleJames commented 1 week ago

I will debug this and see if it is in fact the same as 3902.

ingridbr commented 1 week ago

Thank you Justin!

korydraughn commented 1 week ago

@JustinKyleJames If you confirm it's irods/irods#3902, try compiling https://github.com/irods/irods/pull/7819 and seeing if it resolves the issue. Even with that PR, you may have to tweak the logic in the globus connector slightly. It shouldn't require anything more than replacing embedded single quotes with their hex encoding.
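
As a rough illustration of the quote replacement described above, here is a minimal Python sketch (not the connector's actual code). It assumes the hex encoding is the four-character sequence `\x27` (backslash, x, 2, 7), which is the ASCII code of the apostrophe; the function name `encode_single_quotes` is hypothetical.

```python
def encode_single_quotes(value: str) -> str:
    """Replace each embedded single quote with the literal text \\x27.

    Assumption: the query parser accepts the four characters
    backslash, 'x', '2', '7' as an encoded apostrophe.
    """
    return value.replace("'", "\\x27")


path = "/tempZone/home/rods/justin's test.txt"
print(encode_single_quotes(path))
```

Names without an apostrophe pass through unchanged, so a helper like this could be applied to every path before it is embedded in a query string.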

JustinKyleJames commented 1 week ago

I was able to upload and download a test file with an apostrophe in its name using globus-url-copy.

root@cd28fc7627dd:/# globus-url-copy justin\'s\ test.txt gsiftp://$(hostname):2811/tempZone/home/rods/justin\'s\ test.txt
root@cd28fc7627dd:/# ils
/tempZone/home/rods:
  justin's test.txt
root@cd28fc7627dd:/# globus-url-copy gsiftp://$(hostname):2811/tempZone/home/rods/justin\'s\ test.txt justin\'s\ test.txt.2
root@cd28fc7627dd:/# ls -l justin\'s\ test.txt.2 
-rw-r--r-- 1 root root 15 Jun 27 17:25 "justin's test.txt.2"

I'm wondering if this has to do with checksum processing?

JustinKyleJames commented 1 week ago

Adding to this: the transfer using ftp did not work and instead failed with the open error.

root@cd28fc7627dd:/# ftp localhost 2811
Trying 127.0.0.1:2811 ...
Connected to localhost.
220 cd28fc7627dd GridFTP Server 16.2 (gcc64, 1701980266-86) [Globus Toolkit 1638371632] ready.
Name (localhost:root): user1
331 Password required for user1.
Password:
230 User user1 logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> put justin's\ test.txt
local: justin's test.txt remote: justin's test.txt
229 Entering Passive Mode (|||41455|)
150 Beginning transfer.
100% |******************************************************************************************************************************************************************************************************************|    15      218.63 KiB/s    00:00 ETA500-Command failed. : iRODS: Error: rcDataObjOpen failed opening '/tempZone/home/user1/justin's test.txt'
500-. HIERARCHY_ERROR: , status: -1803000.
500-
500 End.
15 bytes sent in 00:00 (0.03 KiB/s)

JustinKyleJames commented 1 week ago

I tested this with iRODS built from https://github.com/irods/irods/pull/7819 (I also rebuilt the globus plugin with the iRODS dev package from this pull request) and it worked.

I'm still not sure why it always worked with globus-url-copy but not with ftp, but it definitely appears to be related to https://github.com/irods/irods/issues/3902.

JustinKyleJames commented 1 week ago

I figured out what the difference is between the two scenarios.

In the globus-url-copy scenario, the transfer_info->alloc_size is being sent. Because of this and because the file size is small, it was only doing one open/write/close.

In the ftp scenario, transfer_info->alloc_size is not provided, so we use the number of threads from the configuration. This means that after the initial open there are other opens that use the replica token. This must trigger a code path that uses genQuery and produces this error.
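
The difference between the two scenarios can be modeled roughly as follows. This is a Python sketch of the behavior described above, not the connector's code: the function name, the `small_file_threshold` parameter, and its default value are hypothetical, and `alloc_size` is `None` when the client does not send it. How a known but large size is handled is also an assumption here.

```python
from typing import Optional


def planned_open_count(
    alloc_size: Optional[int],
    configured_threads: int,
    small_file_threshold: int = 32 * 1024 * 1024,  # hypothetical cutoff
) -> int:
    """Return how many times the data object would be opened."""
    if alloc_size is not None and alloc_size <= small_file_threshold:
        # Size is known and small: a single open/write/close.
        return 1
    # Size unknown (or, by assumption, large): use the configured thread
    # count. Every open after the first one reuses the replica token.
    return max(1, configured_threads)


print(planned_open_count(15, 4))    # globus-url-copy case: size sent
print(planned_open_count(None, 4))  # ftp case: size not sent
```

In this model only the second call performs the additional replica-token opens, which matches the observation that the error appears with ftp but not with globus-url-copy.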

Also note that if I set $numberOfIrodsReadWriteThreads 1 in /etc/gridftp.conf, the error does not occur when using FTP.

@ingridbr Would it be possible for you to confirm this workaround on your side?

JustinKyleJames commented 1 week ago

One possible work-around that is less invasive than the one above is to force only one thread if we encounter an apostrophe in the filename. It is a hack but the target is very limited. Thoughts?
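
A minimal Python sketch of that idea, for discussion only: the function name `threads_for_transfer` is hypothetical, and the real change would live in the connector's own code rather than in Python.

```python
def threads_for_transfer(path: str, configured_threads: int) -> int:
    """Pick the thread count for one transfer.

    Forces a single thread when the logical path contains an apostrophe,
    so no additional opens are issued; otherwise the configured value
    is used unchanged.
    """
    if "'" in path:
        return 1
    return max(1, configured_threads)


print(threads_for_transfer("/tempZone/home/user1/justin's test.txt", 4))
print(threads_for_transfer("/tempZone/home/user1/plain.txt", 4))
```

Only transfers whose names contain an apostrophe lose parallelism; every other transfer keeps the configured thread count.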

trel commented 1 week ago

I think a workaround is fine, so that it works.

Then make a new issue pointing back here for the 'real' fix later so we don't forget.