Closed by mstfdkmn 6 months ago
I don't know anything about Globus, but I'll add that the error message appearing in the log is coming from these lines: https://github.com/irods/irods/blob/f6eb6c72786288878706e2562a370b91b7d0802e/server/api/src/rsDataObjOpen.cpp#L767-L783
Something happened when attempting to resolve a resource hierarchy for the given operation. It seems like the Globus connector may not be using the right open flags or something for certain situations. Probably warrants investigation.
Hi @JustinKyleJames, would it be possible to prioritize this? We do a lot of transfers to iRODS via the connector, and this makes it hard to keep a clean iRODS log (every file transfer to iRODS throws that error). Thanks.
Yes, we'll put eyes on it.
I have not been able to reproduce this problem using globus-url-copy or ftp. However, I think I know the root of the problem.
When the plugin's globus_l_gfs_iRODS_recv method is called, the open flag is first set to O_WRONLY. Then, if the truncate option is set, the O_CREAT and O_TRUNC flags are added.
With the transfer methods I know how to test, the truncate flag is always set, so O_CREAT is always added. I am guessing that when this error occurs, the truncate flag is not set, so O_CREAT does not get added; if the file does not already exist, we get an error.
I verified that the error is generated if I update the code to not set O_CREAT.
I think the solution is to set the open flags for the first thread to O_CREAT | O_WRONLY. I may have to get @mstfdkmn to help test this.
@mstfdkmn Are you open to testing the solution in PR #99 (in a testing environment)?
Yes, we plan to test it soon (we don't have a test endpoint for the Globus connector, so it may take some time to work through possible challenges).
No problem. Let us know how it goes.
@mstfdkmn a fix has been merged. closing so we can get this into the next release.
if it's a problem still / again, please open a new issue and reference this one.
Great! Thanks. We'll definitely let you know if needed.
It seems this fix didn't resolve the issue. We built the connector from source and integrated it into our production endpoints, and we still see the errors mentioned above in our iRODS logs.
Sorry, that was a false alarm! I tested against different zones' endpoints and I don't see the error anymore. My initial test transfer seems to have coincided with an ongoing transfer (I guess old processes are not refreshed/cleaned when the connector build is changed), which is why I thought I was seeing the same errors again.
So you're saying the change seems to have resolved the issue?
Yes. What I still don't understand is that I see the same errors for some transfers that I didn't initiate; I'm guessing these are old transfers. If I learn more, I will let you know.
Bug Report
iRODS Version, OS and Version
iRODS 4.3.0, AlmaLinux 8
What did you try to do?
Expected behavior
Transfers complete successfully, without any errors in the iRODS logs
Observed behavior (including steps to reproduce, if applicable)
Transfers are completed successfully (data objects have a good (&) replica status in iRODS), but an error appears in the iRODS logs:
[2023-06-01T14:25:26.454Z][icts-p-cloud-rdm-hev-2] irods stdout | {"log_category":"legacy","log_facility":"local0","log_level":"error","log_message":"[rsDataObjOpen_impl:904] - [OBJ_PATH_DOES_NOT_EXIST: Data object or replica does not exist [error_code=-808000, path=/ghum/home/u0137480/my_test_file.txt].\n\n] [error_code=[-358000], path=[/ghum/home/u0137480/my_test_file.txt], hierarchy=[]","request_api_name":"DATA_OBJ_OPEN_AN","request_api_number":602,"request_api_version":"d","request_client_user":"u0137480","request_host":"127.0.0.1","request_proxy_user":"globus","request_release_version":"rods4.3.0","server_host":"ghum.irods.icts.kuleuven.be","server_pid":167725,"server_timestamp":"2023-06-01T14:25:26.454Z","server_type":"agent"}
If I interpret it correctly, this error is logged on every transfer.
Could you let us know whether this is something that we should be worried about?