EUDAT-B2STAGE / B2STAGE-GridFTP

B2STAGE service core code for EUDAT project: iRODS-DSI

iRODS 4 compatibility #8

Closed vladimir-mencl-eresearch closed 9 years ago

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

These are the iRODS 4 fixes discussed in #7.

This version compiles both with 3.3.1 and 4.1(DEV).

On my test system, it needs iRODS 3.3.1 manually tweaked with -fPIC.

For iRODS 4, I had to manually tweak libjansson to compile with -fPIC (mimicking as if https://github.com/irods/irods/pull/2623 was already merged).

Please let me know if it all works fine for you.

Cheers, Vlad

PS: I also included a commit to bring the version string in CMakeLists.txt in sync with the tagged version. If you make another release, please remember to update that file too.

PPS: I overlooked your iRODS 4 branch before I started this work, sorry about the extra work. The find packages refactoring looks interesting - hope it won't be too much hassle to fit onto the changes I made to CMakeLists.txt...

muccix commented 9 years ago

Hi Vlad,

thanks for the pull request.

Unfortunately the CMakeLists.txt doesn't work for me as it is, since some paths are different. For example, all the files that you find in "$ENV{IRODS_PATH}/lib/irods/externals" are in "$ENV{IRODS_PATH}/lib/irods" in my case (apart from "libboost_chrono.a", which is under "/usr/lib/x86_64-linux-gnu/"); "libRodsAPIs.a" is under "/usr/lib/irods/".

The same for "globus_config.h": in my env it is under "$ENV{GLOBUS_LOCATION}/include/x86_64-linux-gnu/globus".

For the moment, I will correct the paths in the CMakeLists.txt; then I will try to understand whether this issue is manageable through the Find_packages.

Cheers, Roberto

muccix commented 9 years ago

I have managed to compile the DSI (I had to make some modifications in the CMakeLists.txt because the std:: libraries gave me errors in the linking phase); however, I'm now facing a strange problem that is driving me crazy.

When I start the GridFTP server I receive the following error:

Starting globus-gridftp-server
Server configuration error. Couldn't load 'iRODS'.
globus_i_gfs_data.c:globus_i_gfs_data_new_dsi:2361: DSI activation failed.
globus_extension.c:globus_l_extension_dlopen:436: Couldn't dlopen libglobus_gridftp_server_iRODS_gcc64pthr.so in /usr/lib/x86_64-linux-gnu (or LD_LIBRARY_PATH): file not found

First of all, I cannot understand why it requires a DSI library with the flavor in its name, since GridFTP is installed from packages; secondly, the file "libglobus_gridftp_server_iRODS_gcc64pthr" is present in "/usr/lib/x86_64-linux-gnu".

I'm working on an Ubuntu 14.04 VM. Have you ever faced a problem like this?

Cheers, Roberto

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

I've just started having a look.

I was doing my work on a CentOS6 system, but my workstation itself is Ubuntu (12.04), so I've had a look here too.

I see the iRODS 4 build scripts behave differently depending on where you run them: on RedHat-based systems they create RPMs; on Debian-based systems / Ubuntu they create DEB files.

But the paths end up being the same.

I see the difference is in the iRODS version - I was testing the CMakeLists.txt changes against iRODS 4.1 (i.e., HEAD of github.com/irods/irods).

And these RPMs (as well as DEBs) put the files into /usr/lib/irods/externals

I think we might go with supporting only iRODS 4.1+ (and 3.x), but not 4.0...

Or you might try some magic with findPackages.

Regarding 'globus_config.h': I'm puzzled by your setup: is that an RPM-based setup or Globus built from source? Which version?

We should still be able to add that ("$ENV{GLOBUS_LOCATION}/include/x86_64-linux-gnu/globus") to the list of directories searched for include files - just need to find a reliable way of identifying the platform name in the directory.

Finally, regarding the dlopen error: what is the exact line in your GridFTP server config?

It could be somehow driven by the way the library is named in the config.

Or it could be a missing symlink from the unversioned .so to the versioned .so.nnnn

Please let me know which of the problems are now resolved.... :-)

Cheers, Vlad

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

Looking into a few more things:

I'll add to the pull request if I get this working - just dropping as a note here so that we have it recorded...

Cheers, Vlad

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

While working on the above, I randomly triggered the dlopen error you got.

I think the explanation is:

Just thought this might explain it - I'll continue digging, actually trying to get the module to work with iRODS 4.

Cheers, Vlad

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

I've just tried getting this going with iRODS 4 - and I'm getting stuck.

At this point, rcConnect keeps failing, trying to load an internal iRODS module (/var/lib/irods/plugins/network/libtcp.so), but breaking on unresolved symbols (which should be provided by the iRODS library already loaded). Not sure why the dlopen call is breaking - I'll post about this to irod-chat and see if I get anything.

On other fronts: I've also tried compiling against iRODS 4.0.3 release (instead of 4.1 HEAD) and I see what issues you were referring to.

Unfortunately, iRODS 4.1 significantly reorganizes the libraries, so we would either have to have a custom section for 4.0.3 and 4.1, or support just one of them.

My understanding of the differences is:

This could still be worked around with separate definitions of irods_link_obj_path depending on the iRODS version...

Another difference is that iRODS 4.1 switches the client from using a ~/.irods/.irodsEnv file to using ~/.irods/irods_environment.json - in JSON format.

My way of setting the connection parameters was:

{
    "irods_port": 1247,
    "irods_host": "gridgwtest.canterbury.ac.nz",
    "irods_authentication_scheme": "GSI",
    "irods_user_name": "rods",
    "irods_zone": "BeSTGRID-DEV"
}
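As a minimal sketch of how such a file could be generated rather than edited by hand, the snippet below writes the same parameters to ~/.irods/irods_environment.json. The helper name `write_irods_environment` and its `home` parameter are hypothetical and exist only for this illustration; the keys and values are the ones quoted above.

```python
import json
from pathlib import Path

# Connection parameters exactly as quoted in the comment above.
SETTINGS = {
    "irods_port": 1247,
    "irods_host": "gridgwtest.canterbury.ac.nz",
    "irods_authentication_scheme": "GSI",
    "irods_user_name": "rods",
    "irods_zone": "BeSTGRID-DEV",
}


def write_irods_environment(settings, home=None):
    """Write settings to <home>/.irods/irods_environment.json and return the path.

    Hypothetical helper: `home` defaults to the current user's home directory.
    """
    home_dir = Path(home) if home is not None else Path.home()
    env_file = home_dir / ".irods" / "irods_environment.json"
    # Create ~/.irods if it does not exist yet.
    env_file.parent.mkdir(parents=True, exist_ok=True)
    env_file.write_text(json.dumps(settings, indent=4) + "\n")
    return env_file
```

Calling `write_irods_environment(SETTINGS)` would create the file for the current user; passing `home` makes it easy to try the helper against a scratch directory first.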

OK, so that's where I'm leaving it now: iRODS not connecting because of missing symbols, and CMakeLists.txt needs to handle the differences between iRODS 4.0.3 and 4.1.

If you like the proposed solution, I can add another commit tomorrow to tweak the link libraries between 4.0.3 and 4.1 - and I hope I'd receive some reply/hint based on my irod-chat post...

Cheers, Vlad

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

I've just tried hard to make the code work with iRODS 4.x.

In the end I managed to get it going in a heavily hacked setup, but we would need the iRODS developers to fix how they build the binaries and plugins before we can make this work with an out-of-the-box iRODS 4.x server.

Main reason is they use plugins (implemented as .so libraries) which depend on symbols from the core iRODS runtime, without declaring a dependency on that runtime.

When a program linked against the iRODS runtime (as static objects) loads the plugin, it all links fine.

When our library, linked against iRODS, tries loading the plugin, the symbols are not found, because they're not visible in the main program - and the linker won't look into our library loaded with dlopen.

This can be worked around by recompiling the iRODS plugins to link against all of the required libraries, but that means recompiling iRODS from source and hacking the setup. It also results in the runtime code being loaded into memory multiple times.

I've added comments to the issue where they already recorded they should do this for server code: https://github.com/irods/irods/issues/2308

But right now, there's not much we can do from our end.

I've added two more commits that should make our code compile with iRODS 3.x, 4.0.x and 4.1.x - each with a separate configuration. The code first checks if it's iRODS 4.x (IRODS_PATH is "/usr"), then it checks whether IRODS_40_COMPAT is set to use 4.0.x compatibility mode, otherwise it goes for 4.1.x.
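The selection order described above can be paraphrased as a small decision function. This is a sketch in Python, not the actual CMake code from the pull request: the function name `select_irods_config` is hypothetical, and treating any non-empty IRODS_40_COMPAT value as "set" is an assumption.

```python
def select_irods_config(env):
    """Return which build configuration the described checks would pick.

    `env` is a mapping of environment variables (for example os.environ).
    """
    # iRODS 4.x packages install under /usr, so IRODS_PATH == "/usr" signals 4.x;
    # anything else is treated as an iRODS 3.x tree.
    if env.get("IRODS_PATH") != "/usr":
        return "3.x"
    # Assumption: any non-empty value counts as IRODS_40_COMPAT being set.
    if env.get("IRODS_40_COMPAT"):
        return "4.0.x"
    # Default for an iRODS 4 installation.
    return "4.1.x"
```

For example, `select_irods_config({"IRODS_PATH": "/usr", "IRODS_40_COMPAT": "true"})` picks the 4.0.x configuration, while the same call without IRODS_40_COMPAT falls through to 4.1.x.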

This makes our code compile - but we are far from being able to run it.

Should we merge this and wait for iRODS developers to fix their linking?

Cheers, Vlad

vladimir-mencl-eresearch commented 9 years ago

PS: I've run one more time into the situation where Globus GridFTP server gave me the error message that it could not find the library with the flavoured name.

And it was because it first tried loading it without the flavoured name, but loading failed due to unresolved dependencies. So the fix for that is to fix the dependencies.

And one more lesson learnt: the dynamic linker does not search an LD_LIBRARY_PATH set from within gridftp.conf; it has to be set before starting the Globus GridFTP server in order to influence the search for dependencies of libraries loaded with dlopen().

muccix commented 9 years ago

Hi Vlad, thank you very much for your commitment on this.

If the DSI still works with iRODS 3, I will merge the pull request and then wait for the iRODS developers to fix the linking issues.

Cheers, Roberto

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

I think we can consider all of this fully resolved now.

I managed to hack around the iRODS 4 library issue by using LD_PRELOAD to load the DSI module into the symbol namespace of globus-gridftp-server itself.
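A minimal sketch of preparing such an environment before the server is started, assuming the server is launched from a wrapper script. The helper name `gridftp_server_env`, its parameters, and the example paths are hypothetical; the authoritative instructions are the ones added to README.md.

```python
import os


def gridftp_server_env(base_env, dsi_library, extra_lib_dirs=()):
    """Return a copy of base_env with LD_LIBRARY_PATH and LD_PRELOAD prepared.

    dsi_library is the full path to the DSI shared object (hypothetical here).
    """
    env = dict(base_env)

    # Directories searched for dependencies of libraries loaded with dlopen().
    lib_dirs = [os.path.dirname(dsi_library), *extra_lib_dirs]
    if env.get("LD_LIBRARY_PATH"):
        lib_dirs.append(env["LD_LIBRARY_PATH"])
    env["LD_LIBRARY_PATH"] = ":".join(lib_dirs)

    # Preload the DSI module so its symbols are visible to the iRODS plugins.
    preload = [env["LD_PRELOAD"]] if env.get("LD_PRELOAD") else []
    preload.append(dsi_library)
    env["LD_PRELOAD"] = ":".join(preload)
    return env
```

The resulting mapping would be passed as `env=` to `subprocess.run` when launching globus-gridftp-server, so both variables are in place before the process starts rather than being set from gridftp.conf.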

I've added that to the documentation (and then fiddled with the markup in the documentation to format logically nested lists as nested lists). Also added some clarification - hope you are happy with the changes.

I think it's now ready to merge - the iRODS 3.x code is really untouched - this pull request is mainly about the cmake configuration for iRODS 4 + updated documentation. (The only change to the C source code is to make it flexible for iRODS 3 vs. 4, but for 3 it should work out semantically identical. And it all tests fine for me.)

I look forward to hearing from you.

Cheers, Vlad

muccix commented 9 years ago

Hi Vlad,

great job! I'm going to test it next week and give you feedback.

Thanks a lot Roberto

muccix commented 9 years ago

Hi Vlad,

sorry to bother you again.

I'm testing the new version with iRODS 4.0.3. The DSI compiles and the GridFTP starts correctly but, when I try to use it, in the GridFTP logs I see:


[29494] Wed Apr 29 17:05:34 2015 :: Server configuration error. Couldn't load 'iRODS'. DSI activation failed. globus_extension_module: Couldn't dlopen libglobus_gridftp_server_iRODS_gcc64pthr.so in /usr/lib64 (or LD_LIBRARY_PATH): file not found

Wed Apr 29 17:05:34 2015 :: Child process 29494 ended with rc = 1


I have uncommented the line with IRODS_40_COMPAT in the setup.sh and I export LD_LIBRARY_PATH before starting the GridFTP, pointing to the folder where the libglobus_gridftp_server_iRODS.so is placed. Could it be due to unresolved dependencies?

How can I find out?

Thanks again Roberto

vladimir-mencl-eresearch commented 9 years ago

Hi Roberto,

My interpretation of this error message is that loading of libglobus_gridftp_server_iRODS.so failed - most likely due to unresolved dependencies. (Possibly could be also other issues like filesystem permissions - but let's focus on dependencies for now).

Try running

ldd /path/to/libglobus_gridftp_server_iRODS.so

with the same environment (primarily LD_LIBRARY_PATH) as when starting your GridFTP server.

If it's dependencies, you should see an error message about that (=> not found in ldd output). Maybe you'll have to add /usr/lib/irods to LD_LIBRARY_PATH ... let's see what that output shows.
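The same check can be scripted. The sketch below scans ldd output for entries marked "not found"; the function names `missing_dependencies` and `check_library` are hypothetical, and the script assumes `ldd` is available on the PATH.

```python
import subprocess
from typing import List, Optional


def missing_dependencies(ldd_output: str) -> List[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "libfoo.so.1 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_library(path: str, env: Optional[dict] = None) -> List[str]:
    """Run ldd on `path` and list its unresolved dependencies.

    Pass the same environment (primarily LD_LIBRARY_PATH) that the
    GridFTP server is started with, so the result matches what it sees.
    """
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, env=env, check=False
    )
    return missing_dependencies(result.stdout)
```

An empty list from `check_library("/path/to/libglobus_gridftp_server_iRODS.so")` would suggest the dependencies resolve in that environment; any names returned are the libraries whose directories still need to be added to LD_LIBRARY_PATH.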

Cheers, Vlad

PS: In order for iRODS 4 to work (succeed in loading plugins), you'll also need to preload this library with LD_PRELOAD into the GridFTP server... as per my additions to README.md

muccix commented 9 years ago

Hi Vlad,

everything is working fine now.

I just needed to add the path to the DSI library to the LD_LIBRARY_PATH (actually I had done it, but I probably didn't restart the GridFTP properly).

Thanks a lot, Roberto

vladimir-mencl-eresearch commented 9 years ago

Thanks for the confirmation, glad to hear it worked!