SWI-Prolog / swipl-devel

SWI-Prolog Main development repository
http://www.swi-prolog.org

Saved States: dependencies for shared libs #425

Open erlanger opened 5 years ago

erlanger commented 5 years ago

Truly stand alone

In order to have multi-arch, stand-alone saved states we need to save not only the shlib loaded by load_foreign_library but also its dependencies. So for example, archive4pl.so needs libarchive.so to be in the saved state also.

The solution I am planning goes like this:

  1. Extend the qsave:arch_shlib(Arch, FileSpec, SoPath) hook to allow SoPath to be a list of libs. These libs are then saved in the state with names like:
       shlib('x86-64-myarch', archive, archive4pl.so)
       shlib('x86-64-myarch', archive, libarchive.so)

  2. At runtime, it is easy to simply unify the first and second arguments of shlib and then load all libraries that match.
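The lookup in step 2 can be modelled outside Prolog as a filter on the first two fields. This is only a sketch in Python: the `Shlib` tuple, the `STATE` list and `libs_for` are hypothetical names for illustration, not part of SWI-Prolog.

```python
from collections import namedtuple

# Hypothetical model of the shlib(Arch, Name, File) entries stored in a saved state.
Shlib = namedtuple("Shlib", ["arch", "name", "file"])

STATE = [
    Shlib("x86-64-myarch", "archive", "archive4pl.so"),
    Shlib("x86-64-myarch", "archive", "libarchive.so"),
    Shlib("arm64-myarch", "archive", "archive4pl.so"),
]

def libs_for(state, arch, name):
    """Return every file whose arch and name match, in stored order."""
    return [entry.file for entry in state
            if entry.arch == arch and entry.name == name]
```

Nothing in this flat form marks which of the matching files is the main library, so a loader would have to rely on naming or ordering to tell them apart.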

I think this will work well and I have no doubts about it, but perhaps you will catch something I didn't think of.

erlanger commented 5 years ago

@JanWielemaker I guess you are okay with this?

erlanger commented 5 years ago

Update: the hook needs to be qsave:arch_shlib(Arch, FileSpec, SoPath, DepsPaths) since we need to differentiate which one is the main library (we need to call the C install function on it, but that is not needed on the dependencies).

JanWielemaker commented 5 years ago

I thought we were going for the exception/3 route. That means you just come to the conclusion that you failed to load e.g. foreign(archive4pl) and call the exception/3 handler to try and fix it. That handler can do whatever it wants, such as downloading various files and installing them, and then ask the shared library loader to retry. You must allow this retry to happen at most a few times to avoid an infinite loop.
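The bounded retry described here can be sketched as follows. This is a Python model of the control flow only, not the SWI-Prolog loader: `LoadError`, `load_with_retry`, the `"retry"` return value and the default of three retries are all assumptions made for illustration.

```python
class LoadError(Exception):
    """Stands in for 'the shared object could not be loaded'."""

def load_with_retry(load, on_failure, spec, max_retries=3):
    """Call load(spec); on LoadError let on_failure(spec) try to fix things.

    The handler is consulted at most max_retries times, so a handler that
    always answers "retry" cannot cause an infinite loop.
    """
    attempts = 0
    while True:
        try:
            return load(spec)
        except LoadError:
            # Give up when the retry budget is spent or the handler declines.
            if attempts >= max_retries or on_failure(spec) != "retry":
                raise
            attempts += 1
```

With `max_retries=3` the loader is tried at most four times: the initial attempt plus three retries.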

erlanger commented 5 years ago

Yes, I am using exception/3 to provide support for the times when the shared library is not found and we load it from the network or in some other special way.

The reason I did not want to use exception/3 for shlib dependencies is that a dependency is not really an exceptional condition. Almost always at least two shared libraries need to be loaded: 1) the Prolog interface library (defining the predicates in C), and 2) the actual C library that provides the functionality (the deps contain no C-defined Prolog predicates). This happens for almost every add-on that integrates an existing C library into SWI-Prolog.

It seems more like a hack to handle the usual case through an exception handler. That is why I thought it should be handled by SWI-Prolog itself and not by an exception hook. This would give us stand-alone states 'out of the box' in most use cases, without having to provide a hook that calls open_shared_object to load the dependencies. It has not been a problem so far because saved states have not been truly stand-alone: they have required the user to pre-install the dependencies on the system, where dlopen then finds them.

What do you think we should do? P.S.: There is another reason. If the user exception hook loads the shared library dependencies, who is going to unload them when unload_foreign_library is called?

JanWielemaker commented 5 years ago

I see. Note that for running libswipl.so you still need e.g., libgmp. For Android I guess the Java wrapper could install that if needed? Otherwise I'm a bit in doubt. You may need to do lots of things to resolve a .so file that cannot be loaded due to missing dependencies. The direction you are heading is trying to define what is needed (the dependencies). The alternative is to try and load e.g., archive4pl.so and if that fails call something (e.g. exception/3) to take care of that and retry. A sensible solution for this scenario is that LD_LIBRARY_PATH points at some location where dependencies are installed and the exception hook installs libarchive.so there and asks the loader to retry. Note that e.g., libarchive.so is loaded as a dependency of archive4pl.so and must be findable by dlopen() as a file. I doubt that preloading it from some other location helps.

(p.s. I don't care too much about unload_foreign_library. There are so many issues with that that in practice it is hard to use. I also assume that shared objects loaded as dependencies are taken care of by the OS).

erlanger commented 5 years ago

> I see. Note that for running libswipl.so you still need e.g., libgmp. For Android I guess the Java wrapper could install that if needed?

The very basic libraries (to make the smallest APK), such as libswipl.so and libgmp.so, are bundled directly in the APK file and then copied by Java to a usable place (which on Android is always private to the app). We don't want to do this with any other libraries, because the APK builder tries to leave as much as possible to Prolog (Google keeps changing things, and adding other libraries through Java would be a maintenance nightmare).

> The alternative is to try and load e.g., archive4pl.so and if that fails call something (e.g. exception/3) to take care of that and retry. A sensible solution for this scenario is that LD_LIBRARY_PATH points at some location where dependencies are installed and the exception hook installs libarchive.so there and asks the loader to retry. Note that e.g., libarchive.so is loaded as a dependency of archive4pl.so and must be findable by dlopen() as a file.

On Android, shared libraries (other than the public Android NDK libraries) are application-specific. You can't install, say, libarchive.so and then share it with every other app; it needs to be installed within the filespace of the app itself. To solve this I have two options (see a third option at the end, which is probably the best):

  1. The system does it
     1.1. qsave_program asks qsave:arch_shlib for the names of the deps and bundles them in the saved state.
     1.2. At runtime, load_foreign_library loads any deps from the saved state, and then loads the main library, calling its C install function.

  2. user:exception hook
     2.1. The APK builder bundles libarchive.so (i.e. all .so deps) in the Prolog saved state file.
     2.2. At runtime, user:exception puts libarchive.so in a usable place in the app filespace (with LD_LIBRARY_PATH pointing to it) and sets the action to retry.

> I doubt that preloading it from some other location helps.

Preloading from another location works, as long as the DT_SONAME is set to the proper name and global symbols are enabled when loading (we will most likely have to do this with user:exception or any other method).

The reason I was leaning towards 1 is that qsave_program already bundles shared libraries in the state, and shlib already loads them, so why repeat this code in the APK builder? Perhaps a midway solution would be the following:
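As an illustration of loading with global symbols enabled, here is a minimal Python sketch using `ctypes`, which wraps the platform loader (dlopen() on Linux). The helper name `preload` is hypothetical. Whether the dynamic linker then reuses the preloaded library when it resolves the main library's dependencies depends on the platform and on the DT_SONAME match described above; the sketch does not verify that.

```python
import ctypes

def preload(dep_paths, main_path):
    """Load dependencies with global symbol visibility, then the main library.

    Returns the list of ctypes handles, dependencies first.
    Raises OSError if any file cannot be loaded.
    """
    # RTLD_GLOBAL makes the symbols of each dependency available to
    # libraries that are loaded afterwards.
    handles = [ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL) for path in dep_paths]
    handles.append(ctypes.CDLL(main_path))
    return handles
```

A call would look like `preload(["/path/to/libarchive.so"], "/path/to/archive4pl.so")`, where both paths are placeholders for locations inside the app's private filespace.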

  3. Midway solution
     3.1. qsave_program bundles the .so dependencies in the saved state (as returned by the arch_shlib(Arch, FileSpec, MainFile, Deps) hook), and
     3.2. The user is responsible for placing the deps in the right place using user:exception, calling retry.

I am leaning towards this third option now, since the user:exception code will handle the network loading case in addition to the pre-bundled deps in the saved state.
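A rough model of step 3.2, written in Python to show the flow rather than the actual hook: the `bundled` mapping stands in for the dependencies stored in the saved state, `lib_dir` for a directory that LD_LIBRARY_PATH points to, and the `"retry"`/`"fail"` strings for the action the hook reports. All of these names are assumptions.

```python
import os

def make_dep_hook(bundled, lib_dir):
    """Build a failure hook that unpacks bundled dependencies into lib_dir.

    bundled: hypothetical mapping of file name -> bytes from the saved state.
    """
    def hook(spec):
        # spec names the library that failed to load; this sketch unpacks
        # every bundled dependency regardless of which one was asked for.
        os.makedirs(lib_dir, exist_ok=True)
        wrote = False
        for name, data in bundled.items():
            target = os.path.join(lib_dir, name)
            if not os.path.exists(target):
                with open(target, "wb") as out:
                    out.write(data)
                wrote = True
        # Retry only if something changed; otherwise a retry cannot help.
        return "retry" if wrote else "fail"
    return hook
```

The same hook shape could fetch missing files from the network instead of from `bundled`, which is the flexibility the third option is after.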

What do you think?

JanWielemaker commented 5 years ago

(3) was what I more or less had in mind as being the most flexible. Note that (3) doesn't require a bundled plugin.so, the hook may also download that. It seems you have all the pieces of the puzzle in your mind and you are getting used to how things work in SWI-Prolog. So, just try what you think is best and backtrack if it proves wrong after all. That is also how I solve most of these problems ...

Do you plan on some standard security measures for verifying the downloaded .so, such as an SHA key?

erlanger commented 5 years ago

Great, I'll get to work on it.

As for the download verification (SHA, etc), surely we will have something like this. I didn't get there yet since it is very easy to add once I get the other pieces working.
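For reference, a digest check of the kind mentioned could look like the Python sketch below. SHA-256 is an assumption (the thread only says "SHA"), and `verify_sha256` is a hypothetical helper name; the expected digest would have to come from a trusted place, for example the saved state itself.

```python
import hashlib
import hmac

def verify_sha256(data, expected_hex):
    """Return True if the SHA-256 digest of data equals expected_hex."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_hex.lower())
```

The downloaded file should only be written to the library directory and loaded after this check succeeds.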

> (3) doesn't require a bundled plugin.so, the hook may also download that.

Right, this is the best, most flexible solution.

erlanger commented 5 years ago

BTW, I presume multifile predicates are tried in the order in which the files were loaded? I don't need this, but I would like solid knowledge of the execution model for multifile preds.

JanWielemaker commented 5 years ago

Yes, but it is in general poor design to rely on the order. Reloading files during development may swap the order. If order matters, consider a priority or dependency.
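The priority idea can be illustrated with a small model, here in Python rather than as multifile clauses: each hook carries an explicit priority, so the order of registration (the analogue of file load order) no longer matters. The tuple layout and `run_hooks` are hypothetical.

```python
def run_hooks(hooks, *args):
    """Try hooks in ascending priority order; return the first non-None result.

    hooks: iterable of (priority, name, function) tuples. Ties on priority
    are broken by name, so the outcome never depends on registration order.
    """
    for _priority, _name, function in sorted(hooks, key=lambda h: (h[0], h[1])):
        result = function(*args)
        if result is not None:
            return result
    return None
```

Registering the same hooks in a different order gives the same answer, which is the property that reloading files during development would otherwise break.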