IBMStreams / administration

Umbrella project for the IBMStreams organization. This project will be used for the management of the individual projects within the IBMStreams organization.

Proposal to solve problem with big toolkits causing all toolkit libs (needed or not) to become part of an sab. #101

Closed hleuschner closed 7 years ago

hleuschner commented 7 years ago

@chanskw, @mikespicer From Samantha: A big toolkit with all its dependencies makes the application bundle really big when we build the application. In some cases, customers use only a small subset of the toolkit's functions, but we are forced to carry all the dependencies and their jar files in the bundle. This will become more problematic in the Bluemix environment as we try to submit these bundles.

From discussion with Jörg (@joergboe), Michael (@m-kotowski) and Mark (@markheger): There might be a better solution than creating smaller, more specific toolkits instead of big ones. Streams should leverage the known dependencies and bundle only the libs that the used toolkit operators/functions really need, not always all the libs that come with a toolkit.

m-kotowski commented 7 years ago

The issue probably also applies to operators that depend on dynamically linked libraries (.so) that are part of the toolkit.

As described here, you can specify per toolkit which toolkit files and folders are included in the sab file (<sabFiles>), in addition to the files and folders that are included by default.

At the operator level, you can also specify dependencies, for example using the @Libraries Java annotation or the <libraryDependencies> XML element in the operator XML. To minimize the size of sab files, the sab bundler could use these operator-specified dependencies. However, this can only be implemented in the IBM Streams product itself.
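The idea of bundling only operator-specified dependencies can be sketched as follows. This is only an illustration of the set computation, not how the sab bundler works: the function `libs_to_bundle`, the operator names, and the library paths are all hypothetical, and the mapping stands in for whatever the bundler would read from @Libraries or <libraryDependencies>.

```python
from typing import Dict, Iterable, Set


def libs_to_bundle(used_operators: Iterable[str],
                   declared_deps: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of libraries declared by the operators an application uses.

    Operators without declared dependencies contribute nothing.
    """
    needed: Set[str] = set()
    for operator in used_operators:
        needed |= declared_deps.get(operator, set())
    return needed


# Hypothetical toolkit: operator name -> libraries it declares (assumed data).
declared_deps = {
    "KafkaConsumer": {"opt/downloaded/kafka-clients.jar", "lib/common.jar"},
    "HDFSFileSink": {"opt/downloaded/hadoop-client.jar", "lib/common.jar"},
    "XMLParse": {"impl/lib/libxmlparse.so"},
}

# An application that uses only one operator needs only that operator's libs.
bundle = libs_to_bundle(["KafkaConsumer"], declared_deps)
all_libs = set().union(*declared_deps.values())
```

In this made-up toolkit, `bundle` holds two of the four libraries in `all_libs`; the Hadoop jar and the .so file would stay out of the sab.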

This issue raises another question: Should a toolkit contain the libraries it depends on (.jar or .so), or should it only specify the dependencies? See the DPS toolkit's issue 16 for a case where having dependent libraries in a toolkit causes problems.

From my point of view, if a dependency exists and the library belongs to a package that can be downloaded and installed separately, I would prefer to go that way instead of having a bundled copy in the toolkit. This approach requires more admin actions before you can use a toolkit, but it also seems to reduce or remove the size and clash problems.

mikespicer commented 7 years ago

FYI, we are also looking at being able to register toolkits with a domain or instance. Many details to be worked out, but the basic idea is that we would allow users to specify which dependencies should be included in the bundle and which should be resolved at submission time (with a failure if they cannot be resolved).
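As a rough sketch of the "resolve at submission time, fail if unresolved" behaviour described above (the details are still open, so everything here is an assumption): `resolve_at_submission`, `UnresolvedDependencyError`, and the registry contents are hypothetical names, not part of any Streams API.

```python
from typing import Dict, Iterable


class UnresolvedDependencyError(RuntimeError):
    """Raised when a deferred dependency is not registered (hypothetical)."""


def resolve_at_submission(deferred: Iterable[str],
                          registered: Dict[str, str]) -> Dict[str, str]:
    """Map each deferred dependency to its registered location.

    Fails the submission if any dependency cannot be resolved.
    """
    deferred = list(deferred)
    missing = [name for name in deferred if name not in registered]
    if missing:
        raise UnresolvedDependencyError(
            "cannot resolve: " + ", ".join(sorted(missing)))
    return {name: registered[name] for name in deferred}


# Hypothetical toolkit libraries registered with a domain or instance.
registered = {"kafka-clients.jar": "/opt/registered/kafka-clients.jar"}

# A dependency left out of the bundle is looked up when the job is submitted.
resolved = resolve_at_submission(["kafka-clients.jar"], registered)

# A dependency that is neither bundled nor registered fails the submission.
try:
    resolve_at_submission(["hadoop-client.jar"], registered)
    submission_failed = False
except UnresolvedDependencyError:
    submission_failed = True
```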

chanskw commented 7 years ago

I am going to close this, as I do not think it can be resolved in the GitHub IBMStreams organization.