cloudfoundry / buildpack-packager

Buildpack Packager
Apache License 2.0
23 stars 31 forks source link

Allow 'symlinks' array in order to create link to deps #1

Closed Soulou closed 9 years ago

Soulou commented 10 years ago

Use case:

Heroku is using the environment variable "STACK" in its buildpacks

(i.e. https://github.com/cloudfoundry/ruby-buildpack/blob/master/lib/language_pack/ruby.rb#L43-L45)

But quite a lot of files are common to different stacks and instead of downloading a file several times, it could be interesting to do it once, then create links.

Example:

symlinks = (
  'https___s3-external-1.amazonaws.com_heroku-buildpack-ruby_ruby-2.1.0.tgz:https___s3-external-1.amazonaws.com_heroku-buildpack-ruby_stackname_ruby-2.1.0.tgz'
  'https___s3-external-1.amazonaws.com_heroku-buildpack-ruby_ruby-2.1.1.tgz:https___s3-external-1.amazonaws.com_heroku-buildpack-ruby_stackname_ruby-2.1.1.tgz'
  'https___s3-external-1.amazonaws.com_heroku-buildpack-ruby_ruby-2.1.2.tgz:https___s3-external-1.amazonaws.com_heroku-buildpack-ruby_stackname_ruby-2.1.2.tgz'
)
cfdreddbot commented 10 years ago

Hey Soulou!

Thanks for submitting this pull request!

All pull request authors must have a Contributor License Agreement (CLA) on-file with us. Please sign the appropriate CLA (individual or corporate).

When sending signed CLA please provide your github username in case of individual CLA or the list of github usernames that can make pull requests on behalf of your organization.

cf-gitbot commented 10 years ago

We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: https://www.pivotaltracker.com/story/show/80193260.

jchesterpivotal commented 9 years ago

Hi @Soulou, I'm not sure if this would work. Heroku are starting to build runtimes against different base OS images, and using the STACK variable to build URLs that distinguish between them.

When the STACK isn't interpolated into a URL, the assumed stack is "cedar". Until now the ruby-buildpack bin/package has used the non-interpolated URLs.

As of ruby 2.1.3, it looks as though non-interpolated URLs are no longer being added -- we must either say "cedar" or "cedar-14", depending on the base image, for versions at 2.1.3 and above. For now we are shipping the "cedar" binaries, because that most closely matches the Cloud Foundry base images. If you look at the bin/package in 1.1.2, you can see how 2.1.3 includes "cedar" in the path.

Soulou commented 9 years ago

Actually, I'm building something which uses the stacks but which are not 'cedar' or 'cedar-14'.

So I want to be able to choose the stack name I wish and using heroku pre-build binaries, as a result symlinking works fine.

I've already deployed this PR and it works as intended. There may be more beautiful ways to achieve this operation, I'd be glad to hear them =)

jchesterpivotal commented 9 years ago

Hi @Soulou, thanks for your patience so far.

We've spent some time today going over this PR again and we have some questions / remarks.

First, can you expand on your usecase some more? You say you want to choose the stack name. Are you setting your STACK environment variable to 'cedar', 'cedar-14' or some other value? As you may know, Heroku are progressively introducing changes related to that variable and we are periodically introducing logic to correctly adjust our environment to allow Heroku's code to operate without modification.

Second, we believe your approach might break as of Ruby 2.1.3. If you look in bin/package, Heroku now force requests for a ruby binary to include either /cedar/ or /cedar-14/ in the URL, according to which version you're using. So on-disk, the ruby 2.1.3 binary already includes cedar in the position where you'd like to symlink stackname. This introduced a regression to the Ruby buildpack 1.1.2 which we fixed in release 1.1.3. Your example goes up to 2.1.2, which still allowed "un-STACKed" URLs.

Thirdly, the buildpack-packager code is shared amongst multiple buildpacks. Heroku's various buildpacks interpret the STACK variable in different ways. So far the Ruby buildpack has had the most changes, and the most intensive changes, that we've so far observed and patched for. However all the other buildpacks we are downstream from increasingly modify their behavior according to STACK. I'm reluctant to introduce a new dimension into the mix of interacting code that will require additional fitting to work with the different buildpacks.

Finally, I don't think we need this functionality ourselves. Our current approach is that each buildpack is sufficiently different that the modifications are specific per-buildpack.

If you can tell us more about your situation, we might be able to understand whether we should change our approach.

Cheers,

JC.

Soulou commented 9 years ago

Hello @jchesterpivotal

Thank you for your answer, I'm glad you took time to do it.

I'm completely aware of the changes heroku is doing in their buildpacks logics by using the STACK environment variable, and actually we want to use it to our advantage (we are a PaaS based on buildpacks too)

Our stack is not the same as heroku, Heroku's vendored binaries are not always compatible with our software stack and we may need to build our own package. Furthermore, we don't want to be stucked to Heroku's stack names 'cedar' and 'cedar-14' for branding reason, they will evolve in a different rythm and with different features.

As a result this feature allow us to be a bit more "independant". We want to use Heroku's binaries, but we may have to mix them with our own according to our software stack.

You can see I've forked and updated the buildpack accordingly to my needs:

They may be too specific to be included in your mainstream buildpack, but it solves the problem we have.

Regards,

Léo.

jchesterpivotal commented 9 years ago

I understand where you're coming from. For now we're concerned that this introduces another point of difference where our code can trigger unwanted codepaths in the Heroku BP code.

The tricky part of wrapping Heroku, as you'd know, is that their code can safely make two assumptions:

  1. It's always connected to the internet, and
  2. It's running against Heroku's environment.

We aim to wrap their code so that it can behave as if these things are still true.

Because we have no insight into Heroku's integration / acceptance testing mechanisms, we are frequently reduced to "reading the tea leaves" to deduce what their intention was and how we can ensure their code will run unmodified in our environment. This causes a lot of pain. One of the ways we mitigate risk is to make the minimal possible intervention for any given problem.

The symlinking approach is a more aggressive change than we have an appetite for right now. (It bugs me as a Pivot to be afraid of aggressive changes).

So I'm going to reluctantly reject this PR. If we turn out to be wrong, I'll be happy to revisit it in future.

Cheers,

JC.