HashNuke / heroku-buildpack-elixir

Heroku Buildpack for Elixir with nitro boost
MIT License
810 stars 304 forks source link

Buildpack incompatible with building in /app #194

Closed edmorley closed 3 years ago

edmorley commented 3 years ago

Hi

I'm on the team that maintains Heroku's build system and official buildpacks, and wanted to let you know about a future incompatibility with this buildpack.

The directory in which the Heroku build system performs builds is currently a path like/tmp/build_<hash>.

In the near future this path will be changing to /app so that the build-time and run-time app locations are the same path - in order to resolve a number of long standing bugs and reduce the number of hacks buildpacks have to use to work around non-relocatable languages/toolchains. As part of this change, HOME will also be changing from /app to a PATH under /tmp (otherwise temporary files/secrets saved to HOME would end up in the slug).

Trying out the above change with this buildpack (using this sample app and heroku labs:enable build-in-app-dir) revealed that builds with this buildpack will break - or more accurately, output an error to the build log, however the build be incorrectly marked as successful and instead the app give a boot time error.

Build time error:

remote: -----> Building on the Heroku-20 stack
remote: -----> Performing build in /app since 'build-in-app-dir' is enabled
remote: -----> Elixir app detected
remote: -----> Will export the following config vars:
remote:        * MIX_ENV=prod
remote: -----> Checking Erlang and Elixir versions
remote:        WARNING: elixir_buildpack.config wasn't found in the app
remote:        Using default config from Elixir buildpack
remote:
remote:        IMPORTANT: The default elixir_version will be removed on 2021-06-01. Please explicitly set an elixir_version in your elixir_buildpack.config before then or your deploys will fail.
remote:
remote:        Will use the following versions:
remote:        * Stack heroku-20
remote:        * Erlang 20.1
remote:        * Elixir v1.5.3
remote: -----> Using cached Erlang 20.1
remote: -----> Installing Erlang 20.1
remote:
remote: cp: will not create hard link '/app/.platform_tools/erlang/erlang/erlang' to directory '/app/.platform_tools/erlang/erlang'
remote: -----> Using cached Elixir v1.5.3
...

Then at boot:

2021-03-14T18:56:25.000000+00:00 app[api]: Build succeeded
2021-03-14T18:56:26.039686+00:00 app[web.1]: /app/.platform_tools/elixir/bin/elixir: 127: exec: erl: not found
2021-03-14T18:56:26.117294+00:00 heroku[web.1]: Process exited with status 127
2021-03-14T18:56:26.191815+00:00 heroku[web.1]: State changed from starting to crashed

It appears there are two issues:

  1. Errors during the build don't cause the build to fail (not using Bash exit on error)
  2. The file removals, hardlink and copy here need to take into account that the BUILD_DIR might be /app: https://github.com/HashNuke/heroku-buildpack-elixir/blob/b9092d471ebb0163dbf1bf733b719cf1e31c8f64/lib/erlang_funcs.sh#L29-L38

We're hoping to enable the new build behaviour very soon - and whilst there will be an opt-out labs, it would be great to get this resolved before then to prevent Erlang apps from failing to boot after the first build, for users that miss the Heroku changelog entry etc.

Many thanks :-)

edmorley commented 3 years ago

It appears there are two issues:

1. Errors during the build don't cause the build to fail (not using Bash exit on error)

I've split this part out to #195.

edmorley commented 3 years ago

Annotating the commands in install_erlang() with the actual paths, gives:

  ln -s $(erlang_build_path) $(runtime_erlang_path)
  # --> `ln -s /tmp/tmp.123/erlang /app/.platform_tools/erlang`
  $(erlang_build_path)/Install -minimal $(runtime_erlang_path)
  # --> `/tmp/tmp.123/erlang/Install -minimal /app/.platform_tools/erlang`

  cp -R $(erlang_build_path) $(erlang_path)
  # --> `cp -R /tmp/tmp.123/erlang $BUILD_DIR/.platform_tools/erlang`

This is a problem when $BUILD_DIR is /app, since the cp -R tries to overwrite the symlink with the copied files.

It seems that this could be solved by either:

  1. Removing the symlink prior to the cp -R iff erlang_path == runtime_erlang_path
  2. Skipping the whole copy dance (and installing directly into /app) iff erlang_path == runtime_erlang_path
edmorley commented 3 years ago

@HashNuke @KazW Hi! Sorry to directly @ mention, but I'm keen to make sure that this buildpack does not break when we enable this new build behaviour on Heorku. As-is the app will crash at boot once the app has its next build after the change. We can temporarily exclude apps using this buildpack from the change, but long term this will start to cause issues, since other buildpacks will start to depend on the new behaviour.

KazW commented 3 years ago

@edmorley No problem, thanks for letting me know! Probably best to ping @jesseshieh as well.

As luck would have it, I was planning to rewrite my fork of this buildpack (elixir-buildpack/heroku-buildpack) in Ruby soon since I've recently had more free time open up for me.

I'll start the rewrite this weekend and take this into consideration when implementing it.

edmorley commented 3 years ago

@KazW Grerat - thank you :-)

@jesseshieh @HashNuke - Do you know when you might be able to look at this?

HashNuke commented 3 years ago

I'll take a look sometime next week. Just getting back to opensource. @edmorley Thank you very much for posting details.

travismorrison commented 3 years ago

Hi

we use this buildpack. Now Heroku CI is unexpectedly failing to build elixir, with this error:

cp: cannot copy a directory, '/tmp/tmp.7zm6qLyqTL/erlang', into itself, '/app/.platform_tools/erlang/erlang'

Our last successful build was Friday.

Do you suppose this issue, and the resulting PR, speak to what we are seeing?

Glad to provide more information. I thought I would go light in my first inquiry, since it seemed very adjacent. Also glad to wait until the PR is merged and give it a whirl then, if you think it will be merged soon.

Thanks for all your work on this.

HashNuke commented 3 years ago

@fivetanley Thank you for sending the PR. I've merged it. @travismorrison Please do try out the latest buildpack from master branch and let us know. I'll publish a new version on Heroku once you confirm.

HashNuke commented 3 years ago

@edmorley Unfortunately didn't find the time to do this myself. But I've merged @fivetanley's PR. I see your comments on it too. The PR was LGTM please do let me know if they cause issues and I'll revert.

edmorley commented 3 years ago

Please could this be reopened now that #200 has been reverted? :-)

HashNuke commented 3 years ago

Reopening issue.

travismorrison commented 3 years ago

@HashNuke did not seem to resolve the issue. We went on to lock our CI and deployment at this commit and are able to move forward with CI and deploy.

KazW commented 3 years ago

@travismorrison If you'd like to test something that's experimental I have the rewrite that I mentioned earlier in this issue: https://github.com/KazW/heroku-buildpack/tree/ruby_rewrite

It works with a few apps I have, but will break if you used the phoenix asset buildpack; I haven't had a chance to fully test it out using hatchet yet, so it may have unexpected bugs.

If you need asset support with the phoenix buildpack, this branch should work: https://github.com/KazW/heroku-buildpack/tree/ruby_rewrite_compatability