buildpacks / spec

Specification for Cloud Native Buildpacks
https://buildpacks.io
Apache License 2.0
250 stars 69 forks source link

Add a way for Buildpacks to enable verbose output #400

Open jjbustamante opened 3 months ago

jjbustamante commented 3 months ago

Discussion originated in the following thread

I think right now each Buildpack Author is responsible for exposing some kind of environment variable to enable a debug output when the buildpack is executed. Platforms like pack has a --verbose flag to enable some debug logging internally, but it will be valuable to define how to passthrough this configuration and buildpacks can read it and modify its behavior accordingly without having to expose extra variables.

dmikusa commented 3 months ago

Thanks for raising this up. This is what was done with libcnb which you mentioned.

https://github.com/buildpacks/libcnb/pull/69

It would be nice if there's something the platform could set to pass down and libcnb could include that in its checks for enabling verbose output.

schneems commented 2 months ago

I've got a "yes and" for this issue.

My experience with debugging builds on Heroku is that usually you only know you need the verbose output AFTER the build has happened. For some applications this can be a 20-30 minute debug cycle, just to enable one flag. Even for reasonably "fast" applications build times for a Ruby on Rails app is in the 2~5 minute range which is still a long time to wait for debug information.

With that in mind, my ideal would be to allow the buildpack to output both a "regular" and "debug" log at the same time in such a way that allows the debug log to be visible after the build (or at least accessible if the build fails).

The theoretical workflow could look like this:

Lifecycle could implement this "debug logs later" feature by exposing a file on disk that buildpacks can write to. The open question would be: Where would we store that information afterward so it can be retrieved? But before jumping into implementation...

What do you think @dmikusa @jjbustamante? Would such a feature would be valuable to CNB consumers? Would you implement support for it as a buildpack author if it existed? Any questions about the problem space or my description?

dmikusa commented 2 months ago

What do you think @dmikusa?

👍

Would such a feature would be valuable to CNB consumers?

Yes, I think @samj1912 and I had talked/dreamed about something like this before.

Would you implement support for it as a buildpack author if it existed?

100%

Any questions about the problem space or my description?

No, I think you covered it well.

Having said all that, even with the feature you're proposing, I'm not sure I would always 100% of the time log everything we log at the debug level now for Pakteo buildpacks. There are some things that might be a performance hit to gather the info and log. For that reason, I think it would still be nice to know if there is a verbose or debug mode enabled so that the buildpack can know how much output to write.

I'd say that's probably a small number of things, probably 90% of the stuff we log at debug we could just log all the time if it were hidden by default as you describe. The other 10%, idk, maybe. We do some things like walking a directory to log the contents of what's in the buildpack and some other folders. It's helpful for debugging, but walking a directory isn't free. I worry that might slow things down for people, especially if you have lots of files in the app. I could also be overthinking this, and maybe it won't make a noticeable impact. I like this idea though and I'd be willing to try and see how it goes. Worst case, we keep using the env flags that we use now for debug logging.

schneems commented 2 months ago

Thanks for the fast feedback!

We do some things like walking a directory to log the contents of what's in the buildpack and some other folders

Excellent point. We do (some) stuff like this as well, but usually only triggered on a failure so that it's not in the hot path most of the time.

I could also be overthinking this, and maybe it won't make a noticeable impact.

Possibly. The last time I did intensive profiling the biggest performance costs were around shelling out and forking (for the ruby classic buildpack https://github.com/heroku/heroku-buildpack-ruby) while disk access isn't free, as you mention, it was cheaper than I would have thought. For each Ruby build, we traverse all generated assets in the cache to grab their file size and evict the oldest once they're over a given threshold https://github.com/heroku/heroku-buildpack-ruby/blob/7866d1efbdc0e4d5f66caff2cdd7f83f76d4f8e0/lib/language_pack/rails4.rb#L99. It's not really perceptible during deploys. Of course it's also dependent on hardware, what else the buildpack is doing, and the expectations of developers in different communities. Rubyists tend to be tolerant of a slower solution that is "a batteries included" experience.

What I'm hearing, though, is that my proposal wouldn't supersede the desire to have conditional debugging logic (if my solution existed, it would be an addition, not a replacement for logging level). I think that 100% makes sense to start.

dmikusa commented 2 months ago

What I'm hearing, though, is that my proposal wouldn't supersede the desire to have conditional debugging logic (if my solution existed, it would be an addition, not a replacement for logging level). I think that 100% makes sense to start.

Yes, I think that's how I'd be leaning at the moment. I do like what you're proposing.