paketo-buildpacks / libpak

An opinionated extension to the libcnb Cloud Native Buildpack Library
Apache License 2.0
15 stars 17 forks source link

adds logic to remove http query params from destination file name #273

Closed semmet95 closed 12 months ago

semmet95 commented 1 year ago

Summary

This PR adds changes to ensure that downloaded http type dependency artifacts are given a filename that does not include the query parameters that might be present in the corresponding URI.

Use Cases

fixes #274

Checklist

linux-foundation-easycla[bot] commented 1 year ago

CLA Signed

The committers listed above are authorized under a signed CLA.

dmikusa commented 1 year ago

Hi, I took a look. A couple of things on this:

  1. It needs test coverage. There might be existing tests covering HTTP and file downloads, but we need to make sure there's one also with query params and confirm that the expected file name is produced.

  2. Thinking through test scenarios, I also worry about the case where you have a URL like https://www.example.com/download.php?give-me=something. If we throw out the query params, you could have collisions with downloads and that wouldn't be good. It might make more sense to just sha256 hash the full URL and use that for the file name. I don't like that it creates meaningless file names, but it will ensure a unique file name for the download URL. Is this a change you'd be willing to make?

semmet95 commented 1 year ago

Thanks for the suggestions @dmikusa I'll look into them and update the PR accordingly

semmet95 commented 1 year ago

I checked just to be sure that the apache-tomcat buildpack doesn't need the file name to have the proper extension and even with the sha256 as its name the buildpack managed to download and extract the tomcat configuration tar file. We are working on the suggested changes now.

semmet95 commented 1 year ago

@dmikusa pushed the changes. Please let me know if any improvements can be made.

pranavek commented 1 year ago

@dmikusa The changes are implemented as per your suggestions. Could you take a look at it?

NB: Some Github workflows are in Waiting for status state.

anthonydahanne commented 1 year ago

I've allowed the action to run and I rebased on main.

Looks good to me, @dmikusa WDYT?

dmikusa commented 12 months ago

Thanks for this PR & sticking with it. If you're interested, we'll need to port this to the release-2.x branch as well.

c0d1ngm0nk3y commented 11 months ago

Some feedback: This change was breaking. I wonder why this was brought into v1. Just admitting it in the release notes is not the best user experience. At least I do not read the release notes carefully for each minor bump.

dmikusa commented 11 months ago

@c0d1ngm0nk3y Apologizes. We try to screen out any breaking changes, but this one slipped through. We wanted to get this into 1.x because it impacted users. In hindsight, we'd have held it for 2.x.

c0d1ngm0nk3y commented 11 months ago

@dmikusa But doesn't make it clear that some integration test was missing. I mean it was not the case that one very specific edge case was broken, but offline buildpacks, didn't work in general, right? It feels to me that this should be tested somehow.

What wonders me more is not the fact that this was overseen - mistakes happen, that normal imho. But rather the reaction. Knowing that this is breaking but still keep the code and not reverting the change was strange to me.

My POV: I discovered that the integration test for one of our use cases broke. After some debugging, I found out that it is nothing we changed, but that offline buildpacks do not work in general. When checking the release notes of create-package, I saw that this problem is known and there there is even a notice buried in the release notes of a minor patch. The experience was not the best and it felt a bit unnecessary to me.

dmikusa commented 11 months ago

But doesn't make it clear that some integration test was missing.

100%. We have stories to add integration testing to the composite buildpacks, but the work is incomplete. We'd be happy to work with you on that, if it's something you'd like to contribute.

What wonders me more is not the fact that this was overseen - mistakes happen, that normal imho. But rather the reaction. Knowing that this is breaking but still keep the code and not reverting the change was strange to me.

What you're seeing is absolute transparency and us working through this problem in real-time. We could have taken the bug report, sat on this for a week or so while we investigated the full extent of the problem, made a decision, and then finally communicated that to the community. My approach for the community is to communicate early and often. There are some drawbacks to that because we're working through the issue in real-time, but I think the benefit of being transparent with everything outweighs the approach of keeping things private until they've been fully sorted out.

That said. here's the latest update. We're reevaluating this. Another bug case has popped up. Previously, the decision was made to try and keep moving forward because we wanted to be able to retain the original fix that #273 provided. Additional failure scenarios have popped up. We're presently discussing those, the potential impact on the user base, and the work required to resolve them. We'll keep folks updated when a course of action has been decided upon.

The recommendation is still to continue using the older version of create-package and libpak. This should avoid issues because both are using the pre-#273 code.

pivotal-david-osullivan commented 11 months ago

Having investigated the bug raised in #288 as well as further failures seen, we've decided to revert the change in this PR to minimise impact and avoid introducing other issues by attempting to patch the new logic in the short term.

Hopefully the problem scenario faced in #274 can be resolved in v2 of Libpak (#287)