chainguard-dev / melange

build APKs from source code
Apache License 2.0

Add flag to preserve original PyPI URIs #1281

Closed: egibs closed this pull request 3 days ago

egibs commented 2 weeks ago

Relates to: https://github.com/chainguard-dev/melange/pull/761

I was using the convert functionality today and saw several 404s for PyPI packages and versions that do exist, e.g.:

❯ go run . convert python typing-extensions --python-version 3.11
2024/06/13 17:23:53 INFO generating convert config files for python package typing-extensions version: 3.11 on python version:
2024/06/13 17:23:53 INFO [typing-extensions] Generating manifests
2024/06/13 17:23:53 INFO [typing-extensions] Retrieving Package information from https://pypi.org
2024/06/13 17:23:53 INFO [typing-extensions] Check Dependency list: [typing-extensions]
2024/06/13 17:23:53 INFO [typing-extensions] Fetch Package Data
2024/06/13 17:23:54 INFO [typing-extensions] typing-extensions Add to generate list
2024/06/13 17:23:54 INFO [typing-extensions] Check for dependencies
2024/06/13 17:23:54 INFO [typing-extensions] Searching source for dependencies
2024/06/13 17:23:54 INFO [typing-extensions] 0 Number of deps
2024/06/13 17:23:54 INFO [typing-extensions] Generating 1 files
2024/06/13 17:23:54 INFO [typing-extensions] Index typing-extensions Package typing-extensions
2024/06/13 17:23:54 INFO [typing-extensions] Create manifest
2024/06/13 17:23:54 INFO Trying to get commit data for typing-extensions
2024/06/13 17:23:54 INFO [typing-extensions] Generate Package
2024/06/13 17:23:54 INFO [typing-extensions] Run time Deps []
2024/06/13 17:23:54 INFO [typing-extensions] Generate Environment
2024/06/13 17:23:54 INFO [typing-extensions] Generate Pipeline for version 4.12.2
2024/06/13 17:23:54 INFO Getting artifact https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz
2024/06/13 17:23:54 INFO client.Do("https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz")
2024/06/13 17:23:54 INFO [typing-extensions] SHA256 Generation FAILED. 404 when getting https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz
2024/06/13 17:23:54 INFO [typing-extensions]  Or try 'curl ' to check out the API
2024/06/13 17:23:54 INFO [typing-extensions] FAILED TO CREATE MANIFEST artifact 256SHA FAILED GENERATION. Investigate by going to https://pypi.org/project/typing-extensions/ did not match Package data SHA256 1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8
2024/06/13 17:23:54 INFO error during command execution: artifact 256SHA FAILED GENERATION. Investigate by going to https://pypi.org/project/typing-extensions/ did not match Package data SHA256 1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8
exit status 1

That is, https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz does not exist:

❯ curl -w "%{http_code}" https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz
404

Removing the URI conversion at https://github.com/chainguard-dev/melange/blob/270e7a70f449483b8fc30648cfd239a3828cb5be/pkg/convert/python/python.go#L403-L408 allowed the conversion to work:

❯ go run . convert python typing-extensions --python-version 3.11
2024/06/13 17:26:37 INFO generating convert config files for python package typing-extensions version: 3.11 on python version:
2024/06/13 17:26:37 INFO [typing-extensions] Generating manifests
2024/06/13 17:26:37 INFO [typing-extensions] Retrieving Package information from https://pypi.org
2024/06/13 17:26:37 INFO [typing-extensions] Check Dependency list: [typing-extensions]
2024/06/13 17:26:37 INFO [typing-extensions] Fetch Package Data
2024/06/13 17:26:38 INFO [typing-extensions] typing-extensions Add to generate list
2024/06/13 17:26:38 INFO [typing-extensions] Check for dependencies
2024/06/13 17:26:38 INFO [typing-extensions] Searching source for dependencies
2024/06/13 17:26:38 INFO [typing-extensions] 0 Number of deps
2024/06/13 17:26:38 INFO [typing-extensions] Generating 1 files
2024/06/13 17:26:38 INFO [typing-extensions] Index typing-extensions Package typing-extensions
2024/06/13 17:26:38 INFO [typing-extensions] Create manifest
2024/06/13 17:26:38 INFO Trying to get commit data for typing-extensions
2024/06/13 17:26:38 INFO [typing-extensions] Generate Package
2024/06/13 17:26:38 INFO [typing-extensions] Run time Deps []
2024/06/13 17:26:38 INFO [typing-extensions] Generate Environment
2024/06/13 17:26:38 INFO [typing-extensions] Generate Pipeline for version 4.12.2
2024/06/13 17:26:38 INFO Getting artifact https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz
2024/06/13 17:26:38 INFO client.Do("https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz")
2024/06/13 17:26:38 INFO Generated melange config: py3.11-typing-extensions.yaml
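
The difference between the two URLs in the logs above can be sketched in Go. This is only an illustration based on the log output, not the code in python.go: canonicalSourceURL and sameFileName are hypothetical helpers, and the URL layout is inferred from the failing request. The point it shows is that the rewritten URL uses the hyphenated project name, while the file PyPI actually serves is named with an underscore.

```go
package main

import (
	"fmt"
	"path"
)

// canonicalSourceURL is a hypothetical helper. It builds a URL of the form
// seen in the failing log line:
//   /packages/source/<first letter>/<name>/<name>-<version>.tar.gz
func canonicalSourceURL(name, version string) string {
	if name == "" {
		return ""
	}
	return fmt.Sprintf(
		"https://files.pythonhosted.org/packages/source/%c/%s/%s-%s.tar.gz",
		name[0], name, name, version,
	)
}

// sameFileName reports whether two URLs end in the same file name.
func sameFileName(a, b string) bool {
	return path.Base(a) == path.Base(b)
}

func main() {
	// URL from the successful run in the logs.
	original := "https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz"
	rewritten := canonicalSourceURL("typing-extensions", "4.12.2")

	fmt.Println(rewritten)
	// The file names differ: typing_extensions vs typing-extensions.
	fmt.Println(sameFileName(original, rewritten)) // prints "false"
}
```

Whether the name mismatch is the only reason for the 404 is an assumption; the logs only show that the rewritten URL is not served.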

This PR adds a --preserve-base-uri flag that skips the URI conversion for problematic packages:

❯ go run . convert python typing-extensions --python-version 3.11 --preserve-base-uri
2024/06/13 17:39:56 INFO generating convert config files for python package typing-extensions version: 3.11 on python version:
2024/06/13 17:39:56 INFO [typing-extensions] Generating manifests
2024/06/13 17:39:56 INFO [typing-extensions] Retrieving Package information from https://pypi.org
2024/06/13 17:39:56 INFO [typing-extensions] Check Dependency list: [typing-extensions]
2024/06/13 17:39:56 INFO [typing-extensions] Fetch Package Data
2024/06/13 17:39:57 INFO [typing-extensions] typing-extensions Add to generate list
2024/06/13 17:39:57 INFO [typing-extensions] Check for dependencies
2024/06/13 17:39:57 INFO [typing-extensions] Searching source for dependencies
2024/06/13 17:39:57 INFO [typing-extensions] 0 Number of deps
2024/06/13 17:39:57 INFO [typing-extensions] Generating 1 files
2024/06/13 17:39:57 INFO [typing-extensions] Index typing-extensions Package typing-extensions
2024/06/13 17:39:57 INFO [typing-extensions] Create manifest
2024/06/13 17:39:57 INFO Trying to get commit data for typing-extensions
2024/06/13 17:39:57 INFO [typing-extensions] Generate Package
2024/06/13 17:39:57 INFO [typing-extensions] Run time Deps []
2024/06/13 17:39:57 INFO [typing-extensions] Generate Environment
2024/06/13 17:39:57 INFO [typing-extensions] Generate Pipeline for version 4.12.2
2024/06/13 17:39:57 INFO Getting artifact https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz
2024/06/13 17:39:57 INFO client.Do("https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz")
2024/06/13 17:39:58 INFO Generated melange config: py3.11-typing-extensions.yaml

versus:

❯ go run . convert python typing-extensions --python-version 3.11
2024/06/13 17:40:03 INFO generating convert config files for python package typing-extensions version: 3.11 on python version:
2024/06/13 17:40:03 INFO [typing-extensions] Generating manifests
2024/06/13 17:40:03 INFO [typing-extensions] Retrieving Package information from https://pypi.org
2024/06/13 17:40:03 INFO [typing-extensions] Check Dependency list: [typing-extensions]
2024/06/13 17:40:03 INFO [typing-extensions] Fetch Package Data
2024/06/13 17:40:04 INFO [typing-extensions] typing-extensions Add to generate list
2024/06/13 17:40:04 INFO [typing-extensions] Check for dependencies
2024/06/13 17:40:04 INFO [typing-extensions] Searching source for dependencies
2024/06/13 17:40:04 INFO [typing-extensions] 0 Number of deps
2024/06/13 17:40:04 INFO [typing-extensions] Generating 1 files
2024/06/13 17:40:04 INFO [typing-extensions] Index typing-extensions Package typing-extensions
2024/06/13 17:40:04 INFO [typing-extensions] Create manifest
2024/06/13 17:40:04 INFO Trying to get commit data for typing-extensions
2024/06/13 17:40:04 INFO [typing-extensions] Generate Package
2024/06/13 17:40:04 INFO [typing-extensions] Run time Deps []
2024/06/13 17:40:04 INFO [typing-extensions] Generate Environment
2024/06/13 17:40:04 INFO [typing-extensions] Generate Pipeline for version 4.12.2
2024/06/13 17:40:04 INFO Getting artifact https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz
2024/06/13 17:40:04 INFO client.Do("https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz")
2024/06/13 17:40:04 INFO [typing-extensions] SHA256 Generation FAILED. 404 when getting https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz
2024/06/13 17:40:04 INFO [typing-extensions]  Or try 'curl ' to check out the API
2024/06/13 17:40:04 INFO [typing-extensions] FAILED TO CREATE MANIFEST artifact 256SHA FAILED GENERATION. Investigate by going to https://pypi.org/project/typing-extensions/ did not match Package data SHA256 1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8
2024/06/13 17:40:04 INFO error during command execution: artifact 256SHA FAILED GENERATION. Investigate by going to https://pypi.org/project/typing-extensions/ did not match Package data SHA256 1a7ead55c7e559dd4dee8856e3a88b41225abfe1ce8df57b7c13915fe121ffb8
exit status 1
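
The opt-in behavior shown in the two runs above can be sketched as follows. This is a minimal illustration, not the PR's implementation: artifactURI is a hypothetical function, and the standard library flag package stands in for whatever CLI library melange actually uses. Only the flag name --preserve-base-uri comes from the PR.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// artifactURI is a hypothetical helper that picks the download URI.
// pypiURI is the URL reported by PyPI; rewritten is the converted
// /packages/source/... form. The conversion stays the default.
func artifactURI(pypiURI, rewritten string, preserveBaseURI bool) string {
	if preserveBaseURI {
		return pypiURI
	}
	return rewritten
}

func main() {
	fs := flag.NewFlagSet("convert-python", flag.ContinueOnError)
	// Defaults to false, so existing behavior is unchanged unless requested.
	preserve := fs.Bool("preserve-base-uri", false, "keep the URI returned by PyPI")

	// Fixed arguments keep the sketch self-contained.
	if err := fs.Parse([]string{"--preserve-base-uri"}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	pypiURI := "https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz"
	rewritten := "https://files.pythonhosted.org/packages/source/t/typing-extensions/typing-extensions-4.12.2.tar.gz"

	fmt.Println(artifactURI(pypiURI, rewritten, *preserve))
}
```

Because the flag defaults to false, callers that never pass it keep getting the converted URI.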

Since this flag is opt-in, the existing Python tests still pass without modification (ignoring the FIPS failure, which is unrelated):

❯ go test -tags e2e ./... -race
?       chainguard.dev/melange  [no test files]
?       chainguard.dev/melange/docs [no test files]
?       chainguard.dev/melange/internal/contextreader   [no test files]
?       chainguard.dev/melange/internal/gen-jsonschema  [no test files]
?       chainguard.dev/melange/internal/logwriter   [no test files]
?       chainguard.dev/melange/pkg/cli  [no test files]
?       chainguard.dev/melange/pkg/container    [no test files]
?       chainguard.dev/melange/pkg/container/dagger [no test files]
?       chainguard.dev/melange/pkg/container/docker [no test files]
?       chainguard.dev/melange/pkg/convert/github   [no test files]
?       chainguard.dev/melange/pkg/convert/relmon   [no test files]
?       chainguard.dev/melange/pkg/http [no test files]
?       chainguard.dev/melange/pkg/linter/defaults  [no test files]
?       chainguard.dev/melange/pkg/manifest [no test files]
?       chainguard.dev/melange/pkg/renovate [no test files]
?       chainguard.dev/melange/pkg/renovate/cache   [no test files]
ok      chainguard.dev/melange/pkg/build    1.560s
ok      chainguard.dev/melange/pkg/cond (cached)
ok      chainguard.dev/melange/pkg/config   1.172s
ok      chainguard.dev/melange/pkg/convert/apkbuild 1.755s
ok      chainguard.dev/melange/pkg/convert/gem  1.589s
ok      chainguard.dev/melange/pkg/convert/python   7.759s
ok      chainguard.dev/melange/pkg/convert/wolfios  (cached)
ok      chainguard.dev/melange/pkg/index    (cached)
ok      chainguard.dev/melange/pkg/linter   1.614s
ok      chainguard.dev/melange/pkg/renovate/bump    1.315s
ok      chainguard.dev/melange/pkg/sbom 1.390s
--- FAIL: TestGoFipsBinDeps (0.00s)
    e2e_test.go:47: open testdata/go-fips-bin/packages/aarch64/go-fips-bin-v0.0.1-r0.apk: no such file or directory
FAIL
FAIL    chainguard.dev/melange/pkg/sca  0.515s
ok      chainguard.dev/melange/pkg/sign 1.356s
ok      chainguard.dev/melange/pkg/util (cached)
FAIL

I also added a test that exercises the new flag against a problematic package:

❯ go test -v -tags e2e -timeout 30s -run "^TestGenerateManifestPreserveURI\$" chainguard.dev/melange/pkg/convert/python
=== RUN   TestGenerateManifestPreserveURI
    slogtest.go:20: time=2024-06-14T17:21:22.387-05:00 level=INFO msg="[typing-extensions] Generate Package"

    slogtest.go:20: time=2024-06-14T17:21:22.387-05:00 level=INFO msg="[typing-extensions] Run time Deps []"

    slogtest.go:20: time=2024-06-14T17:21:22.387-05:00 level=INFO msg="[typing-extensions] Generate Environment"

    slogtest.go:20: time=2024-06-14T17:21:22.387-05:00 level=INFO msg="[typing-extensions] Generate Pipeline for version 4.12.2"

    slogtest.go:20: time=2024-06-14T17:21:22.387-05:00 level=INFO msg="Getting artifact https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz"

    slogtest.go:20: time=2024-06-14T17:21:22.388-05:00 level=INFO msg="client.Do(\"https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz\")"

    slogtest.go:20: time=2024-06-14T17:21:22.550-05:00 level=INFO msg="[typing-extensions] Generate Package"

    slogtest.go:20: time=2024-06-14T17:21:22.550-05:00 level=INFO msg="[typing-extensions] Run time Deps []"

    slogtest.go:20: time=2024-06-14T17:21:22.550-05:00 level=INFO msg="[typing-extensions] Generate Environment"

    slogtest.go:20: time=2024-06-14T17:21:22.550-05:00 level=INFO msg="[typing-extensions] Generate Pipeline for version 4.12.2"

    slogtest.go:20: time=2024-06-14T17:21:22.550-05:00 level=INFO msg="Getting artifact https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz"

    slogtest.go:20: time=2024-06-14T17:21:22.550-05:00 level=INFO msg="client.Do(\"https://files.pythonhosted.org/packages/df/db/f35a00659bc03fec321ba8bce9420de607a1d37f8342eee1863174c69557/typing_extensions-4.12.2.tar.gz\")"

--- PASS: TestGenerateManifestPreserveURI (0.19s)
PASS
ok      chainguard.dev/melange/pkg/convert/python   0.365s

egibs commented 2 weeks ago

cc: @joshrwolf @krishjainx

krishjainx commented 6 days ago

@egibs CI kicked off. Let's talk on Monday. I can give it a full review then!