containerbuildsystem / cachi2

Cachi2 is a CLI tool that pre-fetches your project's dependencies to aid in making your build process network-isolated.
GNU General Public License v3.0
8 stars 26 forks source link

bundler: parse Gemfile.lock #599

Closed slimreaper35 closed 1 month ago

slimreaper35 commented 2 months ago

Maintainers will complete the following section

Note: if the contribution is external (not from an organization member), the CI pipeline will not run automatically. After verifying that the CI is safe to run:

slimreaper35 commented 2 months ago

New push:

slimreaper35 commented 1 month ago

New push:

eskultety commented 1 month ago
* I dropped `urlparse` since there is no validation of URL at all, `urlparse` accepts empty strings btw
  I wouldn't rely on the bundler library here, neither

Good to know, anything in particular to share?

slimreaper35 commented 1 month ago

Good to know, anything in particular to share?

Yeah, forgot to add examples. I was testing our lockfile_parser.rb script what kind of malformed URLs it can handle, and:

eskultety commented 1 month ago

FYI the unit tests fail in local environments if Ruby isn't installed which is currently assumed as a hard dependency and it should not be. Therefore, there's something going on with the gating CI, because it clearly has Ruby installed which is wrong, it should be a completely pristine environment.

Sigh. I now see the shortcoming of the solution based on the Ruby script. However, I'm still in favour of it due to its simplicity and clarity over the original solution based on some random 3rd party library (or even worse our own vendored version of it) which most likely will end up as an abandonware . I'd solve the unit test issue with the following:

@pytest.mark.skip("Gemfile.lock parser is integration tested due to its dependency on the Ruby runtime")
def test_gemfile_parse() -> None:
    pass

in order to:

  1. not lose coverage otherwise codecov will complain in its reports that we have an uncovered function which would not be true
  2. we need to skip it so that we don't report success on something that doesn't do anything.

Any objections or alternative proposals?

brunoapimentel commented 1 month ago

there's something going on with the gating CI

Ohh, good catch!

I'd solve the unit test issue with the following:

@pytest.mark.skip("Gemfile.lock parser is integration tested due to its dependency on the Ruby runtime")
def test_gemfile_parse() -> None:
    pass

Any objections or alternative proposals?

We don't we simply mock the run_cmd call, and test the rest of the function with mocked output? I assume we're fine with not unit testing the lockfile_parser.rb per se, so this would allow us to test all the rest of the Python code in bundler/parser.py.

eskultety commented 1 month ago
@pytest.mark.skip("Gemfile.lock parser is integration tested due to its dependency on the Ruby runtime")
def test_gemfile_parse() -> None:
    pass

Taking this idea back - integration testing this is not a way. We'd still have to clone a repo with a particular ref just to test these, which doesn't feel right.

We don't we simply mock the run_cmd call, and test the rest of the function with mocked output? I assume we're fine with not unit testing the lockfile_parser.rb per se, so this would allow us to test all the rest of the Python code in bundler/parser.py.

We could do that, yes. Then again, it's our script that we're going to maintain, so we should test its correctness as well. Continuing with your thought we could have this particular test that mocks run_cmd and tests the rest of the python code as you suggest and then we could have an additional test for the Ruby parser which would require Ruby installed in some form which is an idea I discussed privately with @a-ovchinnikov:

That said, if everyone is in favour of including expensive tests as well, we can track that idea as a future improvement, doesn't necessarily need to be part of this particular PR.

slimreaper35 commented 1 month ago

I just added a commit that skips the tests that depend on Ruby. I did not want to mock the run_cmd output with huge JSON output because it is, in the end, a temporary change.

eskultety commented 1 month ago

I just added a commit that skips the tests that depend on Ruby. I did not want to mock the run_cmd output with huge JSON output because it is, in the end, a temporary change.

Doesn't matter. You can define how big you want the JSON (I think you can take a very simple example like 2 dependencies and it won't be that big, will it?), we just want to make sure we cover ideally all python code we introduce. I don't care that much about the output being mocked for now, but I do care about testing our behaviour and agree with @brunoapimentel we should do it.

slimreaper35 commented 1 month ago

Should I add the change as a separate commit or part of the one that implements parsing? I thought that adding a separate commit would make it easy to revert it in the future.

eskultety commented 1 month ago

Should I add the change as a separate commit or part of the one that implements parsing? I thought that adding a separate commit would make it easy to revert it in the future.

Same commit, with a justification. We will not revert it, we'll just update the test in the future such that it:

a-ovchinnikov commented 1 month ago

/retest