rapidsai / dependency-file-generator

https://pypi.org/project/rapids-dependency-file-generator/
Apache License 2.0
15 stars 13 forks source link

feat: Add support for writing pyproject.toml files #33

Closed vyasr closed 1 year ago

vyasr commented 1 year ago

This PR adds support for in-place modification of pyproject.toml files. The modification cannot be done out of place in this case because only a subset of the file is dependencies and there is no way for a pyproject.toml file to point to another location for its dependencies.

On the positive side, the changeset here is quite small and easily manageable. OTOH, there are a number of open design questions that should be addressed before merging this PR. Note that some of these considerations are the same as for conda meta.yaml files in #28. I would consider pyproject.toml to be the easier of the two and require solving a subset of the problems needed for meta.yaml. Essentially all of the problems below exist in similar form for those files, so figuring out how pyproject.toml support should look will get us a large part of the way towards enabling meta.yaml.

  1. The interpretation of the files key of dependencies.yaml is somewhat suspect when used for pyproject.toml because there is no sensible reason to want to rename the file (this is also true for meta.yaml)
  2. For the optional dependencies, we need a way to specify the key of the optional dependencies section. I am currently using the file name for this purpose, but this may be problematic because the file name may not be unique i.e. we might need optional test dependencies in pyproject.toml as well as a test_*.yaml env file (This same problem exists for meta.yaml due to the different host/build/run/test sections). Furthermore, I've currently hacked this in by passing the filename key into the make_dependency_file function even though that function doesn't need it for anything else because the output filename has already been generated and is one of the inputs here (this is also why one test is currently failing). If we move forward with this approach, that code will need a bit of refactoring.
  3. I am currently splitting up the build requirements, the run dependencies, and the optional dependencies as different OutputTypes, which is functional but somewhat contrived as being distinct from the different sections in (2).

EDIT The above concerns have mostly been addressed. We are now using a single output type for all pyproject sections and allowing a new extras section that can be used to provide data specific to pyproject.toml, in this case the table and key under which to store a particular dependency list. This addresses points 2 and 3 above. Point 1 is still somewhat unaddressed, but is mitigated by better documenting the behavior.

vyasr commented 1 year ago

This can be tested using https://github.com/rapidsai/rmm/pull/1219

vyasr commented 1 year ago

All looks fine to me — if we want to go this direction and allow inplace edits of a file. Are you happy with tomlkit’s handling of comments, whitespace, etc. in the file?

Yes, at least for the very limited modifications that we need to make (just a few dependency lists) tomlkit is sufficiently format-preserving for our purposes based on my (admittedly limited) testing. I chose it because it is the official Python recommendation for a style-preserving TOML library.

I would like to see some regression tests or other forms of testing for this draft before approving.

Agreed, I have no plans to merge this without testing. However, what are your thoughts on the questions I posted in the PR? I would like to address those first before writing new tests in case they result in any meaningful changes to APIs etc.

csadorf commented 1 year ago
  1. The interpretation of the files key of dependencies.yaml is somewhat suspect when used for pyproject.toml because there is no sensible reason to want to rename the file (this is also true for meta.yaml)

I see three ways to resolve this:

  1. (breaking) We do not use the key to determine the prefix, but instead add a "prefix" keyword to the files entries schema.
  2. (non-breaking) Add a new top-level category for "files" that don't exactly follow the current pattern.
  3. (non-breaking) Recognize that some type of generated files simply do not use the prefix.

I would advocate for solution 3 as the path with least resistance. All we have to do is update the docs to say that the key has meaning for some outputs, because it is used as a prefix. We could also implement some mix of 1 and 3, where if you provide a "prefix" keyword which defaults to the key were needed.

  1. For the optional dependencies, we need a way to specify the key of the optional dependencies section. I am currently using the file name for this purpose, but this may be problematic because the file name may not be unique i.e. we might need optional test dependencies in pyproject.toml as well as a test_*.yaml env file (This same problem exists for meta.yaml due to the different host/build/run/test sections). Furthermore, I've currently hacked this in by passing the filename key into the make_dependency_file function even though that function doesn't need it for anything else because the output filename has already been generated and is one of the inputs here (this is also why one test is currently failing). If we move forward with this approach, that code will need a bit of refactoring.

Why not simply add another property for this purpose? I've slightly modified the rmm example here:

  py_test:
    output: pyproject_optional_dependencies
    extra:
        section: test
    includes:
      - test_python
  1. I am currently splitting up the build requirements, the run dependencies, and the optional dependencies as different OutputTypes, which is functional but somewhat contrived as being distinct from the different sections in (2).

Yeah, it seems a bit clunky, how about this:

  py_build:
    output: pyproject
    extra:
        type: build
    includes:
      - build
      - cuda-python
  py_run:
    output: pyproject  # defaults to standard dependencies
    includes:
      - run
      - cuda-python
  py_test:
    output: pyproject
    extra:
      type: optional
      section: test
    includes:
      - test_python

Can you add an example file that is then also tested against? I think that would also make it easier to understand and address some of the current issues.

vyasr commented 1 year ago

I've applied the proposed adjustments. Will need to do a bit more work to update tests (these tests will require copying in preexisting files for modification rather than writing from scratch) then I think we should be good to go.

ajschmidt8 commented 1 year ago

:tada: This PR is included in version 1.5.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket:

ajschmidt8 commented 1 year ago

After rerunning the deploy workflow below, this new package version is now in our CI images