python-poetry / poetry

Python packaging and dependency management made easy
https://python-poetry.org
MIT License

Monorepo / Monobuild support? #936

Closed epage closed 1 year ago

epage commented 5 years ago

Feature Request

The goal is to allow a developer to make changes to multiple python packages in their repo without having to submit and update lock files along the way.

Cargo, which poetry seems to be modeled after, supports this with two features

It looks like some monobuild tools exist for python (buck, pants, bazel) but

floer32 commented 5 years ago

[I'm just somebody driving by]

I've had to deal with the monorepo (anti)pattern1️⃣ and I agree it'd be very helpful to have help in dealing with it and packaging it; but also have learned it's really hard to do good things for monorepo without making compromises to normal patterns.

To me it sounds like this could be hard to get into Poetry, at least right from the start. It's easier to imagine another library where someone chooses an option like buck/pants/bazel and then creates a helper to make that work well with Poetry and vice versa.

1️⃣: It's not always an antipattern, I know. Too often it is, and many best practices are abandoned. So it can be hard to develop monorepo-related features without specific good examples that are targeted for support. TL;DR: it could be good to link to an OSS example (or contrive one and link to that).

epage commented 5 years ago

I understand. I hate the religious view taken towards monorepos. I have found what I call mini-monorepos to be useful, small repos that serve a single purpose.

For example, in the Rust world, I have a package for generic testing of conditions, called predicates. I've split it into 3 different packages, predicates-core to define the interfaces, predicates, and predicates-tree for rendering predicate failures in a way similar to pytest. I did these splits so that (1) people aren't forced into dependencies they don't need and (2) Rust is thankfully strict on semver and so the split also represents a splitting of compatibility guarantees. It is more important to provide a stable API for vocab terms (predicates-core) than for implementations that aren't generally passed around.

I specifically suggested continuing to follow Cargo's model, as poetry has done in other ways, for monobuild support, rather than getting into the more complex requirements related to tools like buck/pants/bazel (fast as possible, vendor-all-the-deps, etc). If someone needs the requirements of those tools, they should probably just use those tools instead. From at least my brief look, it seemed like they don't do a good job of interoperating with other python build / dependency management systems.

davidroeca commented 5 years ago

Also would love support here! Specifically for developing a library that has different components each with their own set of potentially bulky dependencies (e.g. core, server type 1, server type 2, client, etc.). Also helpful when trying to expose the same interface that supports different backends (similarly, you don't want to install every backend, just the one you want).

The only OSS library I could find that emulates this approach is toga -- it would be great if poetry could handle dependency resolution for these sorts of libraries.

The toga quickstart explains how the dependencies are managed with a bunch of setup.py files.

NGaffney commented 5 years ago

I'm interested in this also. How far does the current implementation of editable installs get you towards your use case?

epage commented 5 years ago

> I'm interested in this also. How far does the current implementation of editable installs get you towards your use case?

Use cases

Rust can solve this at two levels

So regarding the first, editable installs might cover this if you can mix path dependencies with version dependencies which is the key feature needed to partially handle my specified use cases.

NGaffney commented 5 years ago

I tried some of this out today and it looks like the first feature you describe is nearly, but not quite supported.

You can declare a dependency as both a dev and a non-dev dependency. When you do this, the dev version should take precedence (at least that's how I interpret this), allowing a package to be installed as editable during local dev. Then, for the final build, the --no-dev flag would remove the dev version from consideration.

I've made it work with a toy example which contains a path dependency in dev and non-dev mode but not for a local vs pypi dependency.

guillaumep commented 5 years ago

I am currently trying to find a way to build a Python monorepo without using heavy-duty tools like Bazel.

I have seen a repository which uses yarn and lerna from the JavaScript world to build Python packages:

https://github.com/pymedphys/pymedphys

Using yarn you can declare workspaces, and using lerna you can execute commands in all workspaces. You can then use the script section of package.json in order to run Python build commands.

I've yet to try this technique with Poetry (and the pymedphys repository does not use it), however I feel it might be worth exploring.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

davidroeca commented 4 years ago

I still think this could be useful

daveisfera commented 4 years ago

yarn workspaces makes it possible to have a set of shared libs that are used across multiple apps in a monorepo, so I think it's a good model to follow or at least borrow ideas from

remram44 commented 4 years ago

Something I am running into right now is that using Poetry with path dependencies is very unfriendly to the Docker cache.

To correctly leverage the cache (read: don't install all dependencies from scratch on every build) I would want to install the dependencies first, and then copy my own code over and install it. However Poetry refuses to do anything (even poetry export -f requirements.txt) if my code has not already been copied in.

I am considering writing my own parser for the poetry.lock file, to pip install my dependencies from there before I copy my code over.
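
As a rough illustration of that idea (a sketch only, not an existing Poetry feature), the lock file can be read as TOML and turned into pinned requirements. It assumes that local path dependencies appear in poetry.lock with a `[package.source]` table whose `type` is `directory` or `file`, and it ignores hashes, markers and extras. The function name and the sample lock content are made up for the example.

```python
try:
    import tomllib  # standard library since Python 3.11
except ImportError:  # older interpreters: assumes the third-party "tomli" backport is installed
    import tomli as tomllib


def pinned_requirements(lock_text: str) -> list[str]:
    """Return 'name==version' pins for every non-local package in a poetry.lock."""
    lock = tomllib.loads(lock_text)
    pins = []
    for package in lock.get("package", []):
        source = package.get("source", {})
        # Assumption: path dependencies are recorded with a source type of
        # "directory" (a project folder) or "file" (a local archive).
        if source.get("type") in {"directory", "file"}:
            continue
        pins.append(f"{package['name']}=={package['version']}")
    return sorted(pins)


# Made-up lock content: one PyPI package and one local path dependency.
EXAMPLE_LOCK = """
[[package]]
name = "requests"
version = "2.25.1"

[[package]]
name = "mylib"
version = "0.1.0"

[package.source]
type = "directory"
url = "packages/mylib"
"""

print("\n".join(pinned_requirements(EXAMPLE_LOCK)))
```

In a Dockerfile, the output could be written to a requirements file and installed in an early layer, before the project's own code is copied in.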

One of two things could help me:

kbakk commented 4 years ago

~It will be interesting to see how https://github.com/python-poetry/poetry/issues/1993 will be solved - it does imply the Poetry repo becoming a kind of monorepo, right?~ Edit: Err, sorry, I just assumed it would be done in the current repo, but it seems to not follow a monorepo structure (python-poetry/core)

TheButlah commented 3 years ago

Any progress on this feature request? I've been using poetry for about a year now inside a monorepo in a research lab. We have a single global pyproject.toml that dictates the global set of dependencies since there isn't a way to have multiple separate pyproject.toml files (like cargo workspaces).

Needless to say, this creates a lot of pain, because we continually experience problems when one developer wants a dependency that is incompatible with another developer's code. As a result, most of the researchers go off the grid and don't use poetry at all, instead having their own venvs that only work on their machines.

abn commented 3 years ago

Related: #2270

@TheButlah this is something I wish to pick up sometime this year. Would be great to make sure the use case is listed in the issue.

KoltesDigital commented 3 years ago

I've quickly made a prototype for handling monorepos, with success. In fact it nearly works out of the box already! Are you interested if I make a PR?

I put here some unordered details and thoughts. I don't guarantee it'd resolve all cases, but at least it's fine for my needs. I'm basing this workflow on the one I use for JS/TS projects using yarn and lerna.

Repo structure:

./
 |- .git/
 |- packages/
 |   |- foo/
 |   |   |- sources (foo.py or foo/__init__.py, with tests...)
 |   |   |- pyproject.toml
 |   |- bar/
 |   |   |- sources (...)
 |   |   |- pyproject.toml
 |- pyproject.toml

The idea is to have a private/virtual pyproject.toml at the root. Each package has its own pyproject.toml specifying the package name, version, dependencies, as usual. I could have created one virtualenv per package, but I find that too cumbersome (I'm using VS Code and it would be very annoying to switch virtualenvs every time I change the file I'm working on), so I created a global virtualenv and that's the purpose of the root pyproject.toml.

This file contains all the project packages as sole dependencies (using the format foo = { path = "./packages/foo", develop = true }). Doing so, poetry resolves the dependencies of the packages. If bar depends on foo, packages/bar/pyproject.toml shall declare a regular version constraint (foo = "^1.2.3"). The root pyproject.toml has the dev dependencies (black, pytest...) since they are used for all packages, however the packages may have some as well (read below for CICD).
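
For illustration, the dependency entries of such a root file could be generated rather than maintained by hand. The sketch below is hypothetical (the function name is invented) and assumes every package lives in its own directory under packages/ and that the directory name equals the package name.

```python
import tempfile
from pathlib import Path


def root_dependency_lines(repo_root: Path, packages_dir: str = "packages") -> list[str]:
    """Build one editable path-dependency line per package directory."""
    lines = []
    for pyproject in sorted((repo_root / packages_dir).glob("*/pyproject.toml")):
        name = pyproject.parent.name  # assumption: directory name == package name
        lines.append(f'{name} = {{ path = "./{packages_dir}/{name}", develop = true }}')
    return lines


# Demo on a throwaway copy of the repo structure described above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for package_name in ("foo", "bar"):
        (root / "packages" / package_name).mkdir(parents=True)
        (root / "packages" / package_name / "pyproject.toml").write_text("")
    demo_lines = root_dependency_lines(root)

print("[tool.poetry.dependencies]")
print("\n".join(demo_lines))
```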

There is often the need for executing a command in every package directory. For instance, and this is already working, running poetry build or poetry publish from packages/foo successfully builds or publishes the package. Actually for these two use cases, I'm proposing new commands poetry packages build/publish, but a generic poetry packages exec -- args... would allow for doing anything else.
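
Until such a command exists, the "run this in every package" part can be approximated with a small script. This is a minimal sketch under the same layout assumption (packages/*), with an invented helper name; the demo uses a harmless stand-in command where a real setup would pass something like ["poetry", "build"].

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_packages(repo_root: Path, command: list[str], pattern: str = "packages/*") -> dict[str, int]:
    """Run `command` in every directory matching `pattern` that holds a pyproject.toml."""
    exit_codes = {}
    for package_dir in sorted(repo_root.glob(pattern)):
        if not (package_dir / "pyproject.toml").is_file():
            continue  # not a package directory
        result = subprocess.run(command, cwd=package_dir)
        exit_codes[package_dir.name] = result.returncode
    return exit_codes


# Demo on a temporary layout; each run just prints its working directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for package_name in ("foo", "bar"):
        (root / "packages" / package_name).mkdir(parents=True)
        (root / "packages" / package_name / "pyproject.toml").write_text("")
    demo_codes = run_in_packages(root, [sys.executable, "-c", "import os; print(os.getcwd())"])

print(demo_codes)
```

Returning the exit codes lets a CI job decide whether to fail on the first broken package or to report all of them at the end.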

On the CICD server, it makes sense to create one virtualenv per package, at least for one thing: checking that each package only imports other packages that are declared as its dependencies. In that sense, it may make sense for packages to have their own dev dependencies, e.g. if they have some specific tests which are not shared with others.

I'm used to delegating version bumping to the CICD pipeline: every merged PR will automatically bump versions and publish packages, based on what has been changed (using the conventional commits spec). Dependent packages need to be bumped as well: if bar depends on foo, and bar hasn't changed but foo has changes of any kind, bar still needs to get a patch bump. Actually this is only true if bar depends on the current version of foo: if an earlier version is specified, the dependency constraint is not updated.
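
The bump rule above can be sketched as a small graph walk. This is an illustration only: the data model (a mapping from each package to the workspace packages it requires at their current version) and the function name are invented, and propagating the patch bump transitively is an assumption about how the rule extends beyond the two-package case.

```python
def packages_to_bump(changed: set[str], pins_current: dict[str, set[str]]) -> dict[str, str]:
    """Map each package that needs a release to the reason for its bump.

    `pins_current[pkg]` lists the workspace packages that `pkg` depends on at
    their *current* version; dependencies on earlier versions are left out.
    """
    bumps = {name: "changed" for name in changed}
    queue = list(changed)
    while queue:
        released = queue.pop()
        for package, deps in pins_current.items():
            if released in deps and package not in bumps:
                bumps[package] = "patch (dependency released)"
                queue.append(package)  # assumption: the bump propagates to its dependents too
    return bumps


pins_current = {
    "foo": set(),
    "bar": {"foo"},  # bar requires the current foo
    "baz": set(),    # baz still requires an earlier foo, so foo is not listed
    "qux": {"bar"},  # qux requires the current bar
}
demo_bumps = packages_to_bump({"foo"}, pins_current)
for name in sorted(demo_bumps):
    print(name, "->", demo_bumps[name])
```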

Lerna takes care of git adding the changed files (package.json and CHANGELOG.md for every changed package) and can even git push... but thereafter, one may want to customize the commit message, so there's an option for that... well I think it's too much, the CICD pipeline can do it as well. I'd be happy if poetry could just bump versions, append messages in changelogs, and after that I can git add --all and git commit myself.

For publishing, lerna can retrieve the package versions on the registries, and publishes only the new ones. Here in Python this is tedious, since PyPI (warehouse) and pypiserver do not have a common route to get version info. My hack for now is just to publish everything, and I ignore 409: Conflict errors.
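
For the public PyPI only, a per-release check is possible through its JSON API, which answers 404 for an unknown project or version. The sketch below covers just that case and does nothing for pypiserver or other registries; the function name and the injectable `opener` parameter are invented for the example, and the demo uses stand-in openers so that it runs without network access.

```python
import io
import urllib.error
import urllib.request


def is_published(name: str, version: str, opener=urllib.request.urlopen) -> bool:
    """Return True if pypi.org already serves this exact release."""
    url = f"https://pypi.org/pypi/{name}/{version}/json"
    try:
        with opener(url):
            return True
    except urllib.error.HTTPError as error:
        if error.code == 404:  # unknown project or unknown version
            return False
        raise  # any other HTTP error is a real failure


# Offline demonstration with stand-in openers.
def fake_found(url):
    return io.BytesIO(b"{}")


def fake_missing(url):
    raise urllib.error.HTTPError(url, 404, "Not Found", None, None)


demo_found = is_published("foo", "1.2.3", opener=fake_found)
demo_missing = is_published("foo", "9.9.9", opener=fake_missing)
print(demo_found, demo_missing)
```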

Caveats: packages shall not declare conflicting dependencies if there's only one global virtualenv.

Proposition

Again, what I'm proposing here addresses my needs, but this feature should fit other use cases as well, so please give feedback about what you would need for your own workflow.

I prefer not to use "monorepo" in the names, which is too coercive.

New commands:

New pyproject.toml section:

[tool.poetry.packages]
paths = ["packages/*"] # used to specify where to find packages
version = "independent" # optional flag to let each package have their own version. Without this flag, all packages get the same version.

maneetgoyal commented 3 years ago

> I am currently trying to find a way to build a Python monorepo without using heavy-duty tools like Bazel.
>
> I have seen a repository which uses yarn and lerna from the JavaScript world to build Python packages:
>
> https://github.com/pymedphys/pymedphys
>
> Using yarn you can declare workspaces, and using lerna you can execute commands in all workspaces. You can then use the script section of package.json in order to run Python build commands.
>
> I've yet to try this technique with Poetry (and the pymedphys repository does not use it), however I feel it might be worth exploring.

Yes indeed pymedphys was using the yarn script section at some point (till their v0.11.x) but then restructured their repo and migrated to using poetry only. Trying to dig into their justification for this move.


Edit: Some hints in https://github.com/pymedphys/pymedphys/issues/192#issuecomment-485613901. Have requested the author to shed some more light though.


Edit 2: Thanks to SimonBiggs. https://github.com/pymedphys/pymedphys/issues/192#issuecomment-739557570

hpgmiskin commented 3 years ago

@KoltesDigital I would really like to try out the modifications you have made to allow poetry to manage the dependencies for a monorepo.

Might you be able to share a branch to see these changes? You might have already shared but I just could not find it. Thanks!

KoltesDigital commented 3 years ago

@hpgmiskin thanks for your interest! My prototype is more a workflow PoC, I actually haven't changed poetry yet, instead I just added some scripts on top of it. These scripts are in JS/TS, and of course the final solution should only use Python. Moreover, I haven't implemented my whole proposition, and to do so I'll have to modify poetry.

So before doing this work, I first want some feedback from contributors, in order to know whether they're ok with the direction I'm heading in.

remram44 commented 3 years ago

What about having Poetry correctly update its lock file when a sub-pyproject.toml file changes, without having to run poetry lock from scratch? Or having all dependencies be collected in a single top-level poetry.lock (and possibly a single virtualenv), à la Cargo? Or installing dependencies without the code being there, for Docker cache and build system friendliness?

Those commands are nice, but I'm unlikely to use them, and I'd rather see the monorepo use-case be properly supported as a first step, and the opinionated utility commands (and CI integration) added as a second step.

davidroeca commented 3 years ago

@remram44 that alone would be a big step forward

KoltesDigital commented 3 years ago

@remram44 it's actually what happens with the root-level pyproject.toml: it creates a single virtual environment and leads to a single root-level poetry.lock. It's indeed the same with Cargo workspaces and Yarn workspaces. And this is already working without any of my additions; you can try the repo structure I described.

I also believe that poetry should not reinvent the wheel, and should leave to CICD the things it does best. That's why I mentioned that I prefer invoking git myself (i.e. the CICD takes care of that), because every CICD pipeline is different. But IMHO bumping versions of the subpackages is something everybody will need, which is why I propose building this part into poetry, or alternatively providing it as a plug-in to poetry. After that, users can version, publish, etc. the way they want.

remram44 commented 3 years ago

If I run poetry add in a subdirectory, it will generate a poetry.lock there instead of updating the root-level one. Same with running poetry run or poetry shell in a subdirectory. This is different from what workspace-aware package managers do.
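
For illustration, workspace-aware behaviour usually starts with walking up from the current directory until a root configuration is found. The sketch below is hypothetical: it treats a pyproject.toml containing the `[tool.poetry.packages]` section proposed earlier in this thread as the root marker, and it uses a crude substring check instead of real TOML parsing.

```python
import tempfile
from pathlib import Path
from typing import Optional

ROOT_MARKER = "[tool.poetry.packages]"  # section name taken from the proposal above


def find_workspace_root(start: Path) -> Optional[Path]:
    """Walk upwards from `start` to the nearest pyproject.toml that declares packages."""
    for directory in (start, *start.parents):
        pyproject = directory / "pyproject.toml"
        if pyproject.is_file() and ROOT_MARKER in pyproject.read_text():
            return directory
    return None


# Demo: the root declares the packages, the sub-package does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pyproject.toml").write_text('[tool.poetry.packages]\npaths = ["packages/*"]\n')
    (root / "packages" / "foo").mkdir(parents=True)
    (root / "packages" / "foo" / "pyproject.toml").write_text('[tool.poetry]\nname = "foo"\n')
    demo_found_root = find_workspace_root(root / "packages" / "foo") == root

print(demo_found_root)
```

A command run from packages/foo could then operate on the lock file and virtualenv next to the discovered root instead of creating new ones in the subdirectory.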

KoltesDigital commented 3 years ago

@remram44 exactly, that's why I've proposed new root-level commands poetry package <name> add/remove, which mimics Yarn or Cargo CLI features.

My experience with monorepos is that the users should not run commands from subdirectories. If one were to add from a subdirectory with either Yarn or Cargo, this would create undesired lock files too. And this would be reasonable. I don't expect a project manager tool to find out if the current project is actually part of a larger monorepo. Monorepo settings define which subdirectories are to be considered as subpackages (workspaces in package.json, Cargo.toml, and in my proposition), not the other way around.

remram44 commented 3 years ago

Cargo gets this right, I don't know why you say this is unreasonable. The shortcomings of some tools are no argument to ignore the behavior of working tools.

KoltesDigital commented 3 years ago

@remram44 interesting, I wasn't aware of this feature of Cargo. But well, we have different views about what monorepo support should look like. Our two views are different, but not exclusive, so let's have both.

davidroeca commented 3 years ago

Jumping in here regarding yarn behavior—running a yarn add in a yarn workspace does not generate a subdirectory lock file, though it does modify the subdirectory package.json. This may not be the case if you haven’t specified the subdirectory as a workspace, but I think what we’re hoping for with this issue is something that enables us to have a multi-package lockfile akin to the yarn workspaces feature (since the multi-lockfile monorepo is close to being/already supported?)

airtonix commented 3 years ago

I'm not so sure python can achieve the same level of awesome with monorepos that nodejs has been able to.

Let me explain, perhaps i've missed something about python, but basically it comes down to the fact that python only has two places where it stores deps for a project:

one of the reasons the monorepo pattern works really well for the nodejs world comes down to how npm/yarn tree shake installed dependencies and how nodejs resolves imports.

node_modules/
  glob/
    node_modules/

      minmatch/
        package.json
        index.js

    package.json
    index.js

packages/
  core/
    package.json
    index.js
  setup/
    node_modules/

      npm-run-all/
        package.json

      cookiecutter/
         package.json

    templates/
      new-cli/...
      new-gui/...
    package.json

apps/
  some-gui-app/
    package.json
    index.js

  some-cli/
    node_modules/

      @oclif/ui/
        index.js
        package.json

    package.json
package.json

considering that apps/some-cli has this package.json and that all other package.json not inside a node_modules has a similar name field:

{
  "name": "@airtonix/some-cli",
  "version": "0.0.0"
}

finally one feature of npm/yarn is the composability of the scripts field:

/package.json

...
"scripts": {
  "some-cli": "yarn workspace @airtonix/some-cli"
}
...

/apps/some-cli/package.json

...
"scripts": {
  "setup": "yarn micropy",
  "micropy": "micropy"
}
...

with these two package.json in place and using yarn workspace i can provide all kinds of command shortcuts on the cli when developing on the project:

~/Projects/Mine/m5paper
❯ yarn splash setup         
yarn run v1.22.10
$ yarn workspace @m5paper/splash-screen setup
warning package.json: No license field
$ yarn micropy --skip-checks
warning package.json: No license field
$ micropy --skip-checks

MicroPy  Loading Project
MicroPy  ✔ Ready!
Done in 1.50s.

the end effect is I can create for the monorepo a set of common tasks that means i don't have to move around folders just to execute the tasks.

remram44 commented 3 years ago

Private/multiple dependency versions have nothing to do with monorepos. I agree that they are something that's not possible on Python like it is on Node, but they seem a Python concern and not a Poetry one, and pretty different from what this issue is about...

I'd be happy if Poetry worked correctly with a single set of dependencies. As it is, it tends to break when I make changes to sub-packages, forcing me to run poetry lock (and often to delete .egg-info directories manually) to take them into account.

cognifloyd commented 3 years ago

@KoltesDigital There is a plugin system in master now (#3733 is merged!) which will make it into the next version.

I think the poetry packages ... set of commands (https://github.com/python-poetry/poetry/issues/936#issuecomment-734504568) would make an excellent plugin. Looking at your list of commands, I think these would probably be first (right? wdyt?):

The relevant section of the root toml (leaving versioning for the next set of commands)

[tool.poetry.packages]
paths = ["packages/*"] # used to specify where to find packages
# root = "./" # a relative path from this pyproject.toml to the dir containing the root pyproject.toml (default "./").

And perhaps add something to the packages' pyproject.toml so that you can use poetry add / remove ... within the project and have it update the root venv & lock file.

[tool.poetry.packages]
root = "../" # a relative path from this pyproject.toml to the dir containing the root pyproject.toml (default "./").

Also, maybe this should be poetry subpackages ... instead of poetry packages ... to differentiate it from packages that get built into the same wheel: https://python-poetry.org/docs/pyproject/#packages

KoltesDigital commented 3 years ago

@airtonix

While it's true that Nodejs can handle more granularity in dependencies, I'd say it's a detail. As a programmer, all you care about is that dependencies are somehow resolved. Their exact locations on disk are rarely important. I agree that one downside of Python is conflicting dependencies: if two monorepo packages depend on the same external package but with version ranges that have no intersection, a virtualenv cannot contain both, and it'd be impossible to set up such a monorepo. I'm afraid that's how Python has been designed and we can't do anything about it.

Besides, it's not perfect either with Yarn/NPM when it comes to the hoist optimization, because remote packages that get hoisted into the root node_modules can be imported from every monorepo package, including those which don't themselves declare the dependency in package.json. Basically there's too much freedom here: in such a case, it will work in a dev environment, but when released and imported into another project, runtime errors will occur (module not found). That's why it's recommended to use ESLint's import plugin, which checks that required/imported modules are specified in the corresponding package.json, and otherwise warns: 'xyz' should be listed in the project's dependencies. Run 'npm i -S xyz' to add it / eslint(import/no-extraneous-dependencies). This issue will occur in Python monorepos too. I'm not aware of a similar tool for Python, but it'd be very handy.

@cognifloyd

Thanks for the plugin system news! I paused Python development for a while, but I'll start again soon, and I'll happily make this plugin. (or if someone goes first, go ahead)

I was not aware that tool.poetry.packages is already used, agreed. However I think the term subpackage already means something in Python, which is quite different from having a repository with multiple packages (their fully qualified names could be unrelated, although they should not be). I suggest instead multi-packages, or to mirror Yarn, workspaces. What do you prefer? I prefer the latter, mostly because the CLI command will be shorter, and it may be more comfortable to use Yarn terminology. Note that Yarn's Workspaces denotes a feature (with an uppercase W), workspaces are the multiple packages, and the CLI commands start with workspace (without s).

cognifloyd commented 3 years ago

:+1: workspace

dermidgen commented 3 years ago

I've created an example repo that shows how to do this; without additional poetry support.
https://github.com/dermidgen/python-monorepo/

Incognito commented 3 years ago

fyi pants has some concept for poetry now, I haven't fully explored it, but on the surface it looks like it can get the job done.

https://www.pantsbuild.org/v2.6/docs/python-third-party-dependencies#poetry-integration

jacksmith15 commented 3 years ago

I prototyped something based on Yarn Workspaces and @KoltesDigital's ideas above, as a plugin: https://github.com/jacksmith15/poetry-workspace-plugin

It's already quite workable (you can currently install it with poetry plugin add poetry-workspaces, although be aware of #4365 before installing!).

If you have any ideas about how it could be better, feel free to join the discussion.

dermidgen commented 3 years ago

@jacksmith15 I dig that hard. Very nice!

gerbenoostra commented 3 years ago

@dermidgen

> I've created an example repo that shows how to do this; without additional poetry support. https://github.com/dermidgen/python-monorepo/

A nice example. I see you:

dermidgen commented 3 years ago

@gerbenoostra

The make is all about building wheels and hoisting them into your service package - it doesn't really care so much about version - it's more about making sure that any of your repo local shared libraries are built by poetry and brought up into your build path for the service package.

If you were to publish your shared library, you would just reference the version number and let poetry install/build it for you. When you do that, you'd be working off the published version - no longer the repo local shared library.

I hope that answers your question

NixBiks commented 2 years ago

> I prototyped something based on Yarn Workspaces and @KoltesDigital's ideas above, as a plugin:
>
> https://github.com/jacksmith15/poetry-workspace-plugin
>
> It's already quite workable (you can currently install it with poetry plugin add poetry-workspaces, although be aware of #4365 before installing!).
>
> If you have any ideas about how it could be better, feel free to join the discussion.

Just what I was looking for. I think it qualifies for a feature request so poetry workspace is a thing. There are no great python monorepo tools out there - just a lot of tooling to make it work. Great work on the plug-in - looks promising!

alecandido commented 2 years ago

I'd really like to be able to build multiple packages side by side with poetry, in a layout like the following:

src/
|-- pkg1/
|-- pkg2/
|-- ...

In this case, I'm not sure how dependencies and versions should be treated. Maybe instead an already successful layout (like Cargo workspace one) would be easier and more sensible:

poetry-workspace.toml
pkg1/
|-- pyproject.toml
|-- src/
pkg2/
|-- pyproject.toml
|-- src/

Note: I chose poetry-workspace.toml as a random name for the top-level file, since the pyproject.toml file already has another meaning. But in principle, I would like this top-level config file to host a lot of the things that are now in pyproject.toml, like pytest configs, isort configs, poe (poethepoet) scripts, and so on (so maybe keeping pyproject.toml as the name might be worth it).

DavidVujic commented 2 years ago

Hi all!

I just released a new Poetry plugin called poetry-multiproject-plugin. The plugin will make it possible to have multiple projects in a monorepo, with their dependencies defined in separate TOML files. I think it solves some of the feature requests in this thread.

The current version (0.1.0) includes a custom command called project-build. I plan to add more features to it shortly and would really like feedback on this!

Example usage:

poetry project-build -t my-custom-project.toml

I also plan to use this plugin as a base for another plugin, that will focus on making Python Monorepos useful.

Please note that the plugin is dependent on the current preview of Poetry. In a way, it is a preview plugin for a preview Poetry 😄

Here's the repo: https://github.com/DavidVujic/poetry-multiproject-plugin#poetry-multiproject-plugin

shishkin commented 2 years ago

@DavidVujic thanks for sharing your plugin. If I'm not mistaken reading the code, the plugin allows calling poetry build with a different project file path. How is it different from calling (cd path/to/another/project && poetry build) from the shell? Maybe an example of a multi-project project would help to illustrate the benefit of the plugin.

shishkin commented 2 years ago

One use case that would benefit from multiple project definitions in a single repo would be serverless applications. E.g. AWS Lambda functions would benefit from a minimal subset of dependencies to reduce the size of the asset bundle, while additional AWS CDK infrastructure automation code might have additional dependencies. Putting them in a single repo helps to make changes atomic: changing lambda function code together with how that function is deployed. At the same time it would be beneficial to have a shared set of tooling (tests, formatting, linting etc.) and unified versions of common dependencies for the whole repo. That is what yarn workspaces do almost perfectly.

DavidVujic commented 2 years ago

Thank you for sharing your feedback @shishkin, much appreciated! I will add example usage to the project.

What differs from the poetry build command is something I learned when I put a pyproject.toml in a project subfolder and referenced relative packages from there.

Something like:

packages = [{ include = "../../../my-package" }]

If I'm not mistaken, Poetry doesn't allow referencing code that is outside of the project root.

That's what led me to develop the plugin: being able to have more than one project file at the workspace root and being able to build each project with the custom project file.

But I am new to Poetry and might have missed something with setting up projects in subfolders 😄

shishkin commented 2 years ago

Same here, new to Python and Poetry 😄

I haven't tried to reference a project from a parent directory. I've only tried adding a dependency from the "root" Poetry project to a "child" project like so:

[tool.poetry.dependencies]
lambda = {path = "packages/lambda", develop = true}

That seemed to work for module resolution, building and type checking from the top project.

alecandido commented 2 years ago

Actually, in the end I'm doing something like:

packages = [
  { include = "pkg1", from = "src" },
  { include = "pkg2", from = "src" },
]

The real problem is that, doing it this way:

  1. they are packaged in a single PyPI package
    • they don't have distinct discoverable names
    • they can't have different dependencies (partially solved with extra dependencies, but a bit annoying)
    • they can't have different versions
    • they just live together, there is no clear dependency relation explicitly expressed
  2. it lacks automation commands (commands used to perform the same action for each package, possibly even in parallel)

So, as written above, I'd like to have a strict layout like Cargo's, one that allows packaging them independently, but also to be able to

Most of the latter things (all items in the list but the first) can be simply provided as plugins, but I believe that a core support for monorepos is deeply coupled with poetry (packaging and environments) and should be provided as part of the main tool.

DavidVujic commented 2 years ago

I'm sharing a link, because I think it might be relevant to the discussion here. I have developed a Poetry plugin (two actually) that enables working with Monorepos. More specifically, I am working on porting the Polylith architecture to Python. Please be aware that the plugin support in Poetry I've built this feature on top of, is in preview and not stable yet. 😄

Here's an article about the work and a short video with an example on how to use it: https://davidvujic.blogspot.com/2022/02/a-fresh-take-on-monorepos-in-python.html

gerbenoostra commented 2 years ago

I'd also like to see poetry working in a mono repo situation.

I currently only face 1 challenge: Being able to refer to other projects in the same folder structure (thus the dirty on-disk status), while also being able to release the packages as python packages (thus depending on specific versions).

In a way, a monorepo can already work:

When one has

src/
|-- pkg1/
|-- pkg2/
|-- ...

Then one can refer to pkg1 from within pkg2 using the dev dependencies:

# src/pkg2/pyproject.toml
[tool.poetry.dev-dependencies]
pkg1 = { path="../pkg1", develop=true}

However, when publishing the package, the wheel/zip of pkg2 will have no explicit dependency on pkg1. This is of course expected, because pkg1 is listed as a dev-only dependency.

In an attempt to explicitly state the dependency, I added the dependency to the production dependencies:

# src/pkg2/pyproject.toml
[tool.poetry.dependencies]
pkg1 = "^1.2.3"

[tool.poetry.dev-dependencies]
pkg1 = { path="../pkg1", develop=true}

Such an approach will add pkg1 as an explicit dependency. However, when pip installing the built wheel of pkg2, it will expect pkg1 to exist in the path ../pkg1.

(Keeping the pkg1 version in sync with pkg2 could be worked around with something like commitizen's regex.)

If it were possible to not let the path of the dev-dependencies override the final pip requirement in the wheel, it would suffice for most cases. Though I don't object to having a root workspace project as an addition.

Is there a way to keep the tool.poetry.dependencies version dependency, even though one has a path specified for that dependency in the tool.poetry.dev-dependencies? Or would it be a good idea to add a configuration option in the dependency spec?

alecandido commented 2 years ago

@gerbenoostra to the best of my knowledge, poetry is missing support for multiple PyPI (distribution) packages in the same project.

However, if it's sufficient for you, you can package in a single PyPI package multiple python packages, and this works fine https://python-poetry.org/docs/pyproject/#packages

Of course, distributing a single package, you can't have:

and so on. But in some cases it is a good enough solution.

gerbenoostra commented 2 years ago

@AleCandido Multiple PyPI (distribution) packages in the same project would indeed be a way and nice addition. I however see that as an extension to having packages refer to each other. Projects need to be able to refer to each other, and on top of that, one could add a container project, that distributes all those subprojects in one go.

The downsides of distributing a single package, mainly around dependencies and metadata, prevent me from moving to a monolith package. For now I resort to independently built packages. But that comes with all the typical downsides of the multi-repo approach (MRs on multiple repos, needing to update versions on each one every time).

Just having independent projects referring to each other (thus in my example, having both a src/pkg1/pyproject.toml and a src/pkg2/pyproject.toml) would already help.

gerbenoostra commented 2 years ago

I see two different aspects within the poetry monorepo:

To work around the missing poetry functionality regarding the second point, I've created an example repo at https://gitlab.com/gerbenoostra/poetry-monorepo.

It implements two different approaches:

Any feedback and/or suggestions would be useful.