AnswerDotAI / nbdev

Create delightful software with Jupyter Notebooks
https://nbdev.fast.ai/
Apache License 2.0

[Feature Requests] Better integration with standard tooling #1466

Open jlopezpena opened 1 month ago

jlopezpena commented 1 month ago

TL;DR: I'd like to make some changes to nbdev to integrate it with other tooling.

I've been using nbdev extensively over the last couple of weeks, and have identified a few quirks that bug me. This issue is to collect them so that they don't get lost. If the devs are open to these features, I could even contribute some PRs myself.

Long story short, I have been trying hard to integrate nbdev with other tools that enable what are considered "best practices" in the community, namely I have been working on some repos using

The first hurdle I experienced was the duplication of settings data. Modern tooling relies on pyproject.toml for its settings, whereas nbdev uses settings.ini instead. Most of the nbdev settings get loaded in the setup.py file, which can be used together with pyproject.toml by defining the build-system as setuptools, but I'd rather use more modern alternatives like hatchling.

So, I tried moving as many settings as possible from settings.ini to pyproject.toml, and (much to my surprise) I found I could safely move almost all of them and still get the core functionality of nbdev working. Besides allowing the use of modern tooling, this has the additional benefit of bootstrapping nbdev for new projects: with this setup nbdev can be declared as a dev dependency instead of having to be installed somewhere outside the project. This allows running everything from a virtual environment created by uv. The thing that one would "lose" would be the capability of creating a new project with nbdev_init (but that can be sorted by using a package template to either clone or use with uv init).

The remaining minimal set of options I need left in settings.ini to use nbdev is so barebones that I think there would be no big loss of functionality if all those settings were to be loaded from a [tools.nbdev] section in pyproject.toml.

I have a very early-stage, simple project template showcasing how I got my setup to work (still very much WIP as I keep finding quirks when using that setup with real projects).

So, here is a list of changes I'd like to see (and can potentially help to implement if core devs agree with the vision):

  1. Allow nbdev to read settings from pyproject.toml and delegate the dependency management to other tools
  2. Modify the nbdev_export command so that:
    • It uses absolute imports instead of relative imports -- absolute imports are preferred in some setups for a variety of reasons, but my main pain point here is that changing absolute imports from my notebooks to relative ones in my library annoys ruff as it can change the import ordering
    • It can use ruff format (or any other formatter) to prettify the generated source (instead of black)
  3. Longer term - Allow for more flexible exports than just the library setup currently in effect. E.g. being able to export to different "subpackages" for monorepo structures, or to be able to export a bunch of management scripts that are not part of the library code.

Does this resonate with the core devs' vision for the project and the community needs? Any feedback either here or in my template much appreciated!
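
To make the absolute-versus-relative import point in item 2 concrete, here is a minimal sketch of rewriting relative imports as absolute ones. It is not how nbdev_export works; `absolutize_imports` and the package name `mypkg` are hypothetical, and the sketch relies only on the standard library (`ast.unparse` needs Python 3.9 or later).

```python
import ast
from importlib.util import resolve_name


def absolutize_imports(source: str, package: str) -> str:
    """Rewrite relative `from .x import y` statements as absolute imports.

    `package` is the dotted name of the package containing the module,
    e.g. "mypkg" for the file mypkg/core.py (hypothetical names).
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # A level above zero marks a relative import (one dot per level).
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            relative = "." * node.level + (node.module or "")
            node.module = resolve_name(relative, package)
            node.level = 0
    # ast.unparse regenerates the source; comments and formatting are lost.
    return ast.unparse(tree)


print(absolutize_imports("from .core import foo", "mypkg"))
```

Because `ast.unparse` discards comments and original formatting, a real exporter would more likely patch the import lines in place; the sketch only shows the name resolution involved.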

jvivian commented 4 weeks ago

Thank you @jlopezpena for writing up such a detailed proposal. As a user, these are all changes I want as well.

stantonius commented 2 weeks ago

Regarding use of pyproject.toml, I spotted this on a related Jeremy et al project that was created a few months ago (so this is a fairly current opinion by the nbdev team).

The nbdev project spent around a year trying to move to pyproject.toml but there was insufficient functionality in the toml-based approach to complete the transition.

jlopezpena commented 2 weeks ago

Interesting. The only reference I could find in nbdev is this comment, but there are no specifics as to what functionality is missing.

I have been using nbdev almost daily for a month on a project that is completely defined using pyproject.toml without any issues. The only caveats are that TOML reading support is not available in the standard library before Python 3.11 and that there is no standard TOML writing method, which I can see could hinder the use of nbdev_new. But as I mentioned, I specifically don't want to use nbdev as a tool for creating new projects, because that requires having a system-level nbdev and prevents bootstrapping nbdev itself as a dev dependency of the project, so I'd rather create my projects with uv init or poetry init. All the remaining functionality (nbdev_test, export, update, building docs, etc.) seems to be unaffected.

@jph00 I understand if you don't feel like rehashing an old debate and your vision for nbdev is different, but otherwise any feedback would be appreciated.

jph00 commented 2 weeks ago

Thanks @jlopezpena -- anything to increase compatibility with modern python tooling would be welcome, as long as it doesn't add significant new deps or remove compatibility with classic python tooling and can support at least py38 ootb.

It sounds from your experience like that might now be possible, so I'm happy to re-open that discussion.

The challenge I'm not sure about is how to deal with the lack of toml support in the stdlib. How do we ensure nbdev works in CI and nbdev_new continues to work on py38? There is a pypi toml lib, although it only lists support up to py39, which if accurate means py310 would be an issue afaict.

I definitely do not want to require folks to use uv or poetry BTW, although I'm happy if we can provide tools that also help those users.

jlopezpena commented 2 weeks ago

Thanks for your response! The way I made my changes, no additional dependencies would be required. TOML support was included in the standard library in Python 3.11, so the functionality would be natively available there. For older Python versions, support for pyproject.toml would be added as optional functionality; enabling it would require an additional dependency on tomli, which supports Python >= 3.8 (this is the module that got integrated into the standard library in Python 3.11). People wanting to use nbdev alongside pyproject.toml on Python < 3.11 would need to install an extra, nbdev[toml], which would pull in that tomli dependency (which is only 16KB and has no further dependencies). But I should stress that all of this would be optional.
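
A minimal sketch of the optional-dependency pattern described here, assuming nothing about nbdev's actual code: use the standard-library `tomllib` on Python 3.11 and later, fall back to `tomli` when it happens to be installed, and otherwise report that no TOML reader is available. The helper name `parse_toml` is hypothetical.

```python
import sys

# tomllib joined the standard library in Python 3.11; tomli is the
# third-party package it was based on and exposes the same API.
if sys.version_info >= (3, 11):
    import tomllib
else:
    try:
        import tomli as tomllib  # present only if the optional extra is installed
    except ImportError:
        tomllib = None  # no TOML reader available


def parse_toml(text):
    """Return the parsed TOML document, or None when no TOML reader is installed."""
    if tomllib is None:
        return None
    return tomllib.loads(text)


print(parse_toml('[project]\nname = "demo"\n'))
```

Returning `None` rather than raising keeps the missing dependency a soft failure, so a caller can quietly fall back to settings.ini.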

Even for newer python, my intention was not to replace the current system using settings.ini, but rather to offer an alternative for people that prefer using pyproject.toml, or have to for other reasons (in my case: data science projects included in folders in mono-repos that are reliant on poetry/uv, ruff, pytest and other tooling).

The way I'd do this would be by creating a subclass of fastcore's Config object, with essentially the same functionality but reading the data from pyproject.toml. When running nbdev, I'd check for the existence of a [tools.nbdev] section in pyproject.toml. If that section is there, I would use the new config class; if it is not (or the file pyproject.toml does not exist, or tomli is not installed), then nbdev would keep using the current one relying on settings.ini.
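
The selection logic could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: it uses plain dicts and `configparser` as a stand-in for fastcore's Config, the function name `load_nbdev_settings` is hypothetical, and `lib_path` in the usage note is only an example key. The thread writes the section as `[tools.nbdev]`; the sketch defaults to `[tool.nbdev]` because pyproject.toml reserves the `[tool]` table for tool-specific settings, and it takes the table path as a parameter so either spelling can be tried.

```python
import configparser
import sys
from pathlib import Path

if sys.version_info >= (3, 11):
    import tomllib
else:
    try:
        import tomli as tomllib  # optional dependency on older Pythons
    except ImportError:
        tomllib = None


def load_nbdev_settings(project_dir, table=("tool", "nbdev")):
    """Return (source, settings) for a project directory.

    Prefers the nbdev table in pyproject.toml and falls back to
    settings.ini when the file, the table, or a TOML reader is missing.
    """
    project_dir = Path(project_dir)
    pyproject = project_dir / "pyproject.toml"
    if tomllib is not None and pyproject.is_file():
        with pyproject.open("rb") as f:  # TOML readers require binary mode
            data = tomllib.load(f)
        # Walk down the nested tables, e.g. data["tool"]["nbdev"].
        for key in table:
            data = data.get(key) if isinstance(data, dict) else None
        if isinstance(data, dict):
            return "pyproject.toml", data
    # Fallback: the existing settings.ini with its [DEFAULT] section.
    parser = configparser.ConfigParser()
    parser.read(project_dir / "settings.ini")
    return "settings.ini", dict(parser["DEFAULT"])
```

With a pyproject.toml containing `[tool.nbdev]` and `lib_path = "demo"`, the call `load_nbdev_settings(".")` would return the pyproject source together with that table; removing the table makes the same call read settings.ini instead.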

For people using pyproject.toml, it is up to them to decide how they manage the file and dependencies. One could keep using setuptools with pyproject, or can use poetry, or pdm or whatever, that would be entirely up to the user, and no build system would ever be forced upon anyone.

For the sake of simplicity, I'd implement the subclassing of Config directly in nbdev as a proof-of-concept. If it works as expected and it is considered useful then it could be migrated to fastcore.

jlopezpena commented 2 weeks ago

Regarding nbdev_new, it would keep working as it does now. Using it requires having nbdev installed at a system level (or via a conda environment), and people using that setup probably wouldn't want to define a virtual environment in their project anyway.

jph00 commented 2 weeks ago

OK that all sounds pretty good! I'd be happy to look at PRs to add this. Ideally, can we please aim to minimise the amount of code that's written for this, and particularly to minimise the amount of code in each PR (i.e. if it's reasonably convenient to break it into smaller pieces, please do so). I'm not smart enough to understand large chunks of code -- although notebooks that lay things out in small bits with plenty of examples certainly help!

cdtr1 commented 1 week ago

Hello and thanks for considering better uv support for nbdev.

I am new to nbdev and really feel that it is very underrated, including as a way for single-person teams to make massive progress fast.

One thing that still bugs me is the use of pip install -e . to load the package in editable mode. I installed the package with uv pip install -e . which means it is not considered in pyproject.toml - this way nbdev_export seems to work.

I can also use uv to add editable modules but this does not seem to be picked up by nbdev properly.

(venv) ➜  nbproject git:(main) ✗ uv add --editable nbproject
error: Requirement name `nbproject` matches project name `nbproject`, but self-dependencies are not permitted without the `--dev` or `--optional` flags. If your project name (`nbproject`) is shadowing that of a third-party dependency, consider renaming the project.

So currently I am running

uv add --editable --dev nbproject

but I am not 100% sure this actually works with nbdev_export, which is why I additionally use uv pip install -e .

I doubt this is the perfect way to do things, and I would appreciate it a lot if someone with a more fundamental understanding of both uv and nbdev could take a look.

I also have to add a license to pyproject.toml for nbdev_export to work: license = {text = "none"}.