pantsbuild / pants

The Pants Build System
https://www.pantsbuild.org
Apache License 2.0
3.26k stars 626 forks source link

Support for Python Doctesting #11227

Open lindt opened 3 years ago

lindt commented 3 years ago

would by awesome to support python3 -m doctesting python modules.

Eric-Arellano commented 3 years ago

Hey Lindt, this would be cool. Could you please share more about how you're achieving this at the moment without Pants? For example, are you installing a certain library? What command are you running?

lindt commented 3 years ago

i'm just running python3 -m doctest for each source module

so for example

python3 -m doctest src/foo.py python3 -m doctest src/bar.py

But u also can chain them

python3 -m doctest src/foo.py src/bar.py

Eric-Arellano commented 3 years ago

It looks like Pants supports this already via https://docs.pytest.org/en/stable/doctest.html#doctest-integration-for-modules-and-test-files.

Either run ./pants test path/to/test.py -- --doctest-modules, or permanently set this in your pants.toml:

[pytest]
args = ["--doctest-modules"]

But, I don't expect this to actually work how you want, as I imagine you're specifying the doctests in production code (a python_library target), not test code (a python_tests target). So, Pants won't tell Pytest to run over src/foo.py.

There's one workaround, have a python_library target like normal, but also python_tests describing the same source file. Something like this:

# src/BUILD

# This uses default `sources=['*.py', '!*_test.py', '!test_*.py']
python_library()

python_tests(name="doctests", sources=['foo.py'])

Now, you can run ./pants test src/foo.py -- --doctest-modules, or just ./pants test src/foo.py if you set up pants.toml. But there is one major downside: if you are using dependency inference, then Pants will no longer infer any imports of the module foo, as Pants can't determine if you want to depend on the python_library or the python_tests target - you'd have to explicitly add the dependency to the dependencies field.

We've hit this problem before with the pex_binary target and are in the process of fixing it. We need to do some design work for how this feature should be modeled in BUILD files.

--

Let us know if you get the above workaround working. We'll try thinking about how to better model this. Good timing to ask about this while we're thinking about pex_binary!

seifertm commented 3 years ago

Thanks for describing your idea in so much detail! I was looking for the same functionality as the OP and I thought I'd share my experiences. Referencing the production source code with a python_tests target and enabling --doctest-modules works as expected.

Besides the issue with the dependency inference you mentioned, there is one more thing to consider: Pants runs pytest for each Python module. If one of the modules in the doctests target do not have any doctests, pytest will exit with error code 5, meaning that no tests were collected.

One way around this is to exclude all modules from the doctest target, which don't have any doctests. Alternatively, the Pytest docs mention a pytest plugin that allows customizing the return code.

benjyw commented 3 years ago

Thanks for the useful info @seifertm ! Glad the workarounds work, although it would be even better if we had first-class support for running doctest directly on source files without the hackery. If anyone reading this wants to add that support we'd be happy to provide guidance. I don't think it'd be too hard to add it as a "linter", for example.

Eric-Arellano commented 2 years ago

We have had a big conceptual change with pants 2.8 distinguishing between atom targets and generator target: See "File-level metadata with overrides" in https://blog.pantsbuild.org/introducing-pants-2-8/

That makes me wonder if we should implement like this?

# Every file in dir must have doctests
python_sources(doctests=True)

# Or, get specific
python_sources(
  overrides={
     ("strutil.py", "docutil.py"): {"doctests": True},
  },
)

I think that I prefer that over having a dedicated target for doctest, which results in many issues like having to duplicate the dependencies and interpreter_constraints field.

jgarte commented 1 year ago

Hi,

Is there any updates on this feature? I'm very much interested in it! Is there any way I can help?

benjyw commented 1 year ago

Hi! There has been no progress, but since doctest is built in to the stdlib, I have a feeling this would be really straightforward to implement as a linter. If you're interested, we can help you along!

jgarte commented 1 year ago

Sure, where should I start? Look at how other linters are implemented?

benjyw commented 1 year ago

Yeah, there are so many Python linters, looking at their implementation will give you a sense of the commonality, plus there is this guide: https://www.pantsbuild.org/docs/plugins-lint-goal

Note that your case will be simpler than those other linters, because you can entirely ignore the "how to install the linter tool" part, as doctest is built in!

cognifloyd commented 1 year ago

doctest as a linter? It's very much a test thing, not a lint thing. I think running it with linters would be surprising.

So, maybe a new doctest backend that adds the doctest field to python_sources. Then it can use a FieldSet to select them when the test goal runs. We can have parallel linters, and each language can have its own test framework, so why not multiple test systems in a given language?

benjyw commented 1 year ago

I'm not familiar enough with doctest to say, but if it makes more sense to have it run under test then yeah, that sounds right.

kaos commented 1 year ago

Yeah, I agree with @cognifloyd here (as the name hints at, it runs tests) as doctest is a test framework where you write your test cases in docstrings directly in your sources. https://docs.python.org/3/library/doctest.html

Eric-Arellano commented 1 year ago

Agreed on it being in test, and I still like my proposal to set doctests=True on python_source and python_sources targets; although there are some complexities like if we want extra_test_env_vars to be on python_source now.

@jgarte thanks for looking into this! Probably the most helpful thing is to get a firm understanding of how to run doctest without Pants. What command would you invoke on the CLI, for example? Are you using the standard library, vs a library like Pytest? Once that is figured out, it becomes fairly mechanical to port to Pants and we can help give tips on how to do this.

jgarte commented 1 year ago

What command would you invoke on the CLI, for example?

python3 -m doctest -f -v foo.py

Are you using the standard library, vs a library like Pytest?

I am interested in running doctest on production code in a monorepo.

Once that is figured out, it becomes fairly mechanical to port to Pants and we can help give tips on how to do this.

@Eric-Arellano Knowing the above what do you recommend I do next? Do you have an example I can use as a starting template for implementing this?

benjyw commented 1 year ago

At a high level, I think we're looking at:

  1. Set up plugin boilerplate (see https://www.pantsbuild.org/docs/plugins-overview and @AlexTereshenkov's upcoming tutorial at https://github.com/pantsbuild/pants/pull/17732)
  2. Look at the implementation of the pytest runner to get an idea of how to plug in to the generic test goal (basically by returning a TestResult for a batch of files)
  3. Create a plugin rule that takes a batch of PythonSource and returns a TestResult (creating all the ancillary types as needed). At first that TestResult can be a dummy. We can plug in the actual doctest run later. The harder part, in this case, is getting all the plugin mechanisms to line up.
kaos commented 1 year ago

I think this could initially just be an option on the python_test target, as pytest has support for running doctests (I've not tried it, though, so unsure how it would integrate with regular tests..)

 --doctest-modules     run doctests in all .py modules
stuhood commented 1 year ago

I think this could initially just be an option on the python_test target, as pytest has support for running doctests (I've not tried it, though, so unsure how it would integrate with regular tests..)

 --doctest-modules     run doctests in all .py modules

Not unless pytest was told which transitive dependency modules of the actual test to test: otherwise, each pytest process would re-run the transitive doctests.