numpy / numpy

The fundamental package for scientific computing with Python.
https://numpy.org

ENH: support "python setup.py build_sphinx" and creation of man files #20229

Closed kloczek closed 2 years ago

kloczek commented 2 years ago

Describe the issue:

First, the setuptools<>sphinx integration is blocked. The patch below fixes that:

--- a/setup.py~ 2021-08-15 19:15:47.000000000 +0100
+++ b/setup.py  2021-10-27 22:08:13.533729354 +0100
@@ -264,9 +264,9 @@
     # fine as they are, but are usually used together with one of the commands
     # below and not standalone.  Hence they're not added to good_commands.
     good_commands = ('develop', 'sdist', 'build', 'build_ext', 'build_py',
-                     'build_clib', 'build_scripts', 'bdist_wheel', 'bdist_rpm',
-                     'bdist_wininst', 'bdist_msi', 'bdist_mpkg', 'build_src',
-                     'bdist_egg')
+                     'build_clib', 'build_scripts', 'build_sphinx', 'bdist_wheel',
+                     'bdist_rpm', 'bdist_wininst', 'bdist_msi', 'bdist_mpkg',
+                     'build_src', 'bdist_egg')

     for command in good_commands:
         if command in args:
@@ -328,9 +328,6 @@
               - `git clean -Xdf` (cleans all versioned files, doesn't touch
                                   files that aren't checked into the git repo)
             """,
-        build_sphinx="""
-            `setup.py build_sphinx` is not supported, use the
-            Makefile under doc/""",
         flake8="`setup.py flake8` is not supported, use flake8 standalone",
         )
     bad_commands['nosetests'] = bad_commands['test']
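
For readers unfamiliar with how setup.py rejects commands, the gating that this patch modifies can be sketched roughly as follows. This is a simplified illustration, not NumPy's actual code: `check_command` is a hypothetical name, and the tuple and dict are trimmed to a few entries in their pre-patch state.

```python
# Simplified, hypothetical sketch of the command gating in setup.py.
# good_commands and bad_commands are names taken from the patch;
# check_command and the shortened contents are illustrative assumptions.
good_commands = ('develop', 'sdist', 'build', 'build_ext', 'build_py')

bad_commands = {
    'build_sphinx': "`setup.py build_sphinx` is not supported, use the "
                    "Makefile under doc/",
    'flake8': "`setup.py flake8` is not supported, use flake8 standalone",
}


def check_command(args):
    """Return (allowed, message) for a list of setup.py arguments."""
    for command in good_commands:
        if command in args:
            return True, None
    for command, message in bad_commands.items():
        if command in args:
            return False, message
    # This sketch lets unrecognized commands through; the real script may not.
    return True, None
```

With this shape, moving `build_sphinx` from `bad_commands` into `good_commands` is all the patch needs to do to stop setup.py from refusing the command.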

Reproduce the code example:

Apply the above patch, then run "python setup.py build_sphinx -b man".

Error message:

With the above patch, sphinx fails with:

+ /usr/bin/python3 setup.py build_sphinx -b man --build-dir build/sphinx
Running from numpy source directory.
Cythonizing sources
numpy/random/_bounded_integers.pxd.in has not changed
numpy/random/_bounded_integers.pyx.in has not changed
numpy/random/_common.pyx has not changed
numpy/random/_generator.pyx has not changed
numpy/random/_mt19937.pyx has not changed
numpy/random/_pcg64.pyx has not changed
numpy/random/_philox.pyx has not changed
numpy/random/_sfc64.pyx has not changed
numpy/random/bit_generator.pyx has not changed
numpy/random/mtrand.pyx has not changed
Processing numpy/random/_bounded_integers.pyx
blas_opt_info:
blas_mkl_info:
customize UnixCCompiler
  libraries mkl_rt not found in ['/usr/local/lib64', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/']
  NOT AVAILABLE

blis_info:
  libraries blis not found in ['/usr/local/lib64', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/']
  NOT AVAILABLE

openblas_info:
C compiler: /usr/bin/gcc -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -D_GNU_SOURCE -fPIC -fwrapv -ffat-lto-objects -D_GNU_SOURCE -fPIC -fwrapv -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -D_GNU_SOURCE -fPIC -fwrapv -ffat-lto-objects -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -D_GNU_SOURCE -fPIC -fwrapv -ffat-lto-objects -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -fPIC

creating /tmp/tmpawot2x9h/tmp
creating /tmp/tmpawot2x9h/tmp/tmpawot2x9h
compile options: '-c'
gcc: /tmp/tmpawot2x9h/source.c
/usr/bin/gcc /tmp/tmpawot2x9h/tmp/tmpawot2x9h/source.o -L/usr/lib64 -lflexiblas -o /tmp/tmpawot2x9h/a.out
  FOUND:
    libraries = ['flexiblas', 'flexiblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

  FOUND:
    libraries = ['flexiblas', 'flexiblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

lapack_opt_info:
lapack_mkl_info:
  libraries mkl_rt not found in ['/usr/local/lib64', '/usr/local/lib', '/usr/lib64', '/usr/lib', '/usr/lib/']
  NOT AVAILABLE

openblas_lapack_info:
C compiler: /usr/bin/gcc -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -D_GNU_SOURCE -fPIC -fwrapv -ffat-lto-objects -D_GNU_SOURCE -fPIC -fwrapv -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -D_GNU_SOURCE -fPIC -fwrapv -ffat-lto-objects -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -D_GNU_SOURCE -fPIC -fwrapv -ffat-lto-objects -O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -fPIC

creating /tmp/tmpm9m_5kw9/tmp
creating /tmp/tmpm9m_5kw9/tmp/tmpm9m_5kw9
compile options: '-c'
gcc: /tmp/tmpm9m_5kw9/source.c
/usr/bin/gcc /tmp/tmpm9m_5kw9/tmp/tmpm9m_5kw9/source.o -L/usr/lib64 -lflexiblas -o /tmp/tmpm9m_5kw9/a.out
  FOUND:
    libraries = ['flexiblas', 'flexiblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

  FOUND:
    libraries = ['flexiblas', 'flexiblas']
    library_dirs = ['/usr/lib64']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

Warning: attempted relative import with no known parent package
/usr/lib64/python3.8/distutils/dist.py:274: UserWarning: Unknown distribution option: 'define_macros'
  warnings.warn(msg)
running build_sphinx
Running Sphinx v4.2.0

Configuration error:
There is a programmable error in your configuration file:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/sphinx/config.py", line 328, in eval_config_file
    exec(code, namespace)
  File "/home/tkloczko/rpmbuild/BUILD/numpy-1.21.3/doc/source/conf.py", line 63, in <module>
    replace_scalar_type_names()
  File "/home/tkloczko/rpmbuild/BUILD/numpy-1.21.3/doc/source/conf.py", line 55, in replace_scalar_type_names
    typ = getattr(numpy, name)
AttributeError: module 'numpy' has no attribute 'byte'

NumPy/Python version information:

1.21.3

melissawm commented 2 years ago

Hello @kloczek, is there a reason for your suggestion?

kloczek commented 2 years ago

You mean that patch? Indeed there is a reason: it looks like setuptools<>sphinx can be used to build documentation in a chosen format, so blocking that does not make too much sense. The message module 'numpy' has no attribute 'byte' looks like a numpy code issue, so after fixing that it should be possible to generate documentation.

mattip commented 2 years ago

Sorry, I do not understand. The way to generate documentation, as the error message you propose to delete states, is to use the Makefile under doc/.

You cannot build documentation without installing numpy since the sphinx build needs docstrings from the c-extension modules that do not exist in a source checkout. The build_sphinx build target is disabled on purpose.

blocking that does not make too much sense

It does make sense in that building/installing numpy is non-trivial, and doing that on top of building documentation is too many layers.

kloczek commented 2 years ago

You cannot build documentation without installing numpy since the sphinx build needs docstrings from the c-extension modules that do not exist in a source checkout.

The setuptools<>sphinx integration allows building documentation without installing the module.

kloczek commented 2 years ago

Could you please at least try to test what I've proposed?

mattip commented 2 years ago

The setuptools<>sphinx integration allows building documentation without installing the module.

But your patch error says otherwise: typ = getattr(numpy, name) fails since there is no numpy.

The NumPy documentation uses meta programming to dig docstrings out of the installed module. If the module is not installed, the documentation will be incomplete. For instance, how can you generate the API references for nditer without installing NumPy? The text comes from doing the equivalent of help(numpy.nditer) and parsing the output. This documentation is hidden away and not available without the numpy module installed.
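
The dependency on an importable module can be illustrated with a small sketch. It is not the code NumPy's documentation build uses; `fetch_docstring` is a hypothetical helper that only shows why the docstring is unreachable when the import fails. It handles a single `module.attribute` level only.

```python
import importlib
import inspect


def fetch_docstring(dotted_name):
    """Return the docstring for 'module.attr', or None if it is unavailable."""
    module_name, _, attr = dotted_name.partition(".")
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Without an importable module there is nothing to introspect.
        return None
    obj = getattr(module, attr, None) if attr else module
    return inspect.getdoc(obj) if obj is not None else None


# With NumPy installed, fetch_docstring("numpy.nditer") returns the text
# that help(numpy.nditer) would display; without it, the result is None.
```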

kloczek commented 2 years ago

The module is available in build/ after the setuptools build command.

The issue is that I get exactly the same error even with numpy installed.
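
One way to narrow this down is to check which `numpy` the interpreter actually imports and whether it exposes the attribute named in the traceback. The helper below is a hypothetical diagnostic, not something from the NumPy tree, and the thread does not confirm that a wrong import is the cause.

```python
import importlib


def describe_import(module_name, attribute):
    """Report where a module was loaded from and whether it has an attribute."""
    module = importlib.import_module(module_name)
    return {
        # A missing or None __file__ suggests a namespace package, i.e. a
        # bare directory was picked up instead of a built package.
        "file": getattr(module, "__file__", None),
        "has_attribute": hasattr(module, attribute),
    }


# Example, run from the same directory and environment as the Sphinx build:
# print(describe_import("numpy", "byte"))
```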

mattip commented 2 years ago

This paragraph tries to describe the workflow. Either pip install -e . or pip install . are common patterns, especially if used under a virtualenv or conda environment.

kloczek commented 2 years ago

Quote from my python-numpy.spec file:

%build
OPENBLAS=%{_libdir} \
BLAS=%{_libdir} \
LAPACK=%{_libdir} \
%py3_build egg_info

export PYTHONPATH=$PWD/build/$(cd build; ls -d1 lib*)
%py3_build_sphinx_man

As you can see, before rendering the man page (the %py3_build_sphinx_man macro) I'm calling %py3_build egg_info, which is the equivalent of pip install -e .

Nevertheless, even if egg_info were not called, python would be able to find the numpy module metadata from the files of the installed python-numpy rpm package. As I wrote, I get exactly the same error when I'm building the numpy module as an rppp package with exactly the same python-numpy rpm package already installed (which has the part about rendering documentation as a man page disabled).
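
The PYTHONPATH line in the spec fragment above can be expressed in Python as a rough sketch. The function names are hypothetical, and the `build/lib*` layout is taken from the spec file rather than from any NumPy documentation.

```python
import glob
import os


def find_build_lib(build_dir="build"):
    """Return the first build/lib* directory, or None if there is none."""
    candidates = sorted(
        path for path in glob.glob(os.path.join(build_dir, "lib*"))
        if os.path.isdir(path)
    )
    return os.path.abspath(candidates[0]) if candidates else None


def env_with_pythonpath(build_dir="build"):
    """Copy os.environ and prepend the build lib directory to PYTHONPATH."""
    env = dict(os.environ)
    lib_dir = find_build_lib(build_dir)
    if lib_dir is not None:
        existing = env.get("PYTHONPATH")
        env["PYTHONPATH"] = (
            lib_dir if not existing else lib_dir + os.pathsep + existing
        )
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` so that the documentation build imports the freshly built module rather than whatever else is on the path.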

mattip commented 2 years ago

I think you want export PYTHONPATH=$PWD(ls -dl build/testenv/lib/*/site-packages)

Could you describe the big picture of what you are trying to do and what you have already explored? What is an rppp package? What are py3_build egg_info and py3_build_sphinx_man?

If your goal is to create man pages from the numpy documentation, I think the path to having a one-step "create man pages" command is going to be quite involved for NumPy. Does creating man pages work if you use the normal workflow, before we get to using spec files?

mattip commented 2 years ago

I changed the title to better reflect what we are discussing, is the change correct?

kloczek commented 2 years ago

Yep, now it is better. So what can I try to do (test/diagnose) on this subject?

kloczek commented 2 years ago

Could you describe the big picture of what you are trying to do and what you have already explored? What is an rppp package? What are py3_build egg_info and py3_build_sphinx_man?

My bad. Not rppp but rpm :) Sorry. I'm working on a Linux/Solaris distro in which, for every package with a python module that uses sphinx in its source tree, I'm building the module documentation as a section 3 man page. Each module man page is stored under the name 'python-.3'. With that it is possible to use, for example, bash or fish shell tab completion to quickly find the documentation for a given python module:

[tkloczko@ss-desktop numpy-1.21.4]$ man python-<tab><tab>
Display all 103 possibilities? (y or n)
python-anyiodoc                           python-flask                              python-nbclient                           python-sniffio
python-argon2-cffi                        python-flit                               python-nbformat                           python-sortedcontainers
python-asgi                               python-future                             python-outcome                            python-sphinxcontrib-mermaid
python-async_generator                    python-gidocgen                           python-parso                              python-sphinx_rtd_theme
python-attrs                              python-html5lib                           python-path                               python-sqlparse
python-automat                            python-hyperlink                          python-pathspec                           python-terminado
python-babel                              python-hypothesis                         python-pep517                             python-testpath
python-backcall                           python-importlib_resources                python-platformdirs                       python-toolz
python-backports.entry-points-selectable  python-ipykernel                          python-pluggy                             python-tornado
python-beautifulsoup                      python-itsdangerous                       python-prompt_toolkit                     python-tox
python-black                              python-jedi                               python-psutil                             python-traitlets
python-bleach                             python-Jinja                              python-ptyprocess                         python-trio
python-build                              python-jsonschema                         python-py                                 python-trustme
python-cffi                               python-jupyter_client                     python-pyenchant                          python-twisted
python-charset-normalizer                 python-jupyter_core                       python-pygments                           python-urllib3
python-click                              python-lark                               python-pyparsing                          python-validators
python-coveragepy                         python-lxml                               python-pyrsistent                         python-virtualenv
python-cython                             python-mako                               python-pytest                             python-wcwidth
python-dateutil                           python-markupsafe                         python-pytest-cov                         python-webencodings
python-dbus                               python-mdit-py-plugins                    python-pytest-runner                      python-websocket-client
python-decorator                          python-metaextract                        python-pytest-trio                        python-werkzeug
python-deprecation                        python-mistune                            python-pyxdg                              python-wheel
python-dnspython                          python-mock                               python-requests                           python-yarl
python-entrypoints                        python-multidict                          python-semantic-version                   python-zipp
python-execnet                            python-multipledispatch                   python-setuptools                         python-zope-event
python-filelock                           python-myst-parser                        python-six

Here is the scale I've been able to reach so far:

[tkloczko@ss-desktop SPECS]$ ls -1 python-*| wc -l; grep ^%py3_build_sphinx_man python-*|wc -l
684
345

That rpm macro has a very simple definition:

[tkloczko@ss-desktop numpy-1.21.4]$ rpm -E %py3_build_sphinx_man
\
        PBR_VERSION=%{version} \
        SETUPTOOLS_SCM_PRETEND_VERSION=%{version} \
        /usr/bin/python3 setup.py build_sphinx -b man --build-dir build/sphinx

The definition of the %py3_build macro is:

[tkloczko@ss-desktop numpy-1.21.4]$ rpm -E %py3_build
\

CFLAGS="-O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none";
CXXFLAGS="-O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none";
FFLAGS="-O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -I/usr/lib64/gfortran/modules";
FCFLAGS="-O2 -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fdata-sections -ffunction-sections -flto=auto -flto-partition=none -I/usr/lib64/gfortran/modules";
LDFLAGS="-Wl,-z,relro -Wl,--as-needed -Wl,--gc-sections -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -flto=auto -flto-partition=none -fuse-linker-plugin";
CC="/usr/bin/gcc"; CXX="/usr/bin/g++"; FC="/usr/bin/gfortran";
AR="/usr/bin/gcc-ar"; NM="/usr/bin/gcc-nm"; RANLIB="/usr/bin/gcc-ranlib";
export CFLAGS CXXFLAGS FFLAGS FCFLAGS LDFLAGS CC CXX FC AR NM RANLIB;
 \
        PBR_VERSION=%{version} \
        SETUPTOOLS_SCM_PRETEND_VERSION=%{version} \
        /usr/bin/python3 setup.py  build --executable="/usr/bin/python3 -s"

If your goal is to create man pages from the numpy documentation, I think the path to having a one-step "create man pages" command is going to be quite involved for NumPy. Does creating man pages work if you use the normal workflow, before we get to using spec files?

The point is that if the setuptools<>sphinx integration (https://www.sphinx-doc.org/en/master/usage/advanced/setuptools.html) is implemented correctly then, in the simplest case, not much needs to be done in the source tree of a given python module. In the case of numpy, using that integration is intentionally blocked for some reason (still unknown to me). So far 'numpy' is the only module where I've found such a blockage. In many cases when I've reported new sphinx 4.x issues or warnings, module maintainers didn't even know that such a possibility exists OOTB. In most cases module maintainers have so far been using a special Makefile to define the rendering process for a given format (in a few cases, after I pointed out that such a Makefile is redundant, it was removed because what the setuptools<>sphinx integration offers is simpler).

mattip commented 2 years ago

NumPy tries to take a long-term view of making breaking changes like removing the doc building Makefile, and is slow to add new features like man page support. Once we provide a mechanism to create man pages, we would have to support the inevitable issues their formatting, content, and presentation would raise. We have a NEP process for making large changes to the project; this kind of change may need such documentation and community buy-in.

I would try to divide this into parts:

Changes like this need a champion who is willing to drive the work forward: engage in discussion with the community, bring evidence that the change is desirable, break the project into consumable chunks and submit the implementation for review via PRs once there is general support for the idea. There are doc team meetings every other week, it may be helpful to engage with them.

mattip commented 2 years ago

Looking at the list of generated pages, I wonder what information is presented in the python-tornado man pages, and how it compares to the official documentation at https://www.tornadoweb.org/en/stable/. How would one dive into the subpages, like the one for tornado.auth? Is your distro providing built-in support for links? If not, what added value is provided by having sphinx generate the page?

If the goal is to provide a one-page summary of numpy with no links, perhaps we could provide such a man page preformatted and written manually, with no need to run sphinx, that would basically point to other places to read documentation.

kloczek commented 2 years ago

Looking at the list of generated pages, I wonder what information is presented in the python-tornado man pages

Here is an attachment with the output of man python-tornado > python-tornado.txt: python-tornado.txt

kloczek commented 2 years ago

Just in case: among those ~350 packages with python modules with sphinx documentation that I have now, probably only two or three modules render more than one man page (the conf.py man_pages = [] entry allows defining multiple man pages as the output).
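
As an illustration of that option, a conf.py fragment defining two man pages might look like the sketch below. Every value is a placeholder invented for this example and is not taken from any real project.

```python
# Hypothetical conf.py fragment; all names and paths are placeholders.
# Each tuple is (startdocname, name, description, authors, section).
man_pages = [
    (
        "index",
        "python-example",
        "Example Module Documentation",
        ["Example Authors"],
        3,
    ),
    (
        "reference/index",
        "python-example-reference",
        "Example API Reference",
        ["Example Authors"],
        3,
    ),
]
```

Each entry produces one output file, so a project that lists a single tuple gets a single man page.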

kloczek commented 2 years ago

  • If we do decide to support man pages, then we need to decide whether to add the setup.py support you are asking for. This would have a few parts:

You don't need to support man page output specifically. All that is needed is a correctly implemented setuptools<>sphinx integration (let's call it passive sphinx support). Special support is necessary only when a module provides a sphinx extension which could be used when rendering other modules' documentation (active sphinx support). As far as I remember, numpy does not offer such a sphinx extension. In other words, not much needs to be provided on the numpy side.

  • engaging in a discussion of supporting new targets in setup.py. The python community in general would like to move away from distutils and setup.py. Perhaps investing effort into migrating to a new build system would be more fruitful, in light of the next point

If you decide in the future to switch from setuptools to, for example, pure pep517/build, poetry or flit, all that needs to be left is a minimal setup.py file with content like:

from setuptools import setup
setup()

  • converting the Makefile, with its specialized targets and workflows, into a form usable by the different consumers of the current Makefile: regular users, CI tasks, gitpod integration, downstream packagers, ....

All documentation workflows can be defined in the conf.py file. Sphinx supports interaction with tools like doxygen, md/rst parsers, plain text input files and markup added in comments in source code files. In other words, if someone has the impression that a Makefile is necessary to define multistage documentation rendering processes, he/she is 100% right .. it is only an impression.

  • trivially add support for python setup.py build_sphinx once all the other parts are in place.

In the most trivial case, the only setup.py content which needs to be present in the module source tree is

from setuptools import setup
setup()

With such content, the setuptools<>sphinx integration by default tries to use the docs/conf.py file and passes it to the executed sphinx-build command.

kloczek commented 2 years ago

If you decide in the future to switch from setuptools to, for example, pure pep517/build, poetry or flit

Just in case someone asks: at the moment only setuptools provides integration with sphinx. I've personally asked the maintainers of pep517, build, poetry and flit (over publicly accessible issue tickets) whether they are going to implement an integration similar to the one setuptools has, and generally they've refused to take care of that part of typical build operations. From that point of view I would recommend sticking with setuptools and only moving from the current setup.py to setup.cfg/pyproject.toml files (but there is no rush with that transformation).

mattip commented 2 years ago

I still think creating a numpy-sanctioned man page needs more community buy-in than a two-person discussion.

Thanks for the tornado example. There is an open issue about creating such a single-page NumPy reference. Personally, I don't get it, but if someone feels passionately that this is what they need, and is willing to put in the work to make it happen, then maybe we should support it. If that happens we could make that one-pager into a (huge) man page. In the meantime, if your distro package workflow requires a man page for every package, I would suggest a static placeholder that defers to numpy.org, no sphinx or building needed.
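
Such a static placeholder could be written by hand, or generated with a few lines of Python as in the sketch below. The helper name, the page name `python-numpy` (which follows the packager's naming scheme from earlier in the thread) and the summary text are assumptions for illustration. Only the standard `.TH` and `.SH` man macros are used.

```python
def placeholder_man_page(name, section, summary, url):
    """Return minimal roff source for a man page that points at online docs."""
    title = name.upper()
    lines = [
        f'.TH "{title}" "{section}"',
        ".SH NAME",
        rf"{name} \- {summary}",
        ".SH DESCRIPTION",
        "The full documentation for this module is maintained online.",
        ".SH SEE ALSO",
        url,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values; adjust the name and text to the packaging convention.
page = placeholder_man_page(
    "python-numpy", 3, "NumPy documentation pointer", "https://numpy.org"
)
```

Writing `page` to a file such as `python-numpy.3` gives a page that `man` can display without running Sphinx or importing NumPy.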

As for moving away from the Makefile: perhaps sphinx + conf.py can do all that the Makefile does. It would require someone to submit PRs to fold those make targets back into sphinx. Then the Makefile would become an empty shell of lines like the qthelp target, although proper attention would need to be given to dependencies on the generate and version-check targets:

qthelp: generate version-check
    $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) build/qthelp $(FILES)

As a bystander to all the churn around build tools, I have to say I agree with the tool maintainers. Building documentation seems to be a separate, independent task from building and installing a package. Often building the package is a prerequisite for building the documentation. I would think your package spec could just as easily invoke sphinx-build as it could setup.py build_sphinx.
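
A direct invocation of that kind could look like the sketch below, which only assembles the command line. The helper name is hypothetical; the default paths reuse `doc/source` and `build/sphinx` from earlier in this thread, and the `man` subdirectory is an assumption. Whether the build succeeds still depends on an importable numpy.

```python
import sys


def sphinx_man_command(source_dir="doc/source", build_dir="build/sphinx/man"):
    """Build the argument list for a direct Sphinx man-page build."""
    # Equivalent to: python -m sphinx -b man SOURCE_DIR BUILD_DIR
    return [sys.executable, "-m", "sphinx", "-b", "man", source_dir, build_dir]


# To run it (requires Sphinx to be installed):
# import subprocess
# subprocess.run(sphinx_man_command(), check=True)
```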

kloczek commented 2 years ago

Clarification: I'm not interested in the content of the documentation. I'm only interested in improving the procedure by which the existing documentation can be rendered. I'm also leaving the decision about generating one page or multiple pages to you :)

Outside of the pure python area, building documentation (where it is available) is fully integrated into tooling like GNU autotools, meson, cmake, scons and many more. In that respect setuptools is the 100% analogue among python module build/maintenance frameworks.

kloczek commented 2 years ago

So what do you think about that AttributeError: module 'numpy' has no attribute 'byte' error?

mattip commented 2 years ago

That error is not the problem. The problem is the request to support a man page format for the numpy documentation. Unless we can discuss the utility of having numpy documentation in that format, I find it hard to support any effort to enable its creation.

kloczek commented 2 years ago

As I wrote, the roff format used by the man command is still the most common documentation format on all Unix systems.

[tkloczko@ss-desktop SPECS]$ grep ^%py3_build_sphinx_man python-*.spec | wc -l; ls -1 python-*.spec|wc -l
367
711

As you can see, I was able to use that approach in more than half of all my packaged python modules (all the others do not have documentation at all, or it is just some residual docs in some test files) and only 2 or 3 have some issues (mostly caused by incorrect sphinx use). It would be really nice to have the possibility to generate numpy documentation as a man page .. please.

melissawm commented 2 years ago

@kloczek as @mattip suggested, maybe it would be nice to write up a proposal for this and send it to the mailing list, to maybe involve more people in the conversation. I would suggest getting clarification on the format of the man page you are expecting and the changes that would need to happen for this to work, and then it might be possible to consider a concrete proposal.

kloczek commented 2 years ago

Hmm .. why not :) Can you point me to an example of such a proposal?

melissawm commented 2 years ago

Sure! Maybe look at https://mail.python.org/archives/list/numpy-discussion@python.org/

One example: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/EVQW2PO64464JEN3RQXSCDP32RQDIQFW/

kloczek commented 2 years ago

A small proposal in the form of the patch below:

--- a//setup.cfg~       2021-12-20 01:19:55.000000000 +0000
+++ b//setup.cfg        2022-01-01 05:41:31.975688765 +0000
@@ -12,3 +12,6 @@
 versionfile_build = numpy/_version.py
 tag_prefix = v
 parentdir_prefix = numpy-
+
+[build_sphinx]
+source-dir = doc/source

Without that, the sphinx<>setuptools integration by default tries to render documentation from doc/neps/ instead of doc/source/.
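
To check what such a section resolves to before running a build, the setup.cfg text can be parsed with the standard library. This is only a sketch: `sphinx_source_dir` is a hypothetical helper, and it returns None when the section is absent rather than guessing at the integration's default.

```python
import configparser

# The section added by the patch above, as plain text.
SETUP_CFG_FRAGMENT = """
[build_sphinx]
source-dir = doc/source
"""


def sphinx_source_dir(cfg_text):
    """Return the [build_sphinx] source-dir value, or None if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("build_sphinx", "source-dir", fallback=None)
```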

rgommers commented 2 years ago

A small proposal in the form of the patch below:

Does that actually get you a useful man page? So well-formatted page with understandable and useful content in it? What does it look like? Can you add the content here if it's short? Maybe a description or a screenshot otherwise?

kloczek commented 2 years ago

Does that actually get you a useful man page? So well-formatted page with understandable and useful content in it? What does it look like? Can you add the content here if it's short? Maybe a description or a screenshot otherwise?

In the case of a man page for the numpy module I cannot answer that question or show you any sample of that man page because (if you look at the top of this ticket) it is still not possible to generate documentation in such a format. This is why I created this ticket with the subject sphinx fails (which has been renamed, but not by me).

Nevertheless, the setuptools<>sphinx integration uses the docs/ directory by default, and if the location of the conf.py file is different that should be changed by adding the section which you can see in the proposed patch. https://www.sphinx-doc.org/en/master/usage/advanced/setuptools.html

rgommers commented 2 years ago

In the case of a man page for the numpy module I cannot answer that question or show you any sample of that man page because (if you look at the top of this ticket) it is still not possible to generate documentation in such a format.

Well, then your proposal 3 comments up isn't an actual proposal, right? It's a 2 line patch that you say does not work yet ....

You've had very detailed responses from @mattip and @melissawm. The answer is clear for now: we do not support man pages until there's a demonstrated need and a clear proposal to add support. That work is, unfortunately, on you to do. None of us have a need for man pages, nor have we ever seen users use man pages for other similar Python packages. To us, man pages are for Unix-style CLI tools, not for Python projects.

If it's really a 2 line patch and then all is good, I think we're happy to just merge it and move on. But more comments are not really going to convince us to do that work right now, it's up to someone who cares about this.

kloczek commented 2 years ago

None of us have a need for man pages, nor have we ever seen users use man pages for other similar Python packages. To us, man pages are for Unix-style CLI tools, not for Python projects.

Please choose a random man page from the list which I dropped in https://github.com/numpy/numpy/issues/20229#issuecomment-977695959 and I'll attach it or copy and paste it here (if the document is short).

rgommers commented 2 years ago

I don't see any scientific libraries; closest are ipykernel and multipledispatch perhaps.

kloczek commented 2 years ago
PYTHON-MULTIPLEDISPATCH(3)                                                 Multiple Dispatch                                                 PYTHON-MULTIPLEDISPATCH(3)

NAME
       python-multipledispatch - Multiple Dispatch Python Module Documentation

       Contents:

DESIGN
   Types
          signature  :: [type]
                        a list of types

          Dispatcher :: {signature: function}
                        A mapping of type signatures to function implementations

          namespace  :: {str: Dispatcher}
                        A mapping from function names, like 'add', to Dispatchers

   Dispatchers
       A  Dispatcher  object  stores and selects between different implementations of the same abstract operation. It selects the appropriate implementation based on a
       signature, or list of types. We build one dispatcher per abstract operation.

          f = Dispatcher('f')

       At the lowest level we build normal Python functions and then add them to the Dispatcher.

          >>> def inc(x):
          ...     return x + 1

          >>> def dec(x):
          ...     return x - 1

          >>> f.add((int,), inc)    # f increments integers
          >>> f.add((float,), dec)  # f decrements floats

          >>> f(1)
          2

          >>> f(1.0)
          0.0

       Internally Dispatcher.dispatch selects the function implementation.

          >>> f.dispatch(int)
          <function __main__.inc>

          >>> f.dispatch(float)
          <function __main__.dec>

       For notational convenience dispatchers leverage Python's decorator syntax to register functions as we define them.

          f = Dispatcher('f')

          @f.register(int)
          def inc(x):
              return x + 1

          @f.register(float)
          def dec(x):
              return x - 1

       This is equivalent to the form above.  It adheres to the standard implemented by functools.singledispatch in Python 3.4 (although the "functional form" of  reg‐
       ister is not supported).

       As in singledispatch, the register decorator returns the undecorated function, which enables decorator stacking.

          @f.register(str)
          @f.register(tuple)
          def rev(x):
              return x[::-1]

       The Dispatcher creates a detailed docstring automatically.  To add a description of the multimethod itself, provide it when creating the Dispatcher.

          >>> f = Dispatcher('f', doc="Do something to the argument")

          >>> @f.register(int)
          ... def inc(x):
          ...     "Integers are incremented"
          ...     return x + 1

          >>> @f.register(float)
          ... def dec(x):
          ...     "Floats are decremented"
          ...     return x - 1

          >>> @f.register(str)
          ... @f.register(tuple)
          ... def rev(x):
          ...     # no docstring
          ...     return x[::-1]

          >>> print(f.__doc__)
          Multiply dispatched method: f

          Do something to the argument

          Inputs: <float>
          ----------------
          Floats are decremented

          Inputs: <int>
          --------------
          Integers are incremented

          Other signatures:
              str
              tuple

   Namespaces and dispatch
       The dispatch decorator hides the creation and manipulation of Dispatcher objects from the user.

          # f = Dispatcher('f')  # no need to create Dispatcher ahead of time

          @dispatch(int)
          def f(x):
              return x + 1

          @dispatch(float)
          def f(x):
              return x - 1

       The dispatch decorator uses the name of the function to select the appropriate Dispatcher object to which it adds the new signature/function. When it encounters
       a new function name it creates a new Dispatcher object and stores name/Dispatcher pair in a namespace for future reference.

          # This creates and stores a new Dispatcher('g')
          # namespace['g'] = Dispatcher('g')
          # namespace['g'].add((int,), g)
          @dispatch(int)
          def g(x):
              return x ** 2

       We store this new Dispatcher in a namespace. A namespace is simply a dictionary that maps function names like 'g' to dispatcher objects like Dispatcher('g').

       By default dispatch uses the global namespace in multipledispatch.core.global_namespace. If several projects use this global namespace unwisely  then  conflicts
       may arise, causing difficult to track down bugs. Users who desire additional security may establish their own namespaces simply by creating a dictionary.

          my_namespace = dict()

          @dispatch(int, namespace=my_namespace)
          def f(x):
              return x + 1

       To establish a namespace for an entire project we suggest the use of functools.partial to bind the new namespace to the dispatch decorator.

          from multipledispatch import dispatch
          from functools import partial

          my_namespace = dict()
          dispatch = partial(dispatch, namespace=my_namespace)

          @dispatch(int)  # Uses my_namespace rather than the global namespace
          def f(x):
              return x + 1

METHOD RESOLUTION
       Multiple dispatch selects the function from the types of the inputs.

          @dispatch(int)
          def f(x):           # increment integers
              return x + 1

          @dispatch(float)
          def f(x):           # decrement floats
              return x - 1

          >>> f(1)            # 1 is an int, so increment
          2
          >>> f(1.0)          # 1.0 is a float, so decrement
          0.0

   Union Types
       Similarly to the builtin isinstance operation you specify multiple valid types with a tuple.

          @dispatch((list, tuple))
          def f(x):
              """ Apply ``f`` to each element in a list or tuple """
              return [f(y) for y in x]

          >>> f([1, 2, 3])
          [2, 3, 4]

          >>> f((1, 2, 3))
          [2, 3, 4]

   Abstract Types
       You can also use abstract classes like Iterable and Number in place of union types like (list, tuple) or (int, float).

          from collections.abc import Iterable

          # @dispatch((list, tuple))
          @dispatch(Iterable)
          def f(x):
              """ Apply ``f`` to each element in an Iterable """
              return [f(y) for y in x]

   Selecting Specific Implementations
       If multiple valid implementations exist then we use the most specific one. In the following example we build a function to flatten nested iterables.

          @dispatch(Iterable)
          def flatten(L):
              return sum([flatten(x) for x in L], [])

          @dispatch(object)
          def flatten(x):
              return [x]

          >>> flatten([1, 2, 3])
          [1, 2, 3]

          >>> flatten([1, [2], 3])
          [1, 2, 3]

          >>> flatten([1, 2, (3, 4), [[5]], [(6, 7), (8, 9)]])
          [1, 2, 3, 4, 5, 6, 7, 8, 9]

       Because strings are iterable they too will be flattened

          >>> flatten([1, 'hello', 3])
          [1, 'h', 'e', 'l', 'l', 'o', 3]

       We avoid this by specializing flatten to str. Because str is more specific than Iterable this function takes precedence for strings.

          @dispatch(str)
          def flatten(s):
              return s

          >>> flatten([1, 'hello', 3])
          [1, 'hello', 3]

       The multipledispatch project depends on Python's issubclass mechanism to determine which types are more specific than others.

   Multiple Inputs
       All of these rules apply when we introduce multiple inputs.

          @dispatch(object, object)
          def f(x, y):
              return x + y

          @dispatch(object, float)
          def f(x, y):
              """ Square the right hand side if it is a float """
              return x + y**2

          >>> f(1, 10)
          11

          >>> f(1.0, 10.0)
          101.0

   Variadic Dispatch
       multipledispatch supports variadic dispatch (including support for union types) as the last set of arguments passed into the function.

       Variadic signatures are specified with a single-element list containing the type of the arguments the function takes.

       For example, here's a function that takes a float followed by any number (including 0) of either int or str:

          @dispatch(float, [(int, str)])
          def float_then_int_or_str(x, *args):
              return x + sum(map(int, args))

          >>> float_then_int_or_str(1.0, '2', '3', 4)
          10.0

          >>> float_then_int_or_str(2.0, '4', 6, 8)
          20.0

   Ambiguities
       However ambiguities arise when different implementations of a function are equally valid

          @dispatch(float, object)
          def f(x, y):
              """ Square left hand side if it is a float """
              return x**2 + y

          >>> f(2.0, 10.0)
          ?

       Which result do we expect, 2.0**2 + 10.0 or 2.0 + 10.0**2? The types of the inputs satisfy three different implementations, two of which have equal validity

          input types:    float, float
          Option 1:       object, object
          Option 2:       object, float
          Option 3:       float, object

       Option 1 is strictly less specific than either options 2 or 3 so we discard it. Options 2 and 3 however are equally specific and so it is unclear which to use.

       To resolve issues like this multipledispatch inspects the type signatures given to it and searches for ambiguities. It then raises a warning like the following:

          multipledispatch/dispatcher.py:74: AmbiguityWarning:
          Ambiguities exist in dispatched function f

          The following signatures may result in ambiguous behavior:
              [object, float], [float, object]

          Consider making the following additions:

          @dispatch(float, float)
          def f(...)

       This  warning  occurs  when  you  write  the function and guides you to create an implementation to break the ambiguity. In this case, a function with signature
       (float, float) is more specific than either options 2 or 3 and so resolves the issue. To avoid this warning you should implement this new  function  before  the
       others.

          @dispatch(float, float)
          def f(x, y):
              ...

          @dispatch(float, object)
          def f(x, y):
              ...

          @dispatch(object, float)
          def f(x, y):
              ...

       If  you do not resolve ambiguities by creating more specific functions then one of the competing functions will be selected pseudo-randomly.  By default the se‐
       lection is dependent on hash, so it will be consistent during the interpreter session, but it might change from session to session.

AUTHOR
       Matthew Rocklin

COPYRIGHT
       2014, Matthew Rocklin

0.4.0                                                                         Jan 06, 2022                                                   PYTHON-MULTIPLEDISPATCH(3)

The ipykernel man page does not contain much, probably because of what I already reported in https://github.com/ipython/ipykernel/issues/732

PYTHON-IPYKERNEL(3)                                                          IPython Kernel                                                         PYTHON-IPYKERNEL(3)

NAME
       python-ipykernel - IPython Kernel Python Module Documentation

       This contains minimal version-sensitive documentation for the IPython kernel package.  Most IPython kernel documentation is in the IPython documentation.

       Contents:

       • genindex

       • modindex

       • search

AUTHOR
       IPython Development Team

COPYRIGHT
       2015, IPython Development Team

6.6                                                                           Dec 01, 2021                                                          PYTHON-IPYKERNEL(3)
kloczek commented 2 years ago

The ipykernel man page does not contain much, probably because of what I already reported in ipython/ipykernel#732

Nope, I was wrong. Even the online documentation has only a changelog. https://ipykernel.readthedocs.io/en/stable/

rgommers commented 2 years ago

Okay, so multipledispatch man page is just the html docs (without Table of Contents, see https://multiple-dispatch.readthedocs.io/en/latest/index.html) as plain text, all concatenated. The NumPy html docs are 1000+ pages, so that's not going to make much sense as a man page imho.

mattip commented 2 years ago

Above there is a link to man python-tornado>python-tornado.txt, which is a single 10,000 line document. Is there a discussion within your distro team about the need for python man page packages?

kloczek commented 2 years ago

Above there is a link to man python-tornado>python-tornado.txt, which is a single 10,000 line document

It is not the longest man page. IIRC the botocore man page has 40k+ lines. Still, with the search provided by a typical pager like less or more, it doesn't matter how long the page is :)

Is there a discussion within your distro team about the need for python man page packages?

Yes, and people are finding that useful when using it with tab completion and apropos/man -k. PS. Quoting from the gawk info pages: "Documentation is like sex. When it is good, it is really, really good. When it is bad, it is better than nothing".

rgommers commented 2 years ago

Yes, and people are finding that useful when using it with tab completion and apropos/man -k.

Okay, if it's useful and easy to support for us, then I'm happy to review a PR that adds support.

mattip commented 2 years ago

This is not a case of "when it is bad it is better than nothing" since there are at least 3 other alternatives: HTML help, PDF help, and the internal python help system (there are more). If it is not clear from my previous comments, I think providing man pages for python packages is a waste of developer time, packaging resources, and will confuse and frustrate users. I have not heard anything yet to convince me otherwise except for @kloczek 's comment that "Yes and people are finding that as useful", which I would be happy to read more about. I would be interested in seeing the use case where someone rejects using the internal python help system in favor of opening a separate terminal and doing man python-xxx. If they are already opening a terminal to get man help, they could just as easily open a browser and get the well-formatted HTML help. Maybe I am missing something, please help me see the light.

easy to support for us

A decision to support a new documentation format should not be just a "why not". We will have to maintain a new format, and support various rendering routines. While very minimal, the cost will not be zero going forward, and once we open the flood gates and give this an official OK, other projects will have to follow suit. I would like to see a public discussion on the mailing list at least, if not a full NEP before a PR. This is what @melissawm asked for a month ago, and has still not been done.

kloczek commented 2 years ago

A decision to support a new documentation format should not be just a

To be honest, numpy is the ONLY project I know of which has some limitation on the rendered format. I just counted, and it looks like sphinx supports 17 output formats. Do you really want to receive a proposal to enable a new format each time someone wants to use it? And/or do you really want to tell end users which documentation formats they can use? If yes .. that is a bit odd.

Nevertheless, in this case the issue is that for some unknown reason you are intentionally blocking use of the setuptools<>sphinx integration, so it is not possible to render ANY type of documentation that way. Using that integration is really handy because it allows building module documentation with exactly the same command no matter where conf.py and the documentation source tree are. By maintaining a Makefile you are wasting your own time. By blocking the setuptools<>sphinx integration you are wasting other people's time figuring out how to generate documentation in this particular case. IMO it would be much better if you just spent that time on proper pep517 build support. Please ..

Nevertheless, sphinx is still failing .. if it is really so easy to fix, please do that. For me fixing it is not that easy :)

eli-schwartz commented 2 years ago

To us, man pages are for Unix-style CLI tools, not for Python projects.

To be fair, man pages also include:

And other slightly more exotic sections. So it is not just about CLI programs.

That being said, I will have to agree that just defining a target isn't hugely helpful unless it's been verified that the resulting manpage actually looks good and provides a good user experience.

I do NOT agree that "bad documentation is better than nothing". The alternative isn't nothing -- the alternative is using the HTML docs in a browser, or the python help() system. The python help() system especially is quite good at providing you the documentation you want.
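For reference, the text that `help()` pages can also be captured as a plain string, which is a quick way to see what the built-in system already offers without any man page. A minimal sketch using only the standard library; `doc_as_text` is a hypothetical helper name, and `len` stands in for any object (a NumPy function would work the same way if NumPy is installed).

```python
import pydoc


def doc_as_text(obj):
    """Return the documentation help(obj) would show, as a plain string."""
    # pydoc.plaintext renders without the overstrike bolding used for pagers.
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)


text = doc_as_text(len)
print(text)
```

Keyword search in the style of `apropos` is available from the same module on the command line via `python -m pydoc -k <keyword>`.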

As demonstrated by projects that actually do ship section 3 man pages for C libraries, one would customarily have one man page per function, which makes things a lot more discoverable.
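To make the "one man page per function" idea concrete: Sphinx's man builder reads a `man_pages` list from conf.py, with one `(startdocname, name, description, authors, section)` tuple per output page. The fragment below is only a sketch of how such a list could be generated; the function names, the `reference/generated/` document paths and the author string are illustrative assumptions, not NumPy's actual configuration.

```python
# Hypothetical conf.py fragment: one section-3 man page per documented function.
# The names and document paths below are assumptions for illustration only.
api_functions = ["numpy.sum", "numpy.mean", "numpy.linspace"]

man_pages = [
    (
        f"reference/generated/{name}",  # startdocname (assumed source document)
        name,                           # name of the resulting man page
        f"{name} reference",            # one-line description
        ["NumPy Developers"],           # authors (placeholder)
        3,                              # manual section: library calls
    )
    for name in api_functions
]

for entry in man_pages:
    print(entry[1], entry[4])
```

Whether the pages produced this way read well, for example where the HTML docs rely on mathjax, would still have to be checked by hand.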

The first HTML page I looked at in the numpy API documentation used mathjax in multiple locations, this should be investigated to see if it translates well to the manpage renderer. (I don't... generally... see images in man pages, you know... doubtless this is possible somehow.)

kloczek commented 2 years ago

And other slightly more exotic sections. So it is not just about CLI programs.

That is true:

[tkloczko@ss-desktop Packages]$ rpm -qpl gnome-* | grep /usr/share/man | wc -l
46
[tkloczko@ss-desktop Packages]$ rpm -qpl kf5-* | grep /usr/share/man | wc -l
149

None of those man pages have anything to do with CLI.

As demonstrated by projects that actually do ship section 3 man pages for C libraries, one would customarily have one man page per function, which makes things a lot more discoverable.


MAN(1)                                                                     Manual pager utils                                                                    MAN(1)

NAME
       man - an interface to the system reference manuals

[..]

   The table below shows the section numbers of the manual followed by the types of pages they contain.

   1   Executable programs or shell commands
   2   System calls (functions provided by the kernel)
   3   Library calls (functions within program libraries)
   4   Special files (usually found in /dev)
   5   File formats and conventions, e.g. /etc/passwd
   6   Games
   7   Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
   8   System administration commands (usually only for root)
   9   Kernel routines [Non standard]

As you see, section 3 has nothing to do with C.
Traditionally that section is for developer documentation. The pasted `multipledispatch` man page is 100% about that topic.
In Python, a module is the equivalent of a library.

Nevertheless, classifying a given module's documentation as a man page in a given section is 100% off topic.
That classification can be done once it is actually possible to have a man page .. which is still not the case for `numpy`.

In the meantime, did anyone take at least a quick look to check why sphinx shows that error and fails?
Sorry for that question .. I just have the impression that if all the time we spent in this conversation had been spent on fixing the issue, it would already be solved :/
eli-schwartz commented 2 years ago

I didn't say that section 3 is about C library functions. I said that it is about "library functions, which can be traditionally modeled by the many C libraries which install their library documentation there".

seberg commented 2 years ago

I don't have a strong opinion, my gut feeling is that this helps very few users (my debian does not ship man-files, editors make help available), so if we are expected to fix it, or make sure that man-pages are not malformed (if they currently are), then that is a price I am not sure we should have to pay. (I would like easier offline doc availability sometimes, but I was always thinking of finding the html docs easier ;))

rgommers commented 2 years ago

so if we are expected to fix it, or make sure that man-pages are not malformed (if they currently are), then that is a price I am not sure we should have to pay.

Agreed, what I had in mind is treating it similar to niche hardware/OSes. We don't (can't) test and don't prioritize maintenance work ourselves, but if someone who cares submits a reasonable patch, we just merge it.

InessaPawson commented 2 years ago

@kloczek Thank you so much for sharing your thoughts on how we can improve NumPy for Linux users. Please realize that your request is very unique. My search of the entire history of issues and PRs on numpy/numpy and the NumPy mailing list showed only two items that are somewhat related to man page support. This is the closest to yours: #11052

Given the size of our user base (10+ million), I hope you can agree that focusing on such requests would not be a wise use of time of our maintainers, most of whom are volunteers.

pradyunsg commented 2 years ago

Hi folks!

I reckon this can be closed -- Sphinx has deprecated the build_sphinx hook (https://github.com/sphinx-doc/sphinx/pull/10040) and setuptools has effectively deprecated the use of setup.py directly: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html. I don't think it's a good idea to add functionality to something that is deprecated by the tooling this is building upon.
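For anyone who still wants man output without the deprecated hook, Sphinx can be invoked directly with the `man` builder. A minimal sketch, assuming Sphinx is installed and that the sources live in `doc/source`; both directory names and the two helper function names are assumptions for illustration.

```python
import subprocess
import sys


def sphinx_man_command(source_dir, build_dir):
    """Return the argument list for a direct Sphinx run with the man builder."""
    return [sys.executable, "-m", "sphinx", "-b", "man", source_dir, build_dir]


def build_man_pages(source_dir, build_dir):
    """Run the build; raises CalledProcessError if Sphinx reports failure."""
    subprocess.run(sphinx_man_command(source_dir, build_dir), check=True)


# Only print the command here; call build_man_pages(...) to actually run it.
cmd = sphinx_man_command("doc/source", "build/man")
print(" ".join(cmd))
```

This bypasses setup.py entirely, so it is unaffected by the `good_commands` / `bad_commands` lists discussed at the top of the issue.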

mattip commented 2 years ago

Thanks @pradyunsg. Closing.