doxygen / doxygen

Official doxygen git repository
https://www.doxygen.org
GNU General Public License v2.0

Python code appears as namespace #9250

Open couet opened 2 years ago

couet commented 2 years ago

In my hierarchy of files I have the following python script:

## \file
## \ingroup tutorial_graphs
## \notebook -js
## Bent error bars. Inspired from work of Olivier Couet.
##
## \macro_image
## \macro_code
##
## \author Alberto Ferro

import ROOT

c1 = ROOT.TCanvas()
n = 10
x = ROOT.std.vector('double')()
for i in [-0.22, 0.05, 0.25, 0.35, 0.5, 0.61,0.7,0.85,0.89,0.95]: x.push_back(i)
y = ROOT.std.vector('double')()
for i in [1,2.9,5.6,7.4,9,9.6,8.7,6.3,4.5,1]: y.push_back(i)
exl = ROOT.std.vector('double')()
for i in [.05,.1,.07,.07,.04,.05,.06,.07,.08,.05]: exl.push_back(i)
eyl = ROOT.std.vector('double')()
for i in [.8,.7,.6,.5,.4,.4,.5,.6,.7,.8]: eyl.push_back(i)
exh = ROOT.std.vector('double')()
for i in [.02,.08,.05,.05,.03,.03,.04,.05,.06,.03]: exh.push_back(i)
eyh = ROOT.std.vector('double')()
for i in [.6,.5,.4,.3,.2,.2,.3,.4,.5,.6]: eyh.push_back(i)
exld = ROOT.std.vector('double')()
for i in [.0,.0,.0,.0,.0,.0,.0,.0,.0,.0]: exld.push_back(i)
eyld = ROOT.std.vector('double')()
for i in [.0,.0,.05,.0,.0,.0,.0,.0,.0,.0]: eyld.push_back(i)
exhd = ROOT.std.vector('double')()
for i in [.0,.0,.0,.0,.0,.0,.0,.0,.0,.0]: exhd.push_back(i)
eyhd = ROOT.std.vector('double')()
for i in [.0,.0,.0,.0,.0,.0,.0,.0,.05,.0]: eyhd.push_back(i)

gr = ROOT.TGraphBentErrors(
   n,x.data(),y.data(),exl.data(),exh.data(),eyl.data(),eyh.data(),exld.data(),exhd.data(),eyld.data(),eyhd.data())

gr.SetTitle("TGraphBentErrors Example")
gr.SetMarkerColor(4)
gr.SetMarkerStyle(21)
gr.Draw("ALP")

and it appears as namespace:

Screenshot 2022-04-04 at 17 29 16

I am wondering why?

doxygen version: 1.9.4 (91095fb62f0bc2c624b0e8e49d9316f51be76559) Doxyfile: Doxyfile.gz

albert-github commented 2 years ago

Which name would you like to see here? Packages? Have a go with the setting: OPTIMIZE_OUTPUT_JAVA

ferdymercury commented 2 years ago

I think the goal is to prevent Python scripts from showing up at all in this list. It clutters the list of C++ namespaces with several tutorial scripts in Python that are not relevant for the core C++ namespace documentation. https://root.cern/doc/master/namespaces.html

albert-github commented 2 years ago

This is a small extra constraint (the mix of C++ and Python; I expected it but left it out, to see from the reaction whether we had a separate project here or the same main one). Mixing C++ and Python code will always result in a wrong name here: for C++ one wants to have "Namespaces" but for Python (probably) "Packages", and doxygen can only show one name. Most likely a solution would be to create a separate doxygen build for the Python part and use its tag file for references to the original code (or vice versa).
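A minimal sketch of the two-build idea, assuming one shared base Doxyfile whose settings are overridden per build. GENERATE_TAGFILE, TAGFILES, FILE_PATTERNS, EXCLUDE_PATTERNS and OUTPUT_DIRECTORY are real doxygen options, and "doxygen -" reads the configuration from standard input; the helper names, the doc/ paths and the relative link location are hypothetical and would need adapting.

```python
import subprocess


def with_overrides(base_config: str, overrides: dict) -> str:
    """Append 'TAG = value' lines to a base configuration.

    Assumption: a later assignment of a tag replaces an earlier one,
    as in the 'cat Doxyfile; echo ...' recipe from the doxygen FAQ.
    """
    extra = "\n".join(f"{tag} = {value}" for tag, value in overrides.items())
    return base_config.rstrip("\n") + "\n" + extra + "\n"


# Hypothetical layout: Python pages in doc/python/html, C++ pages in doc/cpp/html.
PYTHON_OVERRIDES = {
    "FILE_PATTERNS": "*.py",
    "OUTPUT_DIRECTORY": "doc/python",
    "GENERATE_TAGFILE": "doc/python.tag",
}
CPP_OVERRIDES = {
    "EXCLUDE_PATTERNS": "*.py",
    "OUTPUT_DIRECTORY": "doc/cpp",
    # tag file = location of the Python pages relative to the C++ pages
    "TAGFILES": "doc/python.tag=../../python/html",
}


def run_doxygen(config_text: str) -> None:
    """Run doxygen with the configuration passed on standard input."""
    subprocess.run(["doxygen", "-"], input=config_text, text=True, check=True)


def build_both(base_config: str) -> None:
    """Build the Python part first so its tag file exists for the C++ build."""
    run_doxygen(with_overrides(base_config, PYTHON_OVERRIDES))
    run_doxygen(with_overrides(base_config, CPP_OVERRIDES))
```

With this split the C++ build never parses the Python files, so they cannot show up in its namespace list, while links into the Python pages would still resolve through the tag file.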

ferdymercury commented 2 years ago

Thanks for the reply! I see. Maybe for the future, it might be interesting to have a config flag to split Namespaces and Packages, or alternatively a flag to exclude Python files from the namespace list.

couet commented 2 years ago

C++ and Python are mixed a lot in ROOT, so it will be difficult to have two different passes. The thing I do not understand is why the ROOT Python tutorials are assumed to be namespaces when the keyword "namespace" does not appear in those scripts (unlike C++, where this keyword exists). Moreover, these tutorials all have the \file directive at the beginning, so they can just be considered "files", nothing more. I do not understand how "namespace" enters the game for those files.

albert-github commented 2 years ago

The thing I do not understand is why the ROOT python tutorials are assumed to be namespaces whereas the keyword "namespace" does not appear in those scripts,

This was a design decision by the people who implemented the original Python scanner in doxygen. The problem is that doxygen is mostly designed for C/C++-like languages, and Python has a somewhat different philosophy.

couet commented 1 year ago

python_produces_namespaces.tar.gz This tarfile reproduces the issue. Untar it and type "make" in the untarred folder. Then, in the same folder, open html/index.html. On the opened page, navigate from Files to twoscales.py. You will see the namespace. That is really annoying because, in addition to being nonsense information locally, these fake "Python namespaces" pollute the list of real namespaces.

pv33 commented 9 months ago

Workaround

First time posting. Not sure if this is proper.

As noted above, it is fundamentally a design issue. Those bound to doxygen due to coding across several languages will have to make do.

I too did not like this outcome and could not identify a way to fix it except as below, which might not be appealing. Create two doxygen configuration scripts and build out two parallel web pages.

My setup:

The doxygen configuration files for packagename and for testing are each customized to provide what works. The core API doxygen pages are not cluttered with script namespaces, files, bogus classes, etc. The testing doxygen files are, but that option in the doxygen configuration is disabled. The packagename/testing doxygen configuration is highly tailored to not output much. All of the tests are grouped so that their description is recovered through the group hierarchy pages. The file structure of the testing scripts also mirrors the path structure of the package code so that access through Files follows a logical hierarchy roughly mirrored by the group page but in a more compact form.

The above is the cleanest possible setup discovered so far. The main code doxygen pages link to the testing/example doxygen site for rapid onboarding and understanding of how the code works and of its operational design.

Another trick I got from working out how to use doxygen for Matlab was to add a custom @quitf command and have doxygen pre-processors ignore everything after it. At the end of the documentation block in scripts, a @quitf will prevent doxygen from seeing the actual code. That prevents the code from showing up in the doxygen as weird things. Leaves clean documentation for these files. No doubt there is a doxygen configuration that might do the same, but it was too much trouble to discover.

couet commented 7 months ago

Thanks for your input. The first trick will not work as the C++ and Python codes are separated. The second trick might be more feasible, but I am not sure what the "custom @quitf command" would be... I tried to use @cond and @endcond to disable the doxygen interpretation of the code, but that does not work.

The @cond documentation says: "If the section label is omitted, the section will be excluded from processing unconditionally."

ferdymercury commented 7 months ago

Hi couet, I think he means modifying our doxygen C++ filter: if a line starts with quitf (or if a file ends with .py?), then stop processing. The .py files would then be handled in a separate project where quitf is ignored, or something like that.

pv33 commented 7 months ago

Yes, there needs to be a filter to suppress the extraneous processing. In this case it is the nuclear option of simply preventing doxygen from ever processing the code in the file. The doxygen parser sees the header only.

My code and documentation are a work in progress as I piece out exactly how to manipulate doxygen to get the right outcomes. Here is a link to my current doxygen build configuration and preprocessors so you can see the implementation. One build configuration is for the main code and one is for test/development scripts. There is a simple perl script that does some pre-processing (parsequit.pl, invoked by ivapyfilter, which is set as a pre-processor in the doxygen build file). Focus on the quitf part, not the classf part, in the perl script (the latter is left over from the Matlab perl pre-processor, where it assists with package namespaces by prepending them to the class name).

It then builds out the following code site and testing site. Again, they are very bare bones due to lack of time, but the former has proper documentation for the classes while the latter has cleaner documentation for the test scripts than would happen naturally. What is missing is to add cross-linking between the two sites using doxygen pages for smoother navigation. It takes some figuring out to know how to document the scripts so they are properly parsed.

The code populating doxygen comes from the repos at that git organization.

If it helps to ping me for private discussion, go for it.

pv33 commented 7 months ago

The doxygen pre-processor output is what doxygen processes. The perl script is set to repeat every line until a quitf, at which point it outputs EOF. doxygen never sees the code that follows in the original file.

Maybe there is confusion about what is being referenced (I am not a doxygen expert, but just bumble along). It looks like doxygen documentation refers to something else as "preprocessor." To be explicit, I use the INPUT_FILTER option (description here).

The INPUT_FILTER documentation does say to not add or remove lines otherwise the anchors get messed up. But, if the impact is to remove all lines after a certain point, then there never will be anchors made for that code anyhow, and the code up to the quitting points will have correct anchors since no new text was added there.

edit: Looks like FILTER_PATTERNS would work for applying only to specific file extensions.
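The filter described above can be sketched in Python as follows. This is a hedged illustration, not the perl script from the linked repository: it relies only on the documented INPUT_FILTER contract (doxygen calls the filter with the input file name as argument and reads the filter's standard output), and it assumes the marker is the literal text @quitf and that the marker line itself is dropped along with everything after it.

```python
import sys
from typing import Iterable, Iterator

MARKER = "@quitf"  # custom marker from the thread; not a built-in doxygen command


def truncate_at_marker(lines: Iterable[str], marker: str = MARKER) -> Iterator[str]:
    """Yield lines unchanged until the first line containing the marker.

    Lines before the marker are neither added nor removed, so their
    line numbers in the filtered output match the original file.
    """
    for line in lines:
        if marker in line:
            break  # everything from here on is hidden from doxygen
        yield line


# Invoked by doxygen as: <filter> <input-file>
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as source:
        sys.stdout.writelines(truncate_at_marker(source))
```

Files without the marker pass through untouched, so the same filter could be registered for every Python file through FILTER_PATTERNS without affecting scripts that do not opt in.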

ferdymercury commented 7 months ago

Good point! Something like:

FILTER_PATTERNS = *.cpp=my_cpp_filter *.py=my_python_filter

couet commented 6 months ago

So the idea would be to have several filters instead of one. To be tried.

couet commented 6 months ago

The fact that the Python examples appear in the namespaces is not connected to the filter. I completely removed the filter from the Doxyfile (FILTER_PATTERNS is empty) and the Python examples still appear in the namespaces. So using FILTER_PATTERNS will not help, as even without any filter the Python code goes into the namespaces.

pv33 commented 6 months ago

The original issue covered many problems with documentation generation. Starting with the first one, which is that scripts were being dumped into weird locations within the HTML hierarchy. The filter recommended should remove those files. Without knowing exactly what was done, not much help can be provided.

Moving to actual code files that are part of a python package API, getting them to appear properly requires finessing the doxygen configuration and possibly using the proper subset of tags within the python code comments. Those will require playing around with the settings. Since the intent is something specialized to your desires and source, it might be difficult to find someone who has resolved the exact same issue.

In the end, there will not be one single change that will correct all of the problems noted at the start of the issue. They'll have to be taken care of one by one until getting to the best point possible. Most likely there will be at least one configuration setting that works great but might lead to some negative consequence with the C++ code. Then, it will be a choice between the least bad.

Even I struggle to know exactly what you want. Best to sit down and specify completely, then resolve one by one.

couet commented 6 months ago

Thanks for your reply. I will try to follow your advice. Indeed there is one single issue: some Python code appears in C++ namespaces.

ferdymercury commented 6 months ago

Even I struggle to know exactly what you want. Best to sit down and specify completely, then resolve one by one.

I have prepared a minimal reproducer: mini_name.zip

You will see:

image

We would like to see:

image

@albert-github Do you think it would be doable to have a Doxyfile option where "namespaces" are grouped/sorted by language in the left sidebar TOC tree?

albert-github commented 6 months ago

When we look at the Files part and distribute the files over some sub-directories, we can already see:

image

The namespaces are of course a bit different, though this could already be done (probably a bit undesirable, as it will confuse the user) by replacing ROOT with Cpp::Root, giving:

image

for the python part I didn't see such a quick / undesirable possibility.

So probably one would have to do the splitting based on the "file type", and split the namespaces into different "folders" in the generated tree.

I think this would be possible, though the big question (@doxygen) is of course whether it is desirable (with an extra setting it would be a bit more easily acceptable, although that is again an extra setting ...).

couet commented 6 months ago

The fix for this issue is here: https://github.com/root-project/root/pull/14883 It post-processes the HTML files to remove the Python tutorials from Namespaces.
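For readers who want to try the same kind of post-processing, here is a minimal sketch. It is not the code from the pull request: it assumes that each entry of the generated namespace list sits on a single line and links to a page named namespace<name>.html, and that the names are passed exactly as they appear in those link targets. The function name and the sample rows are hypothetical.

```python
import re
from typing import Iterable, List


def drop_namespace_rows(html_lines: Iterable[str], names: Iterable[str]) -> List[str]:
    """Return the lines that do not link to any of the given namespace pages.

    Assumption: one list entry per line, each linking to
    'namespace<name>.html'; all other lines are kept unchanged.
    """
    patterns = [
        re.compile(r'href="namespace' + re.escape(name) + r'\.html"')
        for name in names
    ]
    return [
        line for line in html_lines
        if not any(pattern.search(line) for pattern in patterns)
    ]


# Hypothetical two-entry list: one C++ namespace, one Python tutorial script.
page = [
    '<tr><td class="entry"><a class="el" href="namespaceROOT.html">ROOT</a></td></tr>\n',
    '<tr><td class="entry"><a class="el" href="namespacetwoscales.html">twoscales</a></td></tr>\n',
]
cleaned = drop_namespace_rows(page, ["twoscales"])
```

Removing rows this way leaves the remaining row ids and any alternating row styling as they were, so a real script might also have to renumber them, and the sidebar navigation data would need the same treatment.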