landam / grass-gis-git-migration-test

make documentation be full text searchable: use sphinx #4

Open landam opened 5 years ago

landam commented 5 years ago

Reported by timmie on 29 Apr 2008 20:44 UTC The current HTML documentation consists of different HTML-formatted man pages linked together, which offers good help for the experienced user. But it would be an advantage to have a full text search on the documentation:

Use case: A user wants to remove a mapset or georeference a file but doesn't know which commands to use.

Good example for a full text searchable documentation: http://docs.python.org/dev/

Migrated-From: https://trac.osgeo.org/grass/ticket/151

landam commented 5 years ago

Comment by epatton on 30 Apr 2008 14:14 UTC I notice on http://grass.itc.it/gdp/general.php there is a link to 'Manual Pages search engine', but when clicked, the linked page displays only Google search engines for the user and developer mailing lists, the osgeo.org site, but nothing exclusive for Grass man pages.

Did this functionality once exist on this page but has been modified? How hard would it be to add in a man page search engine on http://grass.itc.it/searchgrass.php ?

~ Eric.

landam commented 5 years ago

Comment by neteler on 30 Apr 2008 14:29 UTC This could be easily realized if "htdig" was installed on grass.osgeo.org. Personally, I don't have the resources currently to set it up (no time). Once it is there, we can fix this issue in a few minutes (it used to work on grass.itc.it).

Markus

landam commented 5 years ago

Comment by timmie on 15 Jun 2009 22:27 UTC Please check Sphinx: http://sphinx.pocoo.org/ It has a standalone JavaScript based search engine.

Very good!

landam commented 5 years ago

Modified by hamish on 9 Aug 2009 07:14 UTC

landam commented 5 years ago

Modified by neteler on 9 Aug 2009 07:44 UTC

landam commented 5 years ago

Comment by neteler on 9 Aug 2009 08:12 UTC I have locally converted '''most''' pages (using html2rest.py by Gerard Flanagan at http://bazaar.launchpad.net/~grflanagan/python-rattlebag/trunk/annotate/head:/src/html2rest.py ), a set of the GRASS HTML files fails with problems like

reST markup error:
/home/neteler/grass65/dist.x86_64-unknown-linux-gnu/docs/html/rst/source/r.coin.rst:66: (SEVERE/4) Title level inconsistent:

:
:
make: *** [html] Error 1

or

reST markup error:
/home/neteler/grass65/dist.x86_64-unknown-linux-gnu/docs/html/rst/source/r.cost.rst:183: (SEVERE/4) Title level inconsistent:

Algorithm notes

make: *** [html] Error 1


This indicates to some extent HTML errors in the original as well as Sphinx problems with the tags

<dt> ...
<dd> ...

So with some effort the HTML pages could be made Sphinx compliant (perfect power user job).

Here the list of failing HTML files in 6.5.svn: d.graph.rst, d.his.rst, d.linegraph.rst, d.mapgraph.rst, d.menu.rst, d.out.file.rst, d.text.freetype.rst, g.gisenv.rst, g.message.rst, grass6.rst, g.region.rst, i.ortho.photo.rst, m.proj.rst, ps.map.rst, r.category.rst, r.coin.rst, r.cost.rst, r.distance.rst, r.in.gdal.rst, r.in.xyz.rst, r.mfilter.fp.rst, r.mfilter.rst, r.out.gdal.rst, r.proj.rst, r.ros.rst, r.spreadpath.rst, r.spread.rst, r.terraflow.rst, r.tileset.rst, r.watershed.rst, r.what.rst, v.label.rst, v.lidar.correction.rst, v.lidar.edgedetection.rst, v.lidar.growing.rst, v.outlier.rst, v.reclass.rst, v.segment.rst, v.surf.bspline.rst.

I've put everything online to give you an impression (yes, partially messy but not so bad...): http://grass.osgeo.org/grass65/manuals/sphinx/

Markus

landam commented 5 years ago

Comment by neteler on 9 Aug 2009 08:24 UTC Here the procedure:

cd dist.x86_64-unknown-linux-gnu/docs/html/

# convert HTML to reST:
mkdir rst
cd rst
for i in ../*.html ; do echo "$i:"; html2rest.py < "$i" > "$(basename "$i" .html).rst" ; done

sphinx-quickstart

# to avoid name conflict or define better in sphinx-quickstart:
mv index.rst oldindex.rst
mv *.rst source/

# convert with sphinx
make html

The resulting Sphinx-HTML manual is stored in the build/ directory.

Markus

PS: once the Wiki is back this should go there
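
The `mv index.rst oldindex.rst` step is needed because sphinx-quickstart writes its own index.rst. A minimal sketch of generating a master document that lists the converted pages, assuming one .rst file per manual page in source/. The function names `build_master_index` and `write_master_index` and the title are hypothetical; `.. toctree::` and `:maxdepth:` are standard Sphinx.

```python
from pathlib import Path


def build_master_index(rst_names, title="GRASS GIS manual"):
    """Return the text of a Sphinx master document listing the given pages.

    rst_names: iterable of file names such as "r.cost.rst".
    """
    # Sphinx toctree entries are document names without the .rst suffix.
    docs = sorted(
        name[:-4] if name.endswith(".rst") else name
        for name in rst_names
        if name != "index.rst"  # the master document must not list itself
    )
    lines = [title, "=" * len(title), "", ".. toctree::", "   :maxdepth: 1", ""]
    lines += ["   " + doc for doc in docs]
    return "\n".join(lines) + "\n"


def write_master_index(source_dir):
    """Write source_dir/index.rst from the .rst files found there."""
    source = Path(source_dir)
    names = [path.name for path in source.glob("*.rst")]
    (source / "index.rst").write_text(build_master_index(names), encoding="utf-8")
```

Calling `write_master_index("source")` before `make html` would make every converted page reachable from the start page.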

landam commented 5 years ago

Comment by hamish on 9 Aug 2009 09:20 UTC Replying to [comment:6 neteler]: ...

This indicates to some extent HTML errors in the original as well as Sphinx problems with the tags

<dt> ...
<dd> ...

So with some effort the HTML pages could be made Sphinx compliant (perfect power user job).

Here the list of failing HTML files in 6.5.svn: d.graph.rst, d.his.rst, d.linegraph.rst, d.mapgraph.rst, d.menu.rst, d.out.file.rst, d.text.freetype.rst, g.gisenv.rst, g.message.rst, grass6.rst, g.region.rst, i.ortho.photo.rst, m.proj.rst, ps.map.rst, r.category.rst, r.coin.rst, r.cost.rst, r.distance.rst, r.in.gdal.rst, r.in.xyz.rst, r.mfilter.fp.rst, r.mfilter.rst, r.out.gdal.rst, r.proj.rst, r.ros.rst, r.spreadpath.rst, r.spread.rst, r.terraflow.rst, r.tileset.rst, r.watershed.rst, r.what.rst, v.label.rst, v.lidar.correction.rst, v.lidar.edgedetection.rst, v.lidar.growing.rst, v.outlier.rst, v.reclass.rst, v.segment.rst, v.surf.bspline.rst.

all of the above should (now) be html bug-free, as checked by dillo's lint verifier.

if that is so and all is valid HTML, remaining problems are for the sphinx people to fix IMO.

I've put everything online give you an impression (yes, partially messy but not so bad...): http://grass.osgeo.org/grass65/manuals/sphinx/

specifically, bolds and newlines need work.

Hamish

ps- reST is good stuff.

landam commented 5 years ago

Comment by neteler on 9 Aug 2009 09:49 UTC If HTML is bugfree then it depends on http://bazaar.launchpad.net/~grflanagan/python-rattlebag/trunk/annotate/head:/src/html2rest.py which perhaps needs some tweaks to write clean reST.
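
One possible cause of the "Title level inconsistent" errors reported earlier is a heading that skips a level in the HTML (for example an h2 followed directly by an h4), because docutils expects section levels to nest consistently. This is an assumption, not something confirmed in this ticket. A minimal sketch that lists such jumps so the HTML or the converter can be checked; the names `HeadingCollector` and `find_level_jumps` are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, line_number) for every <h1>..<h6> start tag."""

    def __init__(self):
        super().__init__()
        self.headings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H2> arrives here as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            line, _column = self.getpos()
            self.headings.append((int(tag[1]), line))


def find_level_jumps(html_text):
    """Return (previous_level, level, line) for headings that skip a level."""
    parser = HeadingCollector()
    parser.feed(html_text)
    parser.close()
    jumps = []
    previous = None
    for level, line in parser.headings:
        # Going deeper by more than one level (h2 -> h4) is the suspect case.
        if previous is not None and level > previous + 1:
            jumps.append((previous, level, line))
        previous = level
    return jumps
```

Running this over the failing pages listed above would show quickly whether skipped heading levels explain the errors or whether html2rest.py itself needs the tweaks.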

landam commented 5 years ago

Comment by hamish on 10 Aug 2009 04:01 UTC FWIW, reStructuredText (reST) docs: http://docutils.sourceforge.net/rst.html

landam commented 5 years ago

Comment by neteler on 2 Jun 2010 20:13 UTC Came across another HTML to reST (Sphinx) converter:

http://johnmacfarlane.net/pandoc/

Online try (throw in GRASS HTML file): http://johnmacfarlane.net/pandoc/try

landam commented 5 years ago

Comment by @landam on 20 Jan 2011 11:18 UTC Please follow wiki page http://grass.osgeo.org/wiki/Man_Pages_Improvement_Sprint

landam commented 5 years ago

Comment by hamish on 3 May 2011 20:34 UTC Hi,

after extensive use of reST + sphinx for the osgeo LiveDVD* (live.osgeo.org) documentation and website over the last year+, I am now of the opinion that GRASS's current html-source man pages are far superior to what would be accomplished by reST-source man pages; both in terms of expressibility and aggravation. The critical thing is to get it into a stable mark-up language; once there, there's little reason (besides the usual bugs) why html2rest or some any2pdf style program couldn't translate between them and make a search index. Maybe wikimedia is a bit easier markup language than html, but if you are reading this you are highly likely to be smart enough to learn that <b> means bold and we don't actually do much complicated with it. I think we forget how simple stock HTML really is, and that when it comes to documentation, the steak is much more important than the sizzle.

[*] https://trac.osgeo.org/osgeo/browser/livedvd/gisvm/trunk/doc/

best, Hamish

landam commented 5 years ago

Comment by hamish on 3 May 2011 20:40 UTC i.e. to say, I'd rather invest the time in helping to debug htDig.

landam commented 5 years ago

Comment by neteler on 21 May 2011 15:35 UTC New URL: https://bitbucket.org/djerdo/musette/src/tip/musette/html/html2rest.py

landam commented 5 years ago

Comment by lucadelu on 19 Jun 2012 16:21 UTC Some improvements: I obtained a working version of the documentation with sphinx. I really like it but there are some things to fix. Here is the procedure (use a recent version of pandoc; older ones are buggy for me):

cd dist.x86_64-unknown-linux-gnu/docs/
mkdir rst
cd rst
# convert html to rst
for i in ../html/*.html; do pandoc -s -c ../html/grassdocs.css -r html "$i" -w rst -o "$(basename "$i" .html).rst"; done
# move other files
cp ../html/*.png ../html/*.jpg .
cp ../html/grassdocs.css .
cp ../html/grass_logo.txt .
cp -rf ../html/icons/ .
# start sphinx
sphinx-quickstart
# move all to source directory
mv *.rst *.png *.jpg icons/ grass* source/
# create html documentation
make html

In the next weeks I'll try to study a little bit of sphinx to fix some problems

landam commented 5 years ago

Comment by lucadelu on 13 Aug 2012 13:32 UTC In https://trac.osgeo.org/grass/changeset/52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the http://johnmacfarlane.net/pandoc/ pandoc software. You can simply run

 make restdocs
 cd dist.XXXX/doc/rest
 make html

to create the documentation in rest format and to convert to beautiful HTML using sphinx. There are some issues still open, in order of importance level (if someone with good skill in makefile system wants to help me it would be really appreciated):

  • launching only "make", the reStructuredText documentation should not be created but some documents are created;
  • I cannot convert helptext.html and wxgui documentation due to some Make problems;
  • There are some documents with bad indentation because "pandoc" wrongs to convert
    tag, the solution should be: remove white space if second character is not another white space, but some problem could remain ;
  • Some other problems remain (special chars, formatting) in the new rest pages.

Once solved, the resulting HTML pages could replace the current manual pages (since also search is provided).
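
The indentation fix proposed in the list above ("remove white space if second character is not another white space") can be sketched as follows. This is one reading of that sentence, applied line by line to the generated reST text; the function name `strip_single_leading_space` is hypothetical, and as noted above some problems could remain.

```python
def strip_single_leading_space(text):
    """Drop one leading space from lines whose second character is not whitespace.

    Lines indented by two or more spaces are left alone, since that
    indentation is usually meaningful in reST.
    """
    fixed = []
    for line in text.splitlines(keepends=True):
        if len(line) > 1 and line[0] == " " and not line[1].isspace():
            line = line[1:]
        fixed.append(line)
    return "".join(fixed)
```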

landam commented 5 years ago

Comment by hamish on 13 Aug 2012 21:34 UTC Replying to [comment:17 lucadelu]:

Once solved, the resulting HTML pages could replace the current manual pages (since also search is provided).

erhm, once solved and building in parallel ''discussion'' on if that should happen could begin. Personally I am not in favour of throwing away all the strongly marked up work we have done in the description.html files in favour of the rather erratic and obscure markup of reSt for those pages. Perhaps 'finicky' is a better word. I'm happy to see the help pages get pretty, and yes reSt+sphinx-alikes is very pretty, but would like it to be in parallel, and reSt translated from our existing HTML docs automatically (ie the description.html parts), in the same way (or better) than the man pages are now.

I don't think lack of a working htDig install*, or reliance on "site:" google search, is a fatal blow for html.

[*] (is that still the case? if so as offered earlier, I'm happy to spend a little time on it)

thanks, Hamish

landam commented 5 years ago

Comment by hellik on 13 Aug 2012 21:41 UTC Replying to [comment:17 lucadelu]:

In https://trac.osgeo.org/grass/changeset/52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the http://johnmacfarlane.net/pandoc/ pandoc software.

does this mean that pandoc would be another external dependency to get the docs?

on windows there would be needed an extra step installing pandoc (http://johnmacfarlane.net/pandoc/installing.html).

Helmut

landam commented 5 years ago

Comment by hamish on 13 Aug 2012 21:43 UTC [slight follow up]

Hamish wrote:

, but would like it to be in parallel, and reSt translated from our existing HTML docs automatically

I am glad to see that is indeed the case, but
->
and
->

in all the html files?! if the converter is broken, fix the converter! "
" is not a tough one to parse..

thanks, Hamish

landam commented 5 years ago

Comment by lucadelu on 13 Aug 2012 22:06 UTC Replying to [comment:18 hamish]:

Replying to [comment:17 lucadelu]:

Once solved, the resulting HTML pages could replace the current manual pages (since also search is provided).

erhm, once solved and building in parallel ''discussion'' on if that should happen could begin. Personally I am not in favour of throwing away all the strongly marked up work we have done in the description.html files in favour of the rather erratic and obscure markup of reSt for those pages. Perhaps 'finicky' is a better word. I'm happy to see the help pages get pretty, and yes reSt+sphinx-alikes is very pretty, but would like it to be in parallel, and reSt translated from our existing HTML docs automatically (ie the description.html parts), in the same way (or better) than the man pages are now.

yes no problem for me to keep both versions

thanks, Hamish

best Luca

landam commented 5 years ago

Comment by lucadelu on 13 Aug 2012 22:14 UTC Replying to [comment:19 hellik]:

Replying to [comment:17 lucadelu]:

In https://trac.osgeo.org/grass/changeset/52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the http://johnmacfarlane.net/pandoc/ pandoc software.

does this mean that pandoc would be another external dependency to get the docs?

So far I have only tested on Linux. If pandoc is missing, an error is returned but it is not reported at the end of the make process. In the future I hope to fix the compilation issue so that the reStructuredText files are built only with make restdocs, and not with plain make as happens now. If someone can help with the Make configuration, it would be really appreciated.

Could you test compilation on windows please?

Helmut

best Luca
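
A minimal sketch of how a conversion driver could collect failures, including a missing converter executable, and report them once at the end instead of losing them in the build output. This is an illustration only, not how the GRASS Makefiles work; the function names `convert_all` and `report` are hypothetical.

```python
import subprocess
import sys


def convert_all(jobs):
    """Run each (name, argv) job and return the names of the jobs that failed."""
    failed = []
    for name, argv in jobs:
        try:
            result = subprocess.run(argv, capture_output=True)
        except FileNotFoundError:
            # The converter executable is not installed or not on PATH.
            failed.append(name)
            continue
        if result.returncode != 0:
            failed.append(name)
    return failed


def report(failed, stream=sys.stderr):
    """Print a summary of failed jobs and return a shell-style exit code."""
    for name in failed:
        print("conversion failed: " + name, file=stream)
    return 1 if failed else 0
```

A job for one page could look like `("r.cost", ["pandoc", "-r", "html", "r.cost.html", "-w", "rst", "-o", "r.cost.rst"])`, using the same pandoc options as the procedure posted earlier in this ticket.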

landam commented 5 years ago

Comment by neteler on 13 Aug 2012 22:38 UTC Replying to [comment:18 hamish]: ...

I don't think lack of a working htDig install*, or reliance on "site:" google search, is a fatal blow for html.

[*] (is that still the case? if so as offered earlier, I'm happy to spend a little time on it)

htDig is missing for many years now (unfortunately) and google search is poor (unfortunately). A better solution is definitely needed and sphinx seems to provide it as it does for many OSGeo projects.

landam commented 5 years ago

Comment by neteler on 13 Aug 2012 22:39 UTC Replying to [comment:20 hamish]:

I am glad to see that is indeed the case, but
->
and
->

in all the html files?! if the converter is broken, fix the converter! "
" is not a tough one to parse..

While I agree, we have many places where
is probably abused, i.e. in

  • lists and other odd places. pandoc is basically complaining about these suboptimal use cases.

    landam commented 5 years ago

    Comment by glynn on 14 Aug 2012 12:09 UTC Replying to [comment:17 lucadelu]:

    In https://trac.osgeo.org/grass/changeset/52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the http://johnmacfarlane.net/pandoc/ pandoc software.

    I don't see what problem this is trying to solve.

    landam commented 5 years ago

    Comment by neteler on 14 Aug 2012 13:31 UTC Replying to [comment:25 glynn]:

    Replying to [comment:17 lucadelu]:

    In https://trac.osgeo.org/grass/changeset/52658 I added a first implementation of reStructuredText documentation for grass7. It uses the --rest-description flag and the http://johnmacfarlane.net/pandoc/ pandoc software.

    I don't see what problem this is trying to solve.

    Besides a more modern look, it offers an included search engine for the manual which even works offline in local GRASS GIS installations.

    landam commented 5 years ago

    Comment by glynn on 15 Aug 2012 20:09 UTC Replying to [comment:26 neteler]:

    I don't see what problem this is trying to solve.

    Besides a more modern look, it offers an included search engine for the manual which even works offline in local GRASS GIS installations.

    First, bear in mind that an important function of the HTML files is as the source for Unix (nroff) manual pages. Anything which interferes with that isn't acceptable.

    Beyond that, I don't really see the point of adding another dependency. Or ReST, for that matter. The output from --html-description is only a fragment of the final HTML; the rest is in HTML, and that isn't going to change (HTML is far better known and supported than ReST).

    If you think that there are specific problems with the current HTML, the appropriate solution would be to change the parser-generated HTML and/or the guidelines for the manually-generated HTML.

    landam commented 5 years ago

    Comment by hamish on 16 Aug 2012 02:22 UTC Replying to [comment:22]: @Luca: sorry I'm not much of a Makefile expert, but does the command exiting with an error not break out of the 'make' job right away? A "make restdocs" would be nice.

    Replying to [comment:23 neteler]:

    Replying to [comment:18 hamish]: ...

    I don't think lack of a working htDig install*, or reliance on "site:" google search, is a fatal blow for html.

    [*] (is that still the case? if so as offered earlier, I'm happy to spend a little time on it)

    htDig is missing for many years now (unfortunately) and google search is poor (unfortunately). A better solution is definitely needed

    It seems like a problem solved over and over again in the mid 90s (which really shows in htDig's cosmetics). There must be a better local site search package available.... we could pour hours of time into getting htDig working but at the end of the day it's still htDig, which I'm not sure of others' impressions of but I never really found too visually pleasing. Ideally there would be some tool which we could configure to also search the grass5 docs etc, but move those results all the way down to the end of page 17, with the grass 6.4 hits returning first.
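
The ordering wished for here (current-version hits first, grass5 hits last) is a stable sort on a version priority. A minimal sketch, assuming each hit is a (version, page) pair; the function name `rank_hits` and the version labels are hypothetical.

```python
def rank_hits(hits, version_order=("6.4", "6.5", "5")):
    """Sort (version, page) hits so versions listed earlier come first."""
    priority = {version: rank for rank, version in enumerate(version_order)}
    # Unknown versions sort after all listed ones. sorted() is stable, so the
    # original relevance order is kept within each version.
    return sorted(hits, key=lambda hit: priority.get(hit[0], len(priority)))
```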

    and sphinx seems to provide it as it does for many OSGeo projects.

    I'd enjoy seeing sphinx in parallel with the html docs and available, they look great, but they do take up more space and subtle things like two spaces in front of ".." instead of three, or not enough whitespace around "*" bullet points can cause your next paragraph to silently not display, with no error logged in the build messages (something I was fighting with two days ago, after a work-day of fighting with whitespace in fortran77 code). I just think we should be careful with the word "replace" the html docs at this point. As mentioned earlier, my other concern is to throw away all the strong markup and hand crafting (including
    s) that has gone into the html description.htmls, as ReSt's markup is by design much looser and sensitive. To keep (valid!) html as the source for those and converting to ReSt automatically with pandoc (IIUC how this is intended to work) would be great. The more the merrier. And pandoc -> LaTeX -> a better PDF booklet while we're at it.

    Replying to [comment:24 neteler]:

    While I agree, we have many places where
    is probably abused, i.e. in

  • lists and other odd places. pandoc is basically complaining about these suboptimal use cases.

  • do the pages pass proper html validation checks? I typically set my GRASS_HTML_BROWSER to dillo with the htmlbug validation tool turned on to test as I work.

    Or is pandoc not fully supporting valid html &/or brittle in how it does? if so, perhaps a 'sed -e 's+
    +
    +g' pre-processing step (etc) piped in as passing the files to pandoc would work around that deficiency in pandoc, until such time as pandoc is fixed.

    Replying to [comment:27 glynn]:

    Beyond that, I don't really see the point of adding another dependency.

    we have an optional make pdfdocs, why not an optional make restdocs too and host them somewhere? If it all works well & is self contained we can look at bundling the same with the binary installers, e.g. as with the new grass-dev-doc package for debian which ships the programmers' manual.

    Again, we should be careful with our use of the word "replace"; perhaps "augment" the current offerings is a better term for now? Local off-line search of the help pages is a nice goal, e.g. for use from a laptop in the field. (perhaps there is some python-html grepping library we could use?)

    shrug, Hamish

    landam commented 5 years ago

    Comment by neteler on 16 Aug 2012 08:21 UTC Replying to [comment:28 hamish]:

    Replying to [comment:23 neteler]: ... It seems like a problem solved over and over again in the mid 90s (which really shows in htDig's cosmetics). There must be a better local site search package available.

    Maybe, but I spent too much lifetime on this already.

    ...

    and sphinx seems to provide it as it does for many OSGeo projects.

    I'd enjoy seeing sphinx in parallel with the html docs and available, they look great,

    There seems to be a misunderstanding. The proposal is to keep the current HTML docs since the new sphinx mechanism uses them as input.

    The point is to offer the resulting HTML pages on the server as well as to the user locally; there is no problem having two offerings here ("classical" HTML pages and the "new" ones).

    As mentioned earlier, my other concern is to throw away all the strong markup and hand crafting (including
    s) that has gone into the html description.htmls,

    Sure, nobody said this. I just pointed out that there are HTML errors in the current HTML pages which need to be fixed anyway (and which will make pandoc more happy). I am surprised that some of these pass the W3 validator.

    ...

    And pandoc -> LaTeX -> a better PDF booklet while we're at it.

    Yes if they don't become a 1000 pages manual which then will be printed and kill the rain forest.

    Replying to [comment:24 neteler]:

    While I agree, we have many places where
    is probably abused, i.e. in

  • lists and other odd places. pandoc is basically complaining about these suboptimal use cases.

  • do the pages pass proper html validation checks?

    Strangely yes. See for example https://trac.osgeo.org/grass/changeset/52667 for changes which improve the current HTML and which help pandoc as well.

    Or is pandoc not fully supporting valid html &/or brittle in how it does? if so, perhaps a 'sed -e 's+
    +
    +g' pre-processing step (etc) piped in as passing the files to pandoc would work around that deficiency in pandoc, until such time as pandoc is fixed.

    ... your suggestion needs to be tested.

    Local off-line search of the help pages is a nice goal, e.g. for use from a laptop in the field.

    Also: not all people in the world are always online... a searchable user manual is a must have nowadays. Especially when offering 400 modules.

    Markus

    landam commented 5 years ago

    Comment by glynn on 17 Aug 2012 22:55 UTC Replying to [comment:29 neteler]:

    There seems to be a misunderstanding. The proposal is to keep the current HTML docs since the new sphinx mechanism uses them as input.

    In which case, what is the point of the --rest-description option? Also, Rest.make, restdir target, etc? IOW, why does the ReST generation require anything other than the generated HTML files in dist./docs/html/*.html?

    landam commented 5 years ago

    Comment by wenzeslaus on 19 Aug 2012 08:39 UTC Replying to [comment:30 glynn]:

    In which case, what is the point of the --rest-description option? Also, Rest.make, restdir target, etc? IOW, why does the ReST generation require anything other than the generated HTML files in dist./docs/html/*.html?

    The generated HTML files (in dist./docs/html/*.html) are not enough because for example parameter list is represented as HTML description but ReST has its own representation of the parameter list (http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#option-lists ReST option list). Once module description is converted to HTML the information whether this description list is module parameter list or some general list in hand-written module description (module.html file) is lost.

    The conversion of complete generated HTML files (in dist./docs/html/*.html) is possible but there are only two options. The first one is the usage of a generic converter (as is now used for module.html files) but any clever standard formatting in ReST cannot be used. The second one is to create a custom (context aware) transformation which uses both HTML markup and contents (e.g. contents of

    tag) but this can be a lot of work (I've tried it using XSLT but I gave it up). Another option would be to use XSLT and generated XML description but direct generation of ReST description seems like a less complex solution for me.
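
To illustrate what the reST option list mentioned above looks like, here is a minimal sketch that renders single-letter module flags in that form. The function name `rest_option_list` and the sample flags are hypothetical. As far as I can tell, reST only recognises dash-style (or DOS-style) options here, so GRASS `name=value` parameters would need a different representation.

```python
def rest_option_list(flags):
    """Render (flag, description) pairs as a reST option list.

    Each flag is a single character and is written as a short option
    such as "-c".
    """
    lines = []
    for flag, description in flags:
        # reST needs at least two spaces between the option and its text;
        # the description is collapsed onto one line to keep the item simple.
        lines.append("-{}  {}".format(flag, " ".join(description.split())))
    return "\n".join(lines) + "\n"
```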

    landam commented 5 years ago

    Comment by glynn on 19 Aug 2012 20:41 UTC Replying to [comment:31 wenzeslaus]:

    Once module description is converted to HTML the information whether this description list is module parameter list or some general list in hand-written module description (module.html file) is lost.

    The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

    The second one is to create a custom (context aware) transformation which uses both HTML markup and contents (e.g. contents of

    tag)

    This is what I'm proposing.

    but this can be a lot of work (I've tried it using XSLT but I gave it up).

    IMHO, it's preferable to cluttering up the build system with ReST-specific features. The nroff manual pages are generated without requiring anything beyond one rule in Html.make and one in man/Makefile. An added advantage is that any errors which occur while generating them result in the corresponding module being listed in the error.log file.

    As it stands, I'm inclined to revert most of https://trac.osgeo.org/grass/changeset/52656 (other than the fixes to v.in.ogr.html, which should have been a separate commit). Also https://trac.osgeo.org/grass/changeset/52459, unless there's some other use for it.

    landam commented 5 years ago

    Comment by neteler on 29 Aug 2012 07:56 UTC Replying to [comment:32 glynn]:

    The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

    While well defined, it is much easier and not that invasive to directly render the module parameters/flags descriptions in ReST.

    IMHO, it's preferable to cluttering up the build system with ReST-specific features.

    A Makefile guru may well see that the current approach could be simplified (rather than ditched).

    The usage of Sphinx offers capabilities we cannot achieve in a different way from the current HTML documentation. And the current HTML core pages will remain as before, just an additional output is rendered:

    HTML core page (as present) --+
                                  |
    g.parser --> HTML --------.---+---> HTML as currently
    
    HTML core page (as present) --+
                                  |
    g.parser --> REST ------------+---> pandoc ---> Sphinx ---> additional alternative
                                                                HTML with search

    landam commented 5 years ago

    Comment by hellik on 29 Aug 2012 08:41 UTC Replying to [comment:33 neteler]:

    Replying to [comment:32 glynn]:

    The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

    While well defined, it is much easier and not that invasive to directly render the module parameters/flags descriptions in ReST.

    maybe related:

    http://lists.osgeo.org/pipermail/grass-commit/2012-August/023889.html

    Log: Make --html-description output easier to parse Add ReST generator

    landam commented 5 years ago

    Comment by glynn on 29 Aug 2012 18:52 UTC Replying to [comment:33 neteler]:

    The output of the --interface-description switch has a well-defined format, so the relevant information can readily be extracted from the generated HTML file e.g. using a Python script based upon tools/g.html2man/html.py.

    While well defined, it is much easier and not that invasive to directly render the module parameters/flags descriptions in ReST.

    IMHO, it's preferable to cluttering up the build system with ReST-specific features.

    A Makefile guru may well see that the current approach could be simplified (rather than ditched).

    The current approach mirrors the mechanism used to generate the HTML files, which is significantly more involved than the mechanism used to generate the manual pages from the completed HTML files. If we can generate the ReST files directly from the completed HTML files (and there's no fundamental reason why we can't), it would simplify the build process somewhat.

    In https://trac.osgeo.org/grass/changeset/52956, I've modified the --html-description output to make it easier to parse (adding DIV tags around various sections) and added a script to generate ReST output from the completed HTML pages.
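
A minimal sketch of the general idea of producing reST from the completed HTML with a small Python parser, here turning `<dt>`/`<dd>` pairs into a reST definition list. It is not the script from the changeset above, and it does not rely on the DIV structure added there, since that markup is not shown in this ticket. The names `DefinitionListToRest` and `dl_to_rest` are hypothetical.

```python
from html.parser import HTMLParser


class DefinitionListToRest(HTMLParser):
    """Collect <dt>/<dd> pairs, tolerating missing </dt> and </dd> end tags."""

    def __init__(self):
        super().__init__()
        self.items = []     # list of [term, definition] pairs
        self.target = None  # 0 = collecting a term, 1 = collecting a definition

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            self.items.append(["", ""])
            self.target = 0
        elif tag == "dd" and self.items:
            self.target = 1

    def handle_endtag(self, tag):
        if tag == "dl":
            self.target = None  # text after the list is not part of any item

    def handle_data(self, data):
        if self.target is not None:
            self.items[-1][self.target] += data


def dl_to_rest(html_text):
    """Return a reST definition list built from the <dt>/<dd> pairs found."""
    parser = DefinitionListToRest()
    parser.feed(html_text)
    parser.close()
    lines = []
    for term, definition in parser.items:
        lines.append(" ".join(term.split()))
        # The definition is indented relative to its term.
        lines.append("    " + " ".join(definition.split()))
        lines.append("")
    return "\n".join(lines)
```

Inline markup such as bold is dropped in this sketch; a real converter would map it to the reST equivalent.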

    landam commented 5 years ago

    Comment by glynn on 20 Sep 2012 07:42 UTC Replying to [comment:32 glynn]:

    As it stands, I'm inclined to revert most of https://trac.osgeo.org/grass/changeset/52656

    Done in https://trac.osgeo.org/grass/changeset/53240.

    I've kept the v.in.ogr.html fixes, as well as the various Python scripts (which aren't being used), but reverted the Makefile changes.

    If you need help on doing this correctly (i.e. like how the manual pages are built), or additional changes to the HTML format, please ask.

    landam commented 5 years ago

    Comment by neteler on 4 Nov 2012 19:52 UTC Replying to [comment:36 glynn]:

    I've kept the v.in.ogr.html fixes, as well as the various Python scripts (which aren't being used), but reverted the Makefile changes.

    For the record, the topics have been reinstated in https://trac.osgeo.org/grass/changeset/53525 and https://trac.osgeo.org/grass/changeset/53526.

    If you need help on doing this correctly (i.e. like how the manual pages are built), or additional changes to the HTML format, please ask.

    The topics page (http://grass.osgeo.org/grass70/manuals/html70_user/topics.html) should become two or three column...

    landam commented 5 years ago

    Comment by timmie on 6 Apr 2013 11:33 UTC So where are we now?

    • Keeping the HTML docs is apparently the consensus
    • The current docs are not searchable
    • So can the technology behind the Sphinx search be used for GRASS?
    • What about other approaches like Whoosh [1]?

    Ideally, the search would be on the website but also in the wxGui.

    [1] http://pythonhosted.org/Whoosh/intro.html
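
For comparison, the core of a full text search over the existing HTML pages is small even without Sphinx or Whoosh. A minimal sketch using only the Python standard library, assuming the pages are available as a dict of page name to HTML text; all names (`TextExtractor`, `page_words`, `build_word_index`, `search`) are hypothetical, and there is no ranking or stemming.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping script and style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def page_words(html_text):
    """Return the set of lowercase words in the visible text of a page."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower()))


def build_word_index(pages):
    """Map each word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, html_text in pages.items():
        for word in page_words(html_text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted page names that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

With such an index, the original use case becomes `search(index, "remove mapset")`, which returns every page mentioning both words.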

    landam commented 5 years ago

    Modified by timmie on 6 Apr 2013 11:38 UTC

    landam commented 5 years ago

    Comment by wenzeslaus on 6 Mar 2014 03:43 UTC Replying to [comment:38 timmie]:

    So where are we now?

    I'm interested too, the last commit linked here is a revert (https://trac.osgeo.org/grass/changeset/53240).

    Keeping the HTML docs is apparently the consensus

    Of course, this is the format how it is stored and the resulting pages can be even using some JavaScript or some additional processing during compilation (this wasn't really explored so far).

    The current docs are not searchable

    Using Sphinx would be a nice workaround to get time to solve our custom search.

    So can the technology behind the Sphinx search be used for GRASS?

    Wouldn't this be much harder than using Sphinx and our HTML together? Sphinx can still be better for Python developers while HTML would be for other users?

    I guess that using Sphinx parts would be more difficult than using some standalone package (but really just guessing).

    What about other approaches like Whoosh?

    And what about some JavaScript solutions at least for keywords, labels, descriptions and names?

    Ideally, the search would be on the website but also in the wxGui.

    There is a different search in the wxGUI in ''Layer Manager'', ''Search modules'' tab. You can search the modules according to keywords, label, description and name (all at once). To get the documentation you currently have to open the module dialog/form and go to the ''Manual'' tab. A better way would be to open the manual page directly from the ''Search modules'' tab. A similar thing is implemented in the extension manager/addons installer. As for searchable manual pages in the GUI, I'm not sure what would be the easiest way to implement this.

    ''PS: [1] is interpreted by Trac as changeset link while http://abc.org Abc is interpreted as link with text. And note that Trac syntax for bullet lists is "space-star-space":''

     * dsd
     * sdsasd

    ''It would be really great to have http://trac.osgeo.org/osgeo/ticket/592 solved, so we would see the live preview of the ticket (instead of pressing Preview button at the bottom of the page).''

    landam commented 5 years ago

    Comment by wenzeslaus on 15 Feb 2015 02:36 UTC As I was saying, perhaps some JavaScript which would go through some JSON or XML file would be enough? Search could be graphically incorporated in the same way as TOC. The JSON/XML file would be generated during build and would contain name, label, description and keywords for each module. This wouldn't be full text but it is good enough. It works well (enough) in ''Search modules'' tab in wxGUI.

    This search could be attached to each page but there could be also a separate page. However, this wouldn't work so well I think (it would be less prominent). The file with the metadata would be quite large but with today's web and loading on demand it could work. Some special care would have to be done for the local pages.

    Somebody interested in some JavaScript development?
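
A minimal sketch of the metadata file and the matching logic described above. The comment proposes JavaScript for the browser side; Python is used here only to show the shape of the generated JSON and the "all fields at once" search. The function names and the sample records in the usage note are hypothetical.

```python
import json

FIELDS = ("name", "label", "description", "keywords")


def build_module_index(modules):
    """Serialise module metadata (a list of dicts) to JSON text."""
    records = [{field: module.get(field, "") for field in FIELDS} for module in modules]
    return json.dumps(records, indent=1, sort_keys=True)


def search_modules(index_json, term):
    """Return names of modules where any indexed field contains the term."""
    needle = term.lower()
    hits = []
    for record in json.loads(index_json):
        keywords = record["keywords"]
        if isinstance(keywords, list):
            keywords = " ".join(keywords)
        haystack = " ".join(
            [record["name"], record["label"], record["description"], keywords]
        )
        if needle in haystack.lower():
            hits.append(record["name"])
    return hits
```

For example, an index built from `{"name": "r.example", "description": "Sample raster module", "keywords": ["raster"]}` is found by `search_modules(index, "raster")`. The same JSON could be loaded by a small script on each manual page.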

    landam commented 5 years ago

    Modified by @landam on 12 May 2016 06:42 UTC

    landam commented 5 years ago

    Comment by @landam on 5 Jun 2016 10:03 UTC There were some attempts for sphinx support, changing milestone to 7.2

    landam commented 5 years ago

    Comment by neteler on 28 Dec 2016 15:04 UTC Ticket retargeted after milestone closed

    landam commented 5 years ago

    Comment by wenzeslaus on 4 Mar 2017 18:08 UTC The current state:

    As mentioned above, the ''Modules'' tab supports search in keywords, names and descriptions. In trunk (for 7.4), there is also an ''Advanced search'' button which opens https://grass.osgeo.org/grass77/manuals/g.search.modules.html which is available since 7.2 and can do full text search in manual pages. Also, there is Google search on the main website, which needs a change from http:// to https:// and searches through the website and all documentation.

    https://grass.osgeo.org/documentation/search-engine

    Sphinx is used for Python documentation and has search (which has full text):

    https://grass.osgeo.org/grass72/manuals/libpython

    Doxygen is used for C documentation which also has its own search (which doesn't have full text):

    https://grass.osgeo.org/programming7

    landam commented 5 years ago

    Comment by neteler on 4 Mar 2017 18:16 UTC Replying to [comment:45 wenzeslaus]:

    Also, there is Google search on the main website which needs change from http:// to https:// and searches through the website and all documentation.

    Done.

    https://grass.osgeo.org/documentation/search-engine

    landam commented 5 years ago

    Comment by neteler on 26 Jan 2018 11:40 UTC Ticket retargeted after milestone closed

    landam commented 5 years ago

    Modified by neteler on 12 Jun 2018 20:48 UTC

    landam commented 5 years ago

    Modified by @landam on 25 Sep 2018 12:24 UTC