open-simh / simh

The Open SIMH simulators package
https://opensimh.org/

Would you accept a PR to port the docs to a more approachable format? #6

Open Alhadis opened 2 years ago

Alhadis commented 2 years ago

This is something I've been itching to do for years: replace those ugly binary files with files written in a lightweight markup language like reStructuredText or AsciiDoc (both of which are rendered like Markdown on GitHub).

This wouldn't be an automated conversion, as tools like Pandoc have (at least in my experiments) discarded structurally relevant info that I assume is represented internally in the .doc files in some purely presentational fashion.

Aside from the obvious benefits of diffing and maintainability, it also permits nifty features GitHub adds to its rendered markup, like section navigation menus and even diagrams.

I'm happy to do the heavy lifting if the answer to the above is an affirmative.

pkoning2 commented 2 years ago

FWIW, here's an announcement that just went out on the GCC mailing list:

Hi.

Tomorrow in the morning (UTC time), I'm going to migrate the documentation
to Sphinx. The final version of the branch can be seen here:

$ git fetch origin refs/users/marxin/heads/sphinx-final
$ git co FETCH_HEAD 

URL: https://splichal.eu/gccsphinx-final/

TL;DR;

After the migration, people should be able to build (and install) GCC even
if they miss Sphinx (similar happens now if you miss makeinfo). However, please
install Sphinx >= 5.3.0 (for manual and info pages - only *core* package is necessary) [1]

Steps following the migration:

1) update of web HTML (and PDF documentation) pages:
  I prepared a script and tested our server has all what we need.
2) gcc_release --enable-generated-files-in-srcdir: here I would like
  to ask Joseph for cooperation
3) URL for diagnostics (used for warning) - will utilize [3]
4) package source tarballs - https://gcc.gnu.org/onlinedocs/ (listed here)
5) updating links from [gcc.gnu.org](http://gcc.gnu.org/) that point to documentation
6) removal of the further Texinfo leftovers
...

Cheers,
Martin

[1] https://splichal.eu/scripts/sphinx/gccint/_build/html/source-tree-structure-and-build-system/the-gcc-subdirectory/building-documentation.html#sphinx-install
[2] ./maintainer-scripts/update_web_docs_git.py
[3] https://pypi.org/project/sphinx-redirect-by-id/

Alhadis commented 1 year ago

Alright, great news: I've finished porting each of the documents to reStructuredText format, which you can see at my fork.

I still need to set up a pipeline and document the procedures necessary to rebuild the docs from a local checkout. In the meantime, everybody is welcome to review the .rst files and give their feedback.

pkoning2 commented 1 year ago

Very nice! I used sphinx-quickstart to cobble up a conf.py file, then tried a few sphinx-build flavors. HTML works fine and looks good. EPUB ditto. "latexpdf" gave oodles of error messages about bad UTF-8 codes, and when I bypassed those it produced a PDF that looked reasonably ok but likely will need work. Actually, for PDF I would assume the better answer is a PDF per top level RST file (like the separate HTML files), not a single file containing everything.

Did you create these files by hand or use a conversion tool? I have some docs for other projects that I could do in RST format as well. (I originally used DOC and later converted them to LyX (LaTeX basically) which isn't as convenient and not really necessary given the content.)

pkoning2 commented 1 year ago

BTW, some output flavors don't build. "man" gives me an exception that seems to be an internal error, and "text" complains about nested tables that aren't supported. "info" works but gives some warning messages about bad cross reference names.
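For anyone wanting to repeat this experiment, here is a minimal sketch that tries each of the output flavors mentioned above in turn. It assumes `sphinx-build` is on the PATH and uses its make mode (`-M`); the function names and any directory paths passed in are placeholders for illustration, not the fork's actual layout.

```python
import subprocess

# Make-mode targets matching the output flavors mentioned in this thread.
TARGETS = ["html", "epub", "latexpdf", "man", "text", "info"]


def build_commands(source_dir, build_dir, targets=TARGETS):
    """Return one sphinx-build make-mode command line per target."""
    return [
        ["sphinx-build", "-M", target, source_dir, build_dir]
        for target in targets
    ]


def try_builds(source_dir, build_dir, targets=TARGETS):
    """Run each build; map target name -> True if it exited with status 0."""
    results = {}
    commands = build_commands(source_dir, build_dir, targets)
    for target, command in zip(targets, commands):
        # Sphinx writes its own warnings and errors to the terminal.
        results[target] = subprocess.run(command).returncode == 0
    return results
```

Calling `try_builds(".", "_build")` from the directory holding `conf.py` would give a quick pass/fail summary per flavor. Note that a zero exit status does not mean the output is free of warnings.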

pkoning2 commented 1 year ago

Re my earlier question: I found that "pandoc" can convert to RST format, and it does a decent job.
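As a rough sketch of what such a conversion looks like, the helper below only assembles a pandoc command line. The input format is an assumption (the comment does not say which format was converted; `docx` is used here as the default), and the helper name and file names are hypothetical.

```python
from pathlib import Path


def pandoc_to_rst_command(source, input_format="docx"):
    """Build a pandoc command that converts `source` to an .rst file
    with the same base name. The input format is an assumption."""
    source = Path(source)
    target = source.with_suffix(".rst")
    return [
        "pandoc",
        "--from", input_format,
        "--to", "rst",
        "--output", str(target),
        str(source),
    ]
```

The resulting list can be passed to `subprocess.run(...)`. The converted output would still need a manual review pass for any structure lost along the way.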

There's a structure question we need to figure out. Sphinx thinks of the entire set of RST files for a project as a single document. With "html" output you get multiple output files, but that's only to allow for partial downloading. When you read the document you still get the appearance of a single manual (the top level content given by index.rst). This works, but the content looks a bit strange. For example, right now each document contains a copy of the license. For a stand-alone document that may be reasonable; for a chapter of a big document it is redundant.

It may make sense to do some grouping, in other words make a two level TOC with top level entries for the different manufacturers, so all DEC emulators appear underneath an entry for DEC, and ditto for IBM and HP.
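One possible way to get that grouping, sketched under assumptions: generate an `index.rst` with one captioned `toctree` per manufacturer. The document names in the usage note (`pdp11_doc`, `vax_doc`, and so on) and the title are illustrative placeholders, not the actual file names in the fork.

```python
def grouped_index(title, groups):
    """Return index.rst text with one captioned toctree per manufacturer.

    `groups` maps a manufacturer name to a list of document names
    (RST file names without the .rst extension).
    """
    lines = [title, "=" * len(title), ""]
    for manufacturer, documents in groups.items():
        lines += [
            ".. toctree::",
            "   :maxdepth: 1",
            f"   :caption: {manufacturer}",
            "",
        ]
        # Entries must be indented to sit inside the directive body.
        lines += [f"   {doc}" for doc in documents]
        lines.append("")
    return "\n".join(lines)
```

For example, `grouped_index("SIMH Documentation", {"DEC": ["pdp11_doc", "vax_doc"], "IBM": ["i1401_doc"], "HP": ["hp2100_doc"]})` yields three toctree blocks, in that order, each labelled with its manufacturer.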

Alhadis commented 1 year ago

> BTW, some output flavours don't build.

I know. The build pipeline will address the issues with preparing non-HTML output. Part of this involves on-the-fly conversion of GitHub-optimised .rst files into something more robust, Sphinx-friendly, and format agnostic. This everted approach is a little circuitous, but it guarantees the most readable output from docutils(1), sphinx-build(1), and GitHub's markup-rendering sandbox.

> When you read the document you still get the appearance of a single manual (the top level content given by index.rst)

That's… an issue we've yet to discuss. The original SIMH manuals were written in a style best described as "monolithic"; there's a lot of repeated material between documents (often with annoyingly-subtle discrepancies), and references to other manuals take the form of a cited section name or number. Clearly, these were written with physical output in mind (PDF viewers notwithstanding), so in order to restructure these docs to be more… website-ish… they'll require major editorial changes: something which I've deliberately avoided for the time being.

> For example, right now each document contains a copy of the license. For a stand-alone document that may be reasonable; for a chapter of a big document it is redundant.

Yeah, that ties in with my earlier observation about the manuals being written "in isolation"; that is, in a format that remains as useful to a reader online as it would in physical, hardcopy form. Obviously, that runs counter to web-publishing practices…

> Did you create these files by hand or use a conversion tool?

By hand. That's kinda why it's taken me this long. 😅

bscottm commented 8 months ago

@Alhadis: What ever happened to this effort?

Alhadis commented 8 months ago

> What ever happened to this effort?

@bscottm It was on hold for most of 2023 while I dealt with real-life issues. Sorry, I'm back now and intend to finish this damn thing ASAP. 😞