ebu / benchmarkstt

Open Source AI Benchmarking toolkit for benchmarking speech to text services
MIT License
54 stars 8 forks source link

Add UML generation, cleanup + add documentation #134

Closed MikeSmithEU closed 4 years ago

MikeSmithEU commented 4 years ago

Why?

Greatly improves upon the following guiding principles going forward:

(Ref: https://github.com/ebu/benchmarkstt/wiki/Principles)

Changes and additions

Chronologic todos

MikeSmithEU commented 4 years ago

Generated docs available at https://benchmarkstt.readthedocs.io/en/uml-schema (hidden)

MikeSmithEU commented 4 years ago

Hey Mike! Thanks for these changes, the diagrams are great! It's good to have you back :)

I've added a few comments, and a couple of extras here:

  • I didn't have PlantUML installed, so I think installation instructions should be added to development.rst, under "Building the Documentation". You could mention it's available via homebrew (brew install plantuml) or via https://plantuml.com/download.
  • Since the UML files are automatically generated, I wonder whether they should be included in git? I think I'd prefer to see docs/_static/uml in the .gitignore instead (and create the uml directory in the Makefile)

It is my intention to remove the plantuml -> svg from this PR. I agree that the uml directory should be .gitignored. I am currently having issues generating the svg's on readthedocs.org though, and have spent quite a lot of time looking for alternate solutions. The only reasonable one I've found thus far is http://www.gravizo.com/

Simply documenting how to install plantuml is not feasibly imho (or at least, I won't do it), as it would need to cover Windows, OSX and Linux targets, making sure a proper version of java is installed and provide ongoing support for this.

I've now converted this PR to draft until I can implement that, but it gets me to quite a side tangent as this is far outside the scope of this project, and the cleanest is writing a new sphinx plugin for this. I am really short on free time at the moment so this is appended to my long todo list. To be continued...

MikeSmithEU commented 4 years ago

@danielthepope Embed PlantUML, DOT, etc. diagrams in documentation using Gravizo now supported, and separated it from benchmarkstt.

https://pypi.org/project/sphinxcontrib-gravizo/

Examples Inline graph, show as png:

.. gravizo:: png
    @startuml
    Alice -> Bob: Authentication Request
    Bob --> Alice: Authentication Response

    Alice -> Bob: Another authentication Request
    Alice <-- Bob: Another authentication Response
    @enduml

Load from a file, show as svg:

.. gravizo:: ./path/to/graph.puml svg
MikeSmithEU commented 4 years ago

Docs build of uml files is now completely separated. Gravizo plugin moved to sphinxcontrib organization and added as a dependency for docs. All svg- and puml-files are no longer polluting this repository.

Since there are a few big graphs that cannot be directly generated by a GET-request to Gravizo (request uri too long as per HTTP RFC) I still need to implement a work-around for that.

MikeSmithEU commented 4 years ago

That's very cool how you've managed to make the diagrams generate in the front end! I wonder if I'm doing something wrong though, or maybe I've spotted a problem?

After generating the docs, running cddocs/build/html;python3 -m http.server I go to http://localhost:8000/modules/benchmarkstt.api.html.

A diagram doesn't show up and there's an error in the browser's console

Uncaught Error: Parse error on line 1:
classDiagram
------------^
Expecting 'NEWLINE', got 'EOF'

Sorry for the late reply, I've been indisposed for a couple of days.

Indeed, this is because for those modules there are no public classes available, so it returns an "empty" classDiagram, i.e. an unexpected EOF. I will have a look if I can either pre-check this (probably will result in messy code fixes), or contribute the functionality to "skip if empty" to the original sphinxcontrib repository (seems the better idea, but does not assure acceptance by the repo maintainer(s)). I feel this can be ignored for the time being as it does not really affect the users of the documentation.

MikeSmithEU commented 4 years ago

I feel this can be ignored for the time being as it does not really affect the users of the documentation.

Fair enough, that's fine then.

I've just realised that because you're using this new way of generating UML, you no longer need the autogen folder, so you can delete the docs/_static/autogen/.keep file.

After that, this will be good to merge 😄

Good catch, I've been experimenting with generating the svg's server side though, and thought it might still be useful (turns out even then it's not needed). Cleaned them up in the latest commit.

The svg generating server side is quite problematic it seems. It spews all kinds of weird syntax errors on the mermaid code (which does render properly using the client-side renderer or mermaid live preview). Some are related to html escaping (which I solved locally), others I haven't been able to figure out yet. Because I spent way too much time on it already I will keep it as-is. (edit: turns out I was using the wrong package, mermaid.cli != @mermaid-js/mermaid-cli)

If we can agree on the usage of yarn, I suggest this to be the ready-to-merge PR, pending approval from you. edit: Factored out yarn

Edit: let's consider this a near-ready-to-merge PR, there are a few issues I still need to solve. I.e. the textwrapping of the __init__ arguments in the diagram shows some issues that I need to fix. And there is some self-referencing arrows in the diagrams that shouldn't be there. Both these issues should be quite easy to fix (requiring much less time and effort than any of the experiments I've done before to reach this point). When these two remaining issues are taken care of (by the end of the week for sure), (edit: fixed) I will merge once you approve of my changes.

MikeSmithEU commented 4 years ago

@danielthepope This should be ready to merge once you approve...

MikeSmithEU commented 4 years ago

Factored out yarn