Vindaar / MHB_test

This is a simple example of what the future Modulhandbuch will look like. It is hosted using GitHub Pages:

[[https://vindaar.github.io/MHB_test]]

** Project structure

There is one "Modulhandbuch" for each degree and for each "Prüfungsordnung". The list of current handbooks is:

For each degree there is one subdirectory in the =content= directory starting with =mhb_= where == is the notation given above.

The structure of each Modulhandbuch is represented by a tree of directories containing =_index.md= markdown files. Each course gets its own subdirectory, which again contains such a markdown file.

In Hugo these =_index.md= files are called [[https://gohugo.io/templates/lists/][list files]], i.e. their goal normally would be to list all posts found in a given directory. In our use case we (ab-)use them to map our module / course structure directly to a Hugo structure.

TODO: an alternative / possibly an improvement would be to use a regular "post" file (i.e. a named markdown file within a directory e.g. =physik110=, which contains =physik111.md=).

An example of such a tree:

#+begin_src sh
basti at void in ~/src/tinyIap/staticExample/content/docs/mhb_bsphysik ツ tree .
├── _index.md
├── math140
│   ├── _index.md
│   └── math141
│       └── _index.md
├── math240
│   ├── _index.md
│   └── math241
│       └── _index.md
├── math340
│   ├── _index.md
│   └── math341
│       └── _index.md
├── physik110
│   ├── _index.md
│   ├── ph110.pdf
│   ├── physik111
│   │   └── _index.md
│   └── physik112
│       └── _index.md
...
#+end_src
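
As a rough illustration of how this layout encodes the module / course hierarchy, the following shell function walks one degree directory and prints each module with its courses. It is only a sketch: =list_modules= is a hypothetical helper that is not part of this repository, and it assumes that every module and course directory contains an =_index.md= file as in the tree above.

#+begin_src sh
#!/bin/sh
# Hypothetical helper (not part of the repository): print every module and
# its courses for one degree directory.
# Assumed layout: <degree dir>/<module>/<course>/_index.md
list_modules() {
    root=$1
    for module in "$root"/*/; do
        # Only directories that carry a list file count as modules.
        [ -f "${module}_index.md" ] || continue
        printf 'module: %s\n' "$(basename "$module")"
        for course in "$module"*/; do
            # Same rule one level down: a course is a directory with a list file.
            [ -f "${course}_index.md" ] || continue
            printf '  course: %s\n' "$(basename "$course")"
        done
    done
}
#+end_src

Calling =list_modules content/docs/mhb_bsphysik= from the repository root would then list the modules in glob (alphabetical) order. Other files such as =ph110.pdf= are ignored because they are not directories.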

*** Data and page generation

Each of these =_index.md= files itself is essentially empty, aside from a few pieces of information about the name of the module/course and some optional tags.

Consider the list file of the =physik110= module:

#+begin_src markdown
+++
weight = 10
title = "physik110"
degree = "bsphysik"
tags = ["bsphysik", "physik110"]
categories = ["module"]
+++

{{< genModulePage >}}
#+end_src

The part within the =+++= is called the [[https://gohugo.io/content-management/front-matter/]["front matter"]] in Hugo and it may contain metadata about each file. The fact that it starts and stops using =+++= indicates that the front matter body is written in [[https://github.com/toml-lang/toml][TOML]] (Hugo supports TOML, YAML, JSON and Org mode).

A short explanation of the fields present:

Here the =tags= and =categories= fields are [[https://gohugo.io/content-management/taxonomies/][taxonomies]] in Hugo terms. These essentially just create auto-generated pages that list all pages sharing the same value (same tag or same category). These two are default taxonomies; additional ones can be added.

Finally, the line containing ={{< genModulePage >}}= is a [[https://golang.org/pkg/text/template/][Go text template]] used for string interpolation. Hugo is essentially built on top of such templates: it uses string interpolation code to build HTML pages from individual snippets and template functions. In particular, a template showing up in a markdown file is a [[https://gohugo.io/content-management/shortcodes/][Hugo shortcode]] (in this case a [[https://gohugo.io/templates/shortcode-templates/][custom shortcode]]).

From a programming perspective, shortcodes, and Go templates in general, can be considered a crude programming language. A shortcode is a function that can take arguments and returns markdown or HTML strings.

Shortcodes however are pretty restricted. The majority of Hugo templating is done via full [[https://golang.org/pkg/html/template/][HTML focussed templates]] (a Go library to build HTML using the above mentioned Go text templates).
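
Putting the pieces of this section together, a list file like the one for =physik110= could be written by a small shell function. This is a minimal sketch, not how the repository creates its files. =write_list_file= and its argument order are made up for illustration. The field names and the =genModulePage= shortcode are taken from the example above.

#+begin_src sh
#!/bin/sh
# Hypothetical helper (not part of the repository): write a Hugo list file
# with TOML front matter into a module or course directory.
# Arguments: directory, weight, title, degree, category, shortcode name.
write_list_file() {
    dir=$1 weight=$2 title=$3 degree=$4 category=$5 shortcode=$6
    mkdir -p "$dir"
    # The +++ delimiters mark the front matter as TOML; the shortcode call
    # below the front matter is the only body content.
    cat > "$dir/_index.md" <<EOF
+++
weight = $weight
title = "$title"
degree = "$degree"
tags = ["$degree", "$title"]
categories = ["$category"]
+++

{{< $shortcode >}}
EOF
}
#+end_src

For example, =write_list_file content/docs/mhb_bsphysik/physik110 10 physik110 bsphysik module genModulePage= would reproduce the file shown above. The values are passed through unchanged, so they must not contain double quotes.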

*** TOML data files

The markdown files introduced in the previous section need to be filled with data. The shortcodes used inside the markdown file internally call multiple Hugo templates. These templates get their data from the TOML files stored in [[./data/]].

From highest to lowest level, the TOML abstraction is as follows.

**** 'Prüfungsordnung' course map

Each 'Prüfungsordnung' (PO) has a file with the filename:

#+begin_src
_course_map.toml
#+end_src

where == is the short name of the PO. Three different POs are defined as of now:
- po2006: 'Prüfungsordnung' of the year 2006
- po2014: 'Prüfungsordnung' of the year 2014. Each degree in it is suffixed by a =2=.
- other: list for all courses not part of the physics degree

Each of these TOML files simply lists the different degrees in that PO and contains a list of all *courses* (not modules!) in each degree. For example, a shortened version of the =po2014_course_map.toml= file is:

#+begin_src toml
Degrees = ["BSPHYSIK2", "MSPHYSIK2", "MSASTRO2"]
BSPHYSIK2 = ["math241", "..."]
MSASTRO2 = ["astro608", "..."]
MSPHYSIK2 = ["physics606", "..."]
#+end_src

where =Degrees= lists the degrees that are part of this PO and the entry for each degree lists all of its courses.

The course map on the level of the PO is necessary, as courses can appear in multiple degrees.

**** Degree TOML files

Each degree in each PO has 3 additional TOML files. Their file names follow this pattern:
- =mhb_.toml=: contains a list of all *modules* in the degree and the data for each module.
- =mhb__courses.toml=: contains a list of all *courses* and the data for each course.
- =mhb__langmap.toml=: contains the mapping of the data field names used in the module and course TOML files to names used in the generated HTML. Normally used to differentiate between German and English depending on the degree (hence the =langmap= suffix).

***** Module TOML file

The basic structure is as follows. Consider the beginning of the =BSPHYSIK= (PO 2006) file:

#+begin_src TOML
ModuleList = ["physik110", "physik120", "..."] # list of *all* modules in this degree & in this file

[physik110] # name of the module as it appears in the `ModuleList`, starts a TOML sub table
mfDegree = "BSPHYSIK" # must match a string found in the PO course map
mfDegreeLong = "B.Sc. in Physik (PO von 2006)" # long version of degree (any string)
mfTitle = "Physik I (Mechanik, Wärmelehre)"
mfNum = "physik110" # e.g. physik110, must be the same as the table name (in brackets [] above)
mfCP = "10" # number of credit points for this module
mfCategory = "Pflicht" # category, compare with other modules, any value
mfSemester = "1.-2." # which semester this module belongs to
mfRequirements = ""
mfPreparation = ""
mfContent = "Mechanik-Grundlagen mit Demonstrationsversuchen, Mechanik des Massenpunktes, deformierbare Medien, Vielteilchensysteme, Wärmelehre, Relativistische Aspekte. Dazu 6 Praktikumsversuche"
mfGoals = "Einarbeitung in die Mechanik und die Wärmelehre; Erarbeitung der Phänomenologie in Vorbereitung auf den theoretischen Unterbau"
mfFormalities = '''physik111: Zulassungsvoraussetzung zur Modulteilprüfung (Klausur oder mündliche Prüfung): erfolgreiche Teilnahme an den Übungen.
physik112: Zulassungsvoraussetzung zur Modulteilprüfung (Klausur oder mündliche Prüfung): erfolgreiche Bearbeitung der Versuchsprotokolle, mündliche Überprüfung der Versuchsvorbereitung und Durchführung der Versuche
'''
mfLength = "2 Semester" # length in semesters of this module
mfParticipants = "ca. 200"
mfSignup = "s. https://basis.uni-bonn.de u. http://bamawww.physik.uni-bonn.de"
mfNotes = ""
mfOrder = 110.0 # should match the `weight` field in the markdown file! Menu sorting order depends on this.
CourseList = ["physik111", "physik112"] # all courses *part of this module*

["..."]
#+end_src

Most of these fields are self-explanatory. A few must contain specific strings, as noted in the comments.

The =mf= prefix of the field names stands for "module field". These names are mapped to proper string names in the language map, see below.

***** Course TOML file

The course TOML file for a degree follows essentially the same structure as the module file. The only difference is that the fields are different and start with the prefix =cf=, "course field".

Another example from the =BSPHYSIK= (PO 2006) degree:

#+begin_src TOML
CourseList = ["physik131", "math141", "..."] # all courses listed in this degree & in this file

[physik131] # name of the course as it appears in the `CourseList`, starts a TOML sub table
cfTitle = "EDV für Physiker und Physikerinnen" # title of the course
cfNum = "physik131" # name of the course again, as a field
cfCP = "4" # number of credit points for this course
cfWorkload = "1+2"
cfKind = "Vorlesung mit Übungen"
cfCategory = "Pflicht"
cfLanguage = "deutsch"
cfRequirements = ""
cfPreparation = ""
cfFormalities = "Zulassungsvoraussetzung zur Modulprüfung (Abschlussbericht oder Klausur): erfolgreiche Teilnahme an den Übungen"
cfLength = "1 Semester"
cfGoals = "Die Studierenden sollen mit Betriebssystemen vertraut gemacht werden, moderne Editierprogramme kennen lernen, gezielt lernen Webrecherchen durchzuführen und erste Schritte mit einer Programmiersprache machen. Die Lehrveranstaltung ist praxisbezogen und liefert damit eine solide Grundlage für den Umgang mit Rechnern im weiteren Studium"
cfContent = "Betriebssysteme: Linux, UNIX; Editierprogramme: emacs, vi; LaTeX, TeX; Postscript, ghostview, PDF; Algebrasysteme: Maple, Mathematica; Programmiersprache: C++; Plotprogramme: gnuplot, root; shellscripts; Tabellenkalkulation; Web: effiziente Recherchen, Deutung von Webadressen, Einblick in HTML"
cfLiterature = "Es werden kompakte Anleitungen zur Verfügung gestellt"
cfKindShort = "Vorl. + Üb."
cfSemester = "WS" # which semester this course is given in
cfLecturer = "Dozentinnen und Dozenten der Physik und Astronomie"
cfMail = '''exphysik@uni-bonn.de
theophysik@uni-bonn.de
astro@uni-bonn.de
'''
cfOrder = 10.0 # field similar to `mfOrder`, affects ordering in the menu & in the overview table of courses

["..."]
#+end_src

***** Language map TOML file

The =mhb__langmap.toml= simply maps the field names found in the module and course TOML files to properly formatted strings to be inserted into the HTML. An example file, again for =BSPHYSIK= (PO 2006):

#+begin_src TOML
Degree = "BSPHYSIK"

[ModuleFields]
mfDegree = "Studiengang"
mfDegreeLong = ""
mfTitle = "Modul"
mfNum = "Modul-Nr."
mfModules = "Module"
mfCP = "Leistungspunkte"
mfCategory = "Kategorie"
mfSemester = "Semester"
mfParts = "Modulbestandteile"
mfCourseTitle = "LV Titel"
mfCourseNum = "LV Nr"
mfCourseKind = "LV-Art"
mfCourseLP = "LP"
mfTotalWorkload = "Aufwand"
mfCourseSemester = "Sem"
mfRequirements = "Zulassungsvoraussetzungen"
mfPreparation = "Empfohlene Vorkenntnisse"
mfContent = "Inhalt"
mfGoals = "Lernziele/Kompetenzen"
mfFormalities = "Prüfungsmodalitäten"
mfLength = "Dauer des Moduls"
mfParticipants = "Max. Teilnehmerzahl"
mfSignup = "Anmeldeformalitäten"
mfNotes = "Anmerkung"
mfKindShort = "LV-Art"
mfOrder = "Reihenfolge"

[CourseFields]
cfTitle = "Lehrveranstaltung"
cfNum = "LV-Nr."
cfCP = "LP"
cfWorkload = "SWS"
cfKind = "LV-Art"
cfCategory = "Kategorie"
cfLanguage = "Sprache"
cfRequirements = "Zulassungsvoraussetzungen"
cfPreparation = "Empfohlene Vorkenntnisse"
cfFormalities = "Studien- und Prüfungsmodalitäten"
cfLength = "Dauer der Lehrveranstaltung"
cfGoals = "Lernziele der LV"
cfContent = "Inhalte der LV"
cfLiterature = "Literaturhinweise"
cfKindShort = "LV-Art"
cfSemester = "Semester"
cfLecturer = "Dozenten"
cfMail = "email"
cfOrder = "Reihenfolge"
cfUseFor = "Verwendung"
#+end_src

As we can see, it simply maps the field names to nicer names that are inserted into the corresponding =cf/mf= places in the Hugo templates.

*** Adding a new module / course by hand

To add a new course or module by hand, add the corresponding elements to the data structure described above.

**** Adding a new module

To add a new module, follow these steps:
1. Add the new module name to the =ModuleList= entry in the =mhb_.toml= file (first line), e.g. =MyNewModule=.
2. Add a new TOML table to the same TOML file by adding a =[MyNewModule]= with all fields and their appropriate content.
3. Add a new markdown file in =./content/docs//mhb_/MyNewModule.md= with the contents as described further up.

Once this is done, rebuilding the site with Hugo should result in a new module.

**** Adding a new course

Adding a new course is almost the same as adding a new module, but requires one more step.
1. Add the new course to the =_course_map.toml= file, adding it to all degree lists in which the course may appear.
2. Add the new course to *all* =mhb__courses.toml= files in which the course will appear. If it is for a single degree, only a single file needs to be modified. In these files add the new course =MyNewCourse= to the =CourseList= in the first line.
3. Then add a new table =[MyNewCourse]= to each file and fill the fields as needed.
4. Add new markdown files in =./content/docs//mhb_/courses/MyNewCourse.md=, where you just need to make sure to adjust the degree in each subdirectory accordingly.

After this, a rebuild with Hugo should show a new course.

*** TODO Things still to be explained

- finish the above:
  - partial templates: what are they, where are they
  - distinguish genCoursePage and genModulePage
  - give brief idea about how they work

** PDF generation

PDF generation is done using [[https://pandoc.org][pandoc]] from the final HTML pages generated by Hugo. Because these pages contain all sorts of additional information that is not of interest for the PDF of a module/course, we extract the =article= tag from each page and hand only that to =pandoc=.

To perform the tag extraction we use =xmllint=. For a given HTML file =fname=:

#+begin_src sh
xmllint --nowarning --html --xpath '/html/body/main/div/article' 2> /dev/null
#+end_src

Due to some technically invalid tags (duplicates) in the generated HTML files, we discard =stderr=.

Furthermore, the default TeX template used by =pandoc= uses margins that are too wide. The =mhb_pandoc_template.latex= file is a slightly modified version of =pandoc='s default TeX template (which can be [[https://pandoc.org/MANUAL.html#templates][extracted using]] =pandoc -D latex=) with a margin of 2 cm.

Conversion of a single page is thus:

#+begin_src sh
xmllint --nowarning --html --xpath '/html/body/main/div/article' | \
  pandoc -f html -t latex --template mhb_pandoc_template.latex -o .pdf
#+end_src

which will generate an =.pdf= of the module/course.

For the full command to generate all PDFs see the [[./.github/workflows/gh-pages.yml][Github action workflow]].

** Neat features

Just a few features I personally like, either because I would have enjoyed them back when I was a student or because they could improve the module handbook in some way.
- working search functionality (yay!)
- multi language support for the full website, so one could in principle even offer e.g. the B.Sc. handbooks with English headers or the M.Sc. handbooks with German headers (and theoretically content of course)
- KaTeX (similar to MathJax) for inline LaTeX equations (well, probably not too important for the module handbook, unless something in the descriptions needs it)
- dark theme

** Some notes

Generate the table of contents from the database when converting, to produce something like the table of contents below.

*UPDATE*: We do not even have to manually generate the sub trees. Only the degrees have to be manually added. The rest is done automagically.

#+begin_src markdown
- [**B.Sc. Physik**]({{< relref "/docs/mhb_bsphysik" >}})
- [**M.Sc. Physics**]({{< relref "/docs/mhb_msphysik" >}})
- [**Module**]({{< relref "/docs/module" >}})
  - [Modul 1]({{< relref "/docs/module/modul1" >}})
    - [Course 1.1]({{< relref "/docs/module/modul1/course1" >}})
    - [Course 1.2]({{< relref "/docs/module/modul1/course2" >}})
  - [Modul 2]({{< relref "/docs/module/modul1" >}})
    - [Course 2.1]({{< relref "/docs/module/modul1/course1" >}})
#+end_src

where the module / course pages are generated in the same way as the PDF is currently. The module / course structure is represented by the directory structure. Each page then has the actual content that is found in the PDF.

PDF creation can be achieved reasonably well already with a default =pandoc= call:

#+begin_src sh
pandoc _index.md -o .pdf
#+end_src

will create a single PDF for a module, which looks really good.

See number 14 here: https://pandoc.org/demos.html for an idea of how we might generate a full handbook for all pages (e.g. we walk the =content= directory and append the content of each found markdown file to a single one and use a custom TeX header). With that we should be able to generate a beautiful PDF, which can then be injected with things like the page of the overview etc. using =pdfunite=, =pdftk= or similar.
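
The "walk the =content= directory and append" idea could be sketched roughly as follows. This is an untested outline, not an existing script: =combine_markdown= is a hypothetical helper, and it assumes the markdown files hold the text that should end up in the PDF (the list files described earlier contain only front matter and a shortcode, so real content would have to be generated first).

#+begin_src sh
#!/bin/sh
# Hypothetical helper (not part of the repository): concatenate all markdown
# files below a content directory into a single file.
# Files are visited in byte-wise sorted path order; a blank line separates them.
combine_markdown() {
    content_dir=$1
    out_file=$2
    # Start from an empty output file. Keep it outside of content_dir,
    # otherwise it would be picked up by find itself.
    : > "$out_file"
    find "$content_dir" -type f -name '*.md' | LC_ALL=C sort |
    while IFS= read -r md; do
        cat "$md" >> "$out_file"
        printf '\n' >> "$out_file"
    done
}
#+end_src

The combined file could then be handed to =pandoc=, for instance as =pandoc combined.md -H header.tex -o handbook.pdf=, where =header.tex= stands for the custom TeX header mentioned above and both file names are placeholders.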