* Simple example for the Modulhandbuch
This is a simple example of what the future Modulhandbuch (module
handbook) will look like. It is hosted using GitHub Pages:
[[https://vindaar.github.io/MHB_test]]
** Project structure
There is one "Modulhandbuch" for each degree and for each
"Prüfungsordnung" (examination regulations). The list of current handbooks is:
- =BSPHYSIK=: B.Sc. of Physics (Prüfungsordnung of 2006)
- =BSPHYSIK2=: B.Sc. of Physics (Prüfungsordnung of 2014)
- =MSPHYSIK=: M.Sc. of Physics (Prüfungsordnung of 2006)
- =MSPHYSIK2=: M.Sc. of Physics (Prüfungsordnung of 2014)
- =MSASTRO=: M.Sc. of Astrophysics (Prüfungsordnung of 2006)
- =MSASTRO2=: M.Sc. of Astrophysics (Prüfungsordnung of 2014)
- =LVANDERE=: Lehrveranstaltungen anderer Fächer (courses of other subjects)
  - contains physics modules that may be taken by students as the
    first elective module (TODO: what's the term for those modules?)
  - contains physics modules people from other degrees might take
For each degree there is one subdirectory in the =content= directory
whose name is =mhb_= followed by the degree name in the notation given
above, in lowercase letters (e.g. =mhb_bsphysik=).
The structure of each Modulhandbuch is represented by a tree of directories
containing =_index.md= markdown files. For each course there is
another subdirectory, which again contains such a markdown file.
In Hugo these =_index.md= files are called [[https://gohugo.io/templates/lists/][list files]], i.e. their goal normally
would be to list all posts found in a given directory. In our use case we
(ab-)use them to map our module / course structure directly to a Hugo structure.
TODO: an alternative / possibly an improvement would be to use a regular "post"
file (i.e. a named markdown file within a directory e.g. =physik110=, which
contains =physik111.md=).
An example tree as such looks like:
#+begin_src sh
basti at void in ~/src/tinyIap/staticExample/content/docs/mhb_bsphysik ツ tree
.
├── _index.md
├── math140
│   ├── _index.md
│   └── math141
│       └── _index.md
├── math240
│   ├── _index.md
│   └── math241
│       └── _index.md
├── math340
│   ├── _index.md
│   └── math341
│       └── _index.md
├── physik110
│   ├── _index.md
│   ├── ph110.pdf
│   ├── physik111
│   │   └── _index.md
│   └── physik112
│       └── _index.md
...
#+end_src
*** Data and page generation
Each of these =_index.md= files itself is essentially empty, aside from a few
pieces of information about the name of the module/course and some optional
tags.
Consider the list file of the =physik110= module:
#+begin_src markdown
+++
weight = 10
title = "physik110"
degree = "bsphysik"
tags = ["bsphysik", "physik110"]
categories = ["module"]
+++
{{< genModulePage >}}
#+end_src
The part within the =+++= is called the [[https://gohugo.io/content-management/front-matter/]["front matter"]] in Hugo and it may
contain metadata about each file. The fact that it starts and stops using =+++=
indicates that the front matter body is written in [[https://github.com/toml-lang/toml][TOML]] (Hugo supports TOML,
YAML, JSON and Org mode).
A short explanation of the fields present:
- =weight=: the weight associated with this module. The weights are used to
  determine the order of the modules in the menu.
  TODO: assign correct weights to all modules!
- =title=: the name of the module / course
- =degree=: the name of the degree in the notation from above (e.g. =BSPHYSIK=)
  in lowercase letters
- =tags=:
  - for modules: the degree and the module name
  - for courses: the degree, the course name and its parent module
- =categories=: the kind of this file: a =module= or a =course=
Here the =tags= and =categories= fields are [[https://gohugo.io/content-management/taxonomies/][taxonomies]] in Hugo terms. These
essentially just create auto-generated pages, which list all pages that have the
same value (same tag or same category). These two are the default taxonomies;
additional ones can be added.
Finally, the line containing ={{< genModulePage >}}= uses [[https://golang.org/pkg/text/template/][Go text template]]
syntax. Hugo is essentially built on top of such templates: it uses them as
string interpolation code to build HTML pages from individual snippets and
template functions.
In particular a template showing up in a markdown file is a [[https://gohugo.io/content-management/shortcodes/][Hugo shortcode]] (in
this case a [[https://gohugo.io/templates/shortcode-templates/][custom shortcode]]).
From a programming perspective shortcodes and in general Go templates can be
considered a crude programming language. A shortcode is a function, which can
take arguments and returns markdown or HTML strings.
Shortcodes however are pretty restricted. The majority of Hugo templating is
done via full [[https://golang.org/pkg/html/template/][HTML focussed templates]] (a Go library to build HTML using the
above mentioned Go text templates).
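The list files described in this section are small enough to be generated
by a script. The following is only a minimal sketch, not part of the
repository: the function name =gen_index_md= and its argument order are made
up for illustration, and it assumes that course pages use a =genCoursePage=
shortcode in the same way module pages use =genModulePage=.

#+begin_src sh
#!/bin/sh
# Sketch: print the content of an _index.md list file to stdout.
# Usage: gen_index_md KIND DEGREE NAME WEIGHT [PARENT_MODULE]
#   KIND is "module" or "course"; PARENT_MODULE is only used for courses.
gen_index_md() {
  kind=$1 degree=$2 name=$3 weight=$4 parent=${5:-}
  if [ "$kind" = "course" ]; then
    # courses are tagged with degree, course name and parent module
    tags="\"$degree\", \"$name\", \"$parent\""
    shortcode="genCoursePage"   # assumption: shortcode used for courses
  else
    # modules are tagged with degree and module name
    tags="\"$degree\", \"$name\""
    shortcode="genModulePage"
  fi
  cat <<EOF
+++
weight = $weight
title = "$name"
degree = "$degree"
tags = [$tags]
categories = ["$kind"]
+++
{{< $shortcode >}}
EOF
}
#+end_src

Calling =gen_index_md module bsphysik physik110 10= prints the =physik110=
list file shown above; redirect the output into the =_index.md= of the
target directory.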
*** TOML data files
The markdown files introduced in the previous section need to be
filled with data. The shortcodes used inside the markdown file
internally call multiple Hugo templates. These templates get their
data from the TOML files stored in [[./data/]].
From highest to lowest level, the TOML abstraction is as follows.
**** 'Prüfungsordnung' course map
Each 'Prüfungsordnung' (PO) has a file whose name is the short name of
the PO followed by:
#+begin_src
_course_map.toml
#+end_src
Three different POs are defined as of now:
- po2006: 'Prüfungsordnung' of the year 2006
- po2014: 'Prüfungsordnung' of the year 2014. Each degree in it is
suffixed by a =2=.
- other: list for all courses not part of the physics degree
Each of these TOML files simply lists the different degrees in that
PO and, for each degree, all *courses* (not modules!) in that
degree. For example, a shortened version of the
=po2014_course_map.toml= file is:
#+begin_src toml
Degrees = ["BSPHYSIK2", "MSPHYSIK2", "MSASTRO2"]
BSPHYSIK2 = ["math241", "..."]
MSASTRO2 = ["astro608", "..."]
MSPHYSIK2 = ["physics606", "..."]
#+end_src
where =Degrees= lists the degrees that are part of this PO and the
entry for each degree simply lists all of its courses.
The course map on the level of the PO is necessary, as courses can
appear in multiple degrees.
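To see in which degrees a course appears, the course map can be searched
directly. The helper below is a rough sketch under stated assumptions: the
name =degrees_of_course= is hypothetical, and it does a plain text match
rather than a real TOML parse, so it only works if each degree's course
list sits on a single line as in the example above.

#+begin_src sh
#!/bin/sh
# Sketch: print every degree of a course map whose list contains COURSE.
# Usage: degrees_of_course MAP_FILE COURSE
degrees_of_course() {
  map=$1 course=$2
  # Skip the `Degrees` line, keep lines that mention the quoted course
  # name, then strip everything from the `=` on to leave the degree key.
  grep -v '^Degrees' "$map" |
    grep "\"$course\"" |
    sed 's/[[:space:]]*=.*//'
}
#+end_src

Matching on the quoted name means that searching for =math24= does not
accidentally match =math241=.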
**** Degree TOML files
Each degree in each PO has three additional TOML files. Their file
names have the following structure, with the degree name following the
=mhb_= prefix:
- =mhb_.toml=: contains a list of all *modules* in
the degree and the data for each module.
- =mhb__courses.toml=: contains a list of all *courses* and
the data for each course.
- =mhb__langmap.toml=: contains the mapping of the data field
names used in the module and course TOML files to names used in the
generated HTML. Normally used to differentiate between German and
English depending on the degree (hence the =langmap= suffix).
***** Module TOML file
The basic structure is as follows, consider the beginning of the
=BSPHYSIK= (PO 2006) file:
#+begin_src TOML
ModuleList = ["physik110", "physik120", "..."] # list of *all* modules in this degree & in this file
[physik110] # name of the module as it appears in the `ModuleList`, starts a TOML sub table
mfDegree = "BSPHYSIK" # must match a string found in the PO course map
mfDegreeLong = "B.Sc. in Physik (PO von 2006)" # long version of degree (any string)
mfTitle = "Physik I (Mechanik, Wärmelehre)"
mfNum = "physik110" # e.g. physik110, must be the same as the table name (in brackets [] above)
mfCP = "10" # number of credit points for this module
mfCategory = "Pflicht" # category, compare with other modules, any value
mfSemester = "1.-2." # which semester this module belongs to
mfRequirements = ""
mfPreparation = ""
mfContent = "Mechanik-Grundlagen mit Demonstrationsversuchen, Mechanik des Massenpunktes, deformierbare Medien, Vielteilchensysteme, Wärmelehre, Relativistische Aspekte. Dazu 6 Praktikumsversuche"
mfGoals = "Einarbeitung in die Mechanik und die Wärmelehre; Erarbeitung der Phänomenologie in Vorbereitung auf den theoretischen Unterbau"
mfFormalities = '''physik111: Zulassungsvoraussetzung zur Modulteilprüfung (Klausur oder mündliche Prüfung):
erfolgreiche Teilnahme an den Übungen.
physik112: Zulassungsvoraussetzung zur Modulteilprüfung (Klausur oder mündliche Prüfung):
erfolgreiche Bearbeitung der Versuchsprotokolle, mündliche Überprüfung der Versuchsvorbereitung und Durchführung der Versuche
'''
mfLength = "2 Semester" # length in semesters of this module
mfParticipants = "ca. 200"
mfSignup = "s. https://basis.uni-bonn.de u. http://bamawww.physik.uni-bonn.de"
mfNotes = ""
mfOrder = 110.0 # should match the `weight` field in the markdown file! Menu sorting order depends on this.
CourseList = ["physik111", "physik112"] # all courses *part of this module*
["..."]
#+end_src
Most of these fields are self-explanatory. A few must contain
specific strings, as noted in the comments.
The prefix of the field names =mf= stands for "module
field". These names are mapped to proper string names in the language
map, see below.
***** Course TOML file
The course TOML file for a degree follows essentially the same
structure as the module file. The only difference is that the
fields are different and start with the prefix =cf=, "course field".
Again another example from the =BSPHYSIK= (PO 2006) degree:
#+begin_src TOML
CourseList = ["physik131", "math141", "..."] # all courses listed in this degree & in this file
[physik131] # name of the course as it appears in the `CourseList`, starts a TOML sub table
cfTitle = "EDV für Physiker und Physikerinnen" # title of the course
cfNum = "physik131" # name of the course again, as a field
cfCP = "4" # number of credit points for this course
cfWorkload = "1+2"
cfKind = "Vorlesung mit Übungen"
cfCategory = "Pflicht"
cfLanguage = "deutsch"
cfRequirements = ""
cfPreparation = ""
cfFormalities = "Zulassungsvoraussetzung zur Modulprüfung (Abschlussbericht oder Klausur): erfolgreiche Teilnahme an den Übungen"
cfLength = "1 Semester"
cfGoals = "Die Studierenden sollen mit Betriebssystemen vertraut gemacht werden, moderne Editierprogramme kennen lernen, gezielt lernen Webrecherchen durchzuführen und erste Schritte mit einer Programmiersprache machen. Die Lehrveranstaltung ist praxisbezogen und liefert damit eine solide Grundlage für den Umgang mit Rechnern im weiteren Studium"
cfContent = "Betriebssysteme: Linux, UNIX; Editierprogramme: emacs, vi; LaTeX, TeX; Postscript, ghostview, PDF; Algebrasysteme: Maple, Mathematica; Programmiersprache: C++; Plotprogramme: gnuplot, root; shellscripts; Tabellenkalkulation; Web: effiziente Recherchen, Deutung von Webadressen, Einblick in HTML"
cfLiterature = "Es werden kompakte Anleitungen zur Verfügung gestellt"
cfKindShort = "Vorl. + Üb."
cfSemester = "WS" # which semester this course is given in
cfLecturer = "Dozentinnen und Dozenten der Physik und Astronomie"
cfMail = '''exphysik@uni-bonn.de
theophysik@uni-bonn.de
astro@uni-bonn.de
'''
cfOrder = 10.0 # similar to `mfOrder`, affects the ordering in the menu and in the overview table of courses
["..."]
#+end_src
***** Language map TOML file
The =mhb__langmap.toml= simply maps the field names found in
the module and course TOML files to properly formatted strings to be
inserted into the HTML.
An example file, again for =BSPHYSIK= (PO 2006):
#+begin_src TOML
Degree = "BSPHYSIK"
[ModuleFields]
mfDegree = "Studiengang"
mfDegreeLong = ""
mfTitle = "Modul"
mfNum = "Modul-Nr."
mfModules = "Module"
mfCP = "Leistungspunkte"
mfCategory = "Kategorie"
mfSemester = "Semester"
mfParts = "Modulbestandteile"
mfCourseTitle = "LV Titel"
mfCourseNum = "LV Nr"
mfCourseKind = "LV-Art"
mfCourseLP = "LP"
mfTotalWorkload = "Aufwand"
mfCourseSemester = "Sem"
mfRequirements = "Zulassungsvoraussetzungen"
mfPreparation = "Empfohlene Vorkenntnisse"
mfContent = "Inhalt"
mfGoals = "Lernziele/Kompetenzen"
mfFormalities = "Prüfungsmodalitäten"
mfLength = "Dauer des Moduls"
mfParticipants = "Max. Teilnehmerzahl"
mfSignup = "Anmeldeformalitäten"
mfNotes = "Anmerkung"
mfKindShort = "LV-Art"
mfOrder = "Reihenfolge"
[CourseFields]
cfTitle = "Lehrveranstaltung"
cfNum = "LV-Nr."
cfCP = "LP"
cfWorkload = "SWS"
cfKind = "LV-Art"
cfCategory = "Kategorie"
cfLanguage = "Sprache"
cfRequirements = "Zulassungsvoraussetzungen"
cfPreparation = "Empfohlene Vorkenntnisse"
cfFormalities = "Studien- und Prüfungsmodalitäten"
cfLength = "Dauer der Lehrveranstaltung"
cfGoals = "Lernziele der LV"
cfContent = "Inhalte der LV"
cfLiterature = "Literaturhinweise"
cfKindShort = "LV-Art"
cfSemester = "Semester"
cfLecturer = "Dozenten"
cfMail = "email"
cfOrder = "Reihenfolge"
cfUseFor = "Verwendung"
#+end_src
As we can see, it simply maps the field names to nicer names that are
inserted into the corresponding =cf/mf= places in the Hugo
templates.
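The mapping can also be queried from the command line, e.g. to check
which label a field will get. This is a small sketch only: =label_for= is
a hypothetical helper, and it relies on plain text matching of
single-line ~name = "value"~ entries instead of parsing the TOML properly.

#+begin_src sh
#!/bin/sh
# Sketch: print the display label that a language map assigns to a field.
# Usage: label_for LANGMAP_FILE FIELD
label_for() {
  langmap=$1 field=$2
  # Match `FIELD = "label"` at the start of a line and print only the label.
  sed -n "s/^$field[[:space:]]*=[[:space:]]*\"\(.*\)\"[[:space:]]*\$/\1/p" "$langmap"
}
#+end_src

For the example file above, looking up =mfCP= would print =Leistungspunkte=
and looking up =cfCP= would print =LP=.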
*** Adding a new module / course by hand
To add a new course or module by hand, the corresponding elements
simply have to be added to the data structure described above.
**** Adding a new module
To add a new module, the following steps have to be done:
1. Add the new module name to the =ModuleList= entry in the
=mhb_.toml= file (first line), e.g. =MyNewModule=.
2. Add a new TOML table to the same TOML file by adding a
=[MyNewModule]= with all fields and their appropriate content.
3. Add a new markdown file in
=./content/docs//mhb_/MyNewModule.md=
with the contents as described further up.
Once this is done, rebuilding the site with Hugo should result in a
new module.
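Before rebuilding, steps 1 and 2 can be sanity-checked with a few lines of
shell. This is a sketch under assumptions, not an existing tool:
=check_module= is a made-up name, and the check is a crude text match that
expects =ModuleList= to be on a single line (the first line of the file, as
described above).

#+begin_src sh
#!/bin/sh
# Sketch: verify that MODULE is listed in ModuleList and has its own table.
# Usage: check_module MODULE_TOML_FILE MODULE
check_module() {
  file=$1 module=$2
  # step 1: the quoted module name must appear on the ModuleList line
  if ! grep -q "^ModuleList[[:space:]]*=.*\"$module\"" "$file"; then
    echo "$module: missing from ModuleList in $file" >&2
    return 1
  fi
  # step 2: a table header [MODULE] must start a line
  if ! grep -q "^\[$module]" "$file"; then
    echo "$module: no [$module] table in $file" >&2
    return 1
  fi
}
#+end_src

The function returns a non-zero status and prints a message to =stderr= if
either piece is missing, so it can be used in an =if= or before calling Hugo.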
**** Adding a new course
Adding a new course is almost the same as adding a new module, but
requires one more step.
1. Add the new course to the =_course_map.toml= file,
adding it to all degree lists in which the course may appear.
2. Add the new course to *all* =mhb__courses.toml= files, in
which the course will appear. If it's for a single degree, only a
single file needs to be modified. In these files add the new course
=MyNewCourse= to the =CourseList= in the first line.
3. Then add a new table =[MyNewCourse]= to each file and fill the
fields as needed.
4. Add new markdown files in
=./content/docs//mhb_/courses/MyNewCourse.md=,
where you just need to make sure to adjust the degree in each
subdirectory accordingly.
After this, a rebuild with Hugo should show a new course.
*** TODO Things still to be explained
- finish the above:
- partial templates: what are they, where are they
- distinguish genCoursePage and genModulePage
- give brief idea about how they work
** PDF generation
PDF generation is done using [[https://pandoc.org][pandoc]] from the final HTML pages generated by
Hugo.
These pages contain all sorts of additional information that is not of
interest for the PDF of a module/course. Therefore we extract the =<article>= tag
from each page and hand only that to =pandoc=.
To perform the tag extraction we use =xmllint=. For a given HTML file =fname=:
#+begin_src sh
xmllint --nowarning --html --xpath '/html/body/main/div/article' "$fname" 2> /dev/null
#+end_src
Due to some technically invalid tags (duplicates) in the generated HTML files,
we discard =stderr=.
Furthermore, the default TeX template used by =pandoc= has margins that are
too wide. The =mhb_pandoc_template.latex= file is a slightly modified version of
=pandoc='s default TeX template (which can be [[https://pandoc.org/MANUAL.html#templates][extracted using]] =pandoc -D latex=)
with a margin of 2 cm.
Conversion of a single page is thus:
#+begin_src sh
xmllint --nowarning --html --xpath '/html/body/main/div/article' "$fname" 2> /dev/null | \
pandoc -f html -t latex --template mhb_pandoc_template.latex -o .pdf
#+end_src
which will generate an =.pdf= of the module/course.
For the full command to generate all PDFs see the [[./.github/workflows/gh-pages.yml][GitHub Actions workflow]].
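For local experiments, the single-page conversion can be wrapped in a loop
over all generated pages. The following is a sketch only and is *not* the
command from the workflow file: the function names are hypothetical, and
writing each PDF next to its HTML file (same name, =.pdf= extension) is an
assumption made for illustration.

#+begin_src sh
#!/bin/sh
# Sketch: convert one generated HTML page to a PDF.
# Usage: convert_page HTML_FILE
convert_page() {
  html=$1
  pdf=${html%.html}.pdf   # assumption: PDF is written next to the HTML file
  xmllint --nowarning --html --xpath '/html/body/main/div/article' "$html" 2> /dev/null |
    pandoc -f html -t latex --template mhb_pandoc_template.latex -o "$pdf"
}

# Sketch: convert every HTML file found below ROOT.
# Usage: convert_all ROOT
convert_all() {
  find "$1" -type f -name '*.html' | sort | while IFS= read -r page; do
    convert_page "$page"
  done
}
#+end_src

Note that this visits every HTML file below the given directory, so in
practice the =find= expression probably needs to be narrowed down to the
module and course pages.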
** Neat features
Just a few features I personally like, either because I would have
enjoyed them back when I was a student or because they could improve
the module handbook in some way.
- working search functionality (yay!)
- multi-language support for the full website, so one could in
  principle even offer e.g. the B.Sc. handbooks with English headers or the
  M.Sc. handbooks with German headers (and theoretically content, of course)
- KaTeX (similar to MathJax) for inline LaTeX equations (well,
  probably not too important for the module handbook, unless something
  in the descriptions needs it)
- dark theme
** Some notes
Generate the table of contents from the database when converting, to
obtain something like the following table of contents.
*UPDATE*: We do not even have to generate the subtrees manually. Only
the degrees have to be added manually; the rest is done automagically.
#+begin_src markdown
- [**B.Sc. Physik**]({{< relref "/docs/mhb_bsphysik" >}})
- [**M.Sc. Physics**]({{< relref "/docs/mhb_msphysik" >}})
- [**Module**]({{< relref "/docs/module" >}})
  - [Modul 1]({{< relref "/docs/module/modul1" >}})
    - [Course 1.1]({{< relref "/docs/module/modul1/course1" >}})
    - [Course 1.2]({{< relref "/docs/module/modul1/course2" >}})
  - [Modul 2]({{< relref "/docs/module/modul2" >}})
    - [Course 2.1]({{< relref "/docs/module/modul2/course1" >}})
#+end_src
where the module / course pages are generated in the same way as the
PDF is currently.
Module / course structure is represented by the directory structure.
Each page has the actual content that is found in the PDF then.
PDF creation can be achieved reasonably well already with a default
=pandoc= call:
#+begin_src sh
pandoc _index.md -o .pdf
#+end_src
will create a single PDF for a module, which looks really good.
See number 14 here:
https://pandoc.org/demos.html
for an idea of how we might generate a full handbook for all pages
(e.g. we walk the =content= directory, append the content of each
markdown file found to a single file and use a custom TeX header). With
that we should be able to generate a beautiful PDF, into which things
like the overview page can then be inserted using =pdfunite=, =pdftk=
or similar.
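The "walk the =content= directory and append" idea could look roughly like
the sketch below. It is an illustration under assumptions, not a tested
part of the build: =concat_markdown= is a hypothetical name, files are
appended in sorted path order (which need not match the menu order), and
the TOML front matter between the =+++= lines is dropped.

#+begin_src sh
#!/bin/sh
# Sketch: append all markdown files below CONTENT_DIR into one OUT_FILE.
# Usage: concat_markdown CONTENT_DIR OUT_FILE
# OUT_FILE should live outside CONTENT_DIR, otherwise find picks it up too.
concat_markdown() {
  content_dir=$1 out=$2
  : > "$out"   # start with an empty output file
  find "$content_dir" -type f -name '*.md' | sort | while IFS= read -r md; do
    # Drop the front matter (the block between the two +++ lines).
    sed '/^+++$/,/^+++$/d' "$md" >> "$out"
    printf '\n' >> "$out"   # blank line between documents
  done
}
#+end_src

The combined file can then be handed to =pandoc= in a single call. Keep in
mind that =pandoc= does not expand Hugo shortcodes, so pages whose body is
only a shortcode contribute that shortcode line literally.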