Digital-Humanities-Quarterly / dhq-journal

DHQ is an open-access, peer-reviewed journal of digital humanities.
http://www.digitalhumanities.org/dhq/

Set up static site generator for DHQ #40

Open amclark42 opened 1 year ago

amclark42 commented 1 year ago

Replace Apache Cocoon's dynamic transformations with static web pages and resources, compiled through an Apache Ant build file. For additional context, see the DHQ infrastructure meeting notes and the specification document.

Tasks:

We still need to decide what to do with the editorial section (which requires authentication) and redirected URLs.

amclark42 commented 1 year ago

The new stylesheet generate_static_articles.xsl already uses the TOC to generate HTML of every DHQ article. It also generates a single XML file, which attempts to map each article's source directory to its expected home upon publication. This mapping needs to be replaced.

In order for Ant to make use of file mapping, the XSLT must generate a new Ant build file, structured like this:

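<project name="dhq_articles">
  <!-- Copy each article's resources from its source directory to its published location. -->
  <target name="copyArticleResources">
    <copy todir="${toDir.path}">
      <fileset dir="${basedir}${file.separator}articles"/>
      <!-- Only the first mapper that matches a file is applied to it. -->
      <firstmatchmapper>
        <!-- One mapper per article; handledirsep lets the "/" patterns match Windows paths too. -->
        <regexpmapper from="^000654/(.*)$" to="vol/17/1/000654/\1" handledirsep="true"/>
        <regexpmapper from="^000116/(.*)$" to="vol/7/2/000116/\1" handledirsep="true"/>
      </firstmatchmapper>
    </copy>
  </target>
</project>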
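
As a rough illustration of how an XSLT could emit a build file of this shape, here is a minimal sketch. The input structure is an assumption, not the actual TOC format: it supposes each issue is a `journal` element with `vol` and `issue` attributes holding the values used in the output path, and each article is a descendant `item` with an `id` attribute. The doubled braces are how XSLT attribute value templates produce literal `${...}` Ant property references.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <project name="dhq_articles">
      <target name="copyArticleResources">
        <!-- "{{" and "}}" are escaped braces, so the output reads ${toDir.path}. -->
        <copy todir="${{toDir.path}}">
          <fileset dir="${{basedir}}${{file.separator}}articles"/>
          <firstmatchmapper>
            <!-- ASSUMPTION: journal/@vol, journal/@issue, and item/@id are
                 hypothetical names standing in for the real TOC structure. -->
            <xsl:for-each select="//journal[@vol][@issue]//item[@id]">
              <regexpmapper from="^{@id}/(.*)$"
                  to="vol/{ancestor::journal[1]/@vol}/{ancestor::journal[1]/@issue}/{@id}/\1"
                  handledirsep="true"/>
            </xsl:for-each>
          </firstmatchmapper>
        </copy>
      </target>
    </project>
  </xsl:template>
</xsl:stylesheet>
```

If an article ID appeared under more than one issue in the input, this sketch would emit a mapper for each occurrence, and `firstmatchmapper` would use only the first one.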

Once the derived build file is available, the main build file can run the task to copy article resources into their static directories:

<ant antfile="..${file.separator}${toDir}${file.separator}article-mapper.xml" 
     target="copyArticleResources" inheritRefs="true"/>

amclark42 commented 11 months ago

The Ant build file in the static_site_generation branch has these main targets:

The default target is previewArticle. All targets rely on XSLT processing, so they require the XML resolver JAR to be on the classpath when Ant is called. If the JAR file is missing, the build file will stop and provide instructions for loading the JAR.
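
One way a build file can perform this kind of check is with Ant's `available` and `fail` tasks. The sketch below is illustrative only and is not the check used in the branch: the project, target, and property names are made up, and the class name assumes the Apache XML Commons resolver, which may not be the JAR that DHQ ships.

```xml
<project name="resolver_check_sketch" default="checkResolver">

  <target name="checkResolver">
    <!-- Sets resolver.available only if the class can be loaded.
         ASSUMPTION: the class name below belongs to the Apache XML Commons
         resolver; substitute the class provided by the JAR actually in use. -->
    <available classname="org.apache.xml.resolver.tools.CatalogResolver"
               property="resolver.available"/>
    <!-- Stop the build with instructions when the property was not set. -->
    <fail unless="resolver.available"
          message="XML resolver JAR not found. Run Ant as: ant -lib common/lib TARGET"/>
  </target>

</project>
```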

To run an Ant target, use this command: ant -lib common/lib TARGET. For example, this would generate a compressed, standalone preview of article 000600:

ant -lib common/lib zipPreviewArticle -Darticle.id=000600

Because this command provides the value of article.id ahead of time, zipPreviewArticle will not prompt for it.
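
This behavior matches how Ant's `input` task works: when the property named in `addproperty` already has a value, such as one passed with `-D` on the command line, the task skips the prompt. A minimal sketch, with a made-up project and target name rather than the real zipPreviewArticle definition:

```xml
<project name="prompt_sketch" default="showArticleId">

  <target name="showArticleId">
    <!-- Prompts only if article.id is not already set. -->
    <input message="Enter the article ID (for example, 000600):"
           addproperty="article.id"/>
    <echo message="Article ID: ${article.id}"/>
  </target>

</project>
```

Saved as a standalone build file, this could be run with or without `-Darticle.id=000600` to see both behaviors.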

amclark42 commented 11 months ago

Some guidance on testing static site generation:

  1. Optional: run the generateIssues Ant target. (Useful if you want to examine how this task relates to generateSite.)
    • Did any errors occur?
    • Find the dhq-static directory and poke around in it.
  2. Run the generateSite Ant target.
    • Did any errors occur?
    • Find the dhq-static directory.
    • Examine the files in dhq-static/dhq/.
    • Optional: Load the ZIP into a server so that its contents appear at /dhq. This is useful for seeing at a glance if the web assets are in a comparable state to those on DHQ, and for testing some links.
    • If you can’t do this, it’s okay. Some things to keep in mind:
      • You’ll need to spend a little extra time scrolling past the navigation bar to look at the content of the pages.
      • Most links won’t work as-is; it’ll be best to open files using your OS’s file navigation software rather than clicking around.
    • Check a few “static” pages, such as those in the about directory.
    • Compare the articles listed in some DHQ issue’s index with those listed on the DHQ site.
    • Find an article with images, and make sure they display in the HTML.

For Windows testing, be especially attentive to file structure and paths. Are dhq-static and its contents in a reasonable place? Are issue indexes and article HTML files in the right places? If you examine the HTML, do links have forward slashes (/) and not backslashes (\)? The Windows ZIP should definitely be tested by loading it into a server.
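
The backslash check could also be automated. The following is a sketch of one possible approach, not something present in the branch: it assumes the generated HTML lives in a `dhq-static` directory next to the build file, and the project, target, property, and reference names are all hypothetical. It fails the build if any HTML file has an `href` or `src` attribute value containing a backslash.

```xml
<project name="backslash_check_sketch" default="checkLinks">

  <!-- ASSUMPTION: location of the generated site relative to this file. -->
  <property name="static.dir" location="dhq-static"/>

  <target name="checkLinks">
    <!-- Select HTML files with a backslash inside an href or src value. -->
    <fileset id="bad.html" dir="${static.dir}" includes="**/*.html">
      <containsregexp expression="(href|src)=&quot;[^&quot;]*\\"/>
    </fileset>
    <!-- Fail and list the offending files if the selection is not empty. -->
    <fail message="Backslashes found in links: ${toString:bad.html}">
      <condition>
        <resourcecount refid="bad.html" when="greater" count="0"/>
      </condition>
    </fail>
  </target>

</project>
```

The pattern only covers double-quoted attribute values, so it would complement a manual look at the HTML rather than replace it.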

As you’re going through this, think about maintainability and quality-of-life. Could anything be made more transparent, or easier? Are there additional preventative measures that could be taken to head off errors or mistakes?