Open chrishtr opened 3 years ago
@cycomachead WDYT of this approach? It:
Markdown has excellent integration with github and github pages, and is easier to read and write than HTML.
Using Markdown will make it a lot easier to refactor the common loader and stylesheet code to be separate from content. I can also try a batch migration of all the content Markdown with commands like:
pandoc --from html --to markdown --fail-if-warnings course.md --metadata title="title" --metadata-file=config.json course/cs10_sp21.html
(note from is html and to is markdown)
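A batch migration along these lines could be scripted roughly as follows. This is only a sketch: it keeps the `--from html --to markdown` direction from the command above, but the function names, the mirrored output layout, and the use of pandoc's `--output` flag are assumptions for illustration, not something taken from the PR.

```python
import subprocess
from pathlib import Path


def pandoc_command(html_path: Path, md_path: Path) -> list[str]:
    """Build the pandoc argument list that converts one HTML page to Markdown."""
    return [
        "pandoc",
        "--from", "html",
        "--to", "markdown",
        "--output", str(md_path),
        str(html_path),
    ]


def plan_migration(src_dir: Path, out_dir: Path) -> list[list[str]]:
    """One command per .html file under src_dir, mirroring the directory layout."""
    commands = []
    for html_path in sorted(src_dir.rglob("*.html")):
        md_path = out_dir / html_path.relative_to(src_dir).with_suffix(".md")
        commands.append(pandoc_command(html_path, md_path))
    return commands


def run_migration(src_dir: Path, out_dir: Path) -> None:
    """Run every planned command, stopping at the first pandoc failure."""
    for cmd in plan_migration(src_dir, out_dir):
        # The output path is the argument that follows "--output".
        md_path = Path(cmd[cmd.index("--output") + 1])
        md_path.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)
```

Keeping the planning step separate from the run step makes it possible to print and review the commands before converting anything.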
Thanks Chris for working on this! This is pretty interesting, but I do have a couple of primarily usability concerns, and one maintainability one. But, honestly this is pretty close to something I think we could work with.
Usability:
- $ as a delimiter is a bit challenging. We use KaTeX in places, which uses $$, and as you've seen there's a fair number of terminal prompts. I'm pretty reticent to write $$() as jquery, but the solution there would be to just not use pandoc variables in JS files.
- Could we use {{ }} as a delimiter? I would be open to a pandoc filter that preprocesses {{, }}, and $ into the right format before the actual compiling.
Maintainability: I'm somewhat concerned about a makefile based approach. We kind of backed ourselves into a corner on the Snap! frontend, and are now undoing it. In the future if we wanted to do something like pre-compile the dropdown, or even compile each "course", it seems like we might be building a lot ourselves (though, that's up from zero today...just wondering aloud if maybe there's something off the shelf we might consider).
Otherwise, are there any "build systems" sitegens that exist using Pandoc? I'm not against pandoc as tool -- we might even start to rely on it for conversion of pages to individual word documents.
Maybe this is some useful context, and where I am thinking about some potential opportunities: Right now, most sites are deployed directly to GitHub pages (this repo) and to a purely static apache server (bjc-edc/bjc-r). Hopefully sooner than later, we'd add S3 into the mix as a deployment. We now have projects which should share the same structure, one with a submodule and one (bjc-edc) which was "un-submoduled" at some point. It would be great if any of this made it easier to share/version a little more easily. Not sure what's best. Recently, a bit more development has happened over on bjc-edc/bjc-r since that's been the current focus of a research project.
Two other enhancements that would be nice at some point:
/bjc-r requirement, though I think any sort of config file with a baseURL gets us there pretty much.
Using Markdown will make it a lot easier to refactor the common loader and stylesheet code to be separate from content. I can also try a batch migration of all the content Markdown with commands like:
pandoc --from html --to markdown --fail-if-warnings course.md --metadata title="title" --metadata-file=config.json course/cs10_sp21.html
I think I want to delay a wholesale conversion to markdown for a bit, but it's definitely on the list of things that could make writing easier. One thing we've been mulling over is whether reST might be better, which is at least an easy option with pandoc, though markdown certainly is still an option.
Thanks Chris for working on this! This is pretty interesting, but I do have a couple of primarily usability concerns, and one maintainability one. But, honestly this is pretty close to something I think we could work with.
Excellent, glad to hear it. Let's keep iterating!
Usability:
- Today, we don't need a build step, so you can run a server and see live changes. So, we might need some kind of a file watcher to compile changes? Though, maintaining our own seems like not quite the right approach.
In almost all cases you don't actually need to run the compile, as all it does is optimize the style sheets. However, I do think a compile step is required in order to provide the "developer ergonomics" and performance at the same time. IMO the compile step is not a big problem: authors should focus on the content most of the time, and occasionally run a compile (which is not very slow as I've implemented it). A file watcher could potentially be added.
- $ as a delimiter is a bit challenging. We use KaTeX in places, which uses $$, and as you've seen there's a fair number of terminal prompts. I'm pretty reticent to write $$() as jquery, but the solution there would be to just not use pandoc variables in JS files.
I agree that the $ part is a bit annoying.
- Could we use {{ }} as a delimiter? I would be open to a pandoc filter that preprocesses {{, }}, and $ into the right format before the actual compiling.
I could add in a sed script that replaces $ with $$. And lint check that no more $s are added beyond a whitelisted set of files. Another thing to do can be to just change to a different unix prompt that doesn't use $, and not use jquery. :)
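The escape-and-lint idea can be sketched in a few lines. This is a rough illustration rather than the script from the PR: it relies on pandoc templates treating `$$` as a literal `$`, and the function names and the in-memory `files` mapping are assumptions made for the example.

```python
def escape_dollars(text: str) -> str:
    """Double every "$" so the template engine emits it literally.

    Not idempotent: run it once on raw content, never on already escaped text.
    """
    return text.replace("$", "$$")


def lint_dollars(files: dict[str, str], allowed: set[str]) -> list[str]:
    """Return the names of files that contain "$" but are not on the allowed list.

    `files` maps a file name to its contents; `allowed` plays the role of the
    whitelisted set of files mentioned above.
    """
    return sorted(
        name for name, text in files.items()
        if "$" in text and name not in allowed
    )
```

A CI check could fail the build whenever `lint_dollars` returns a non-empty list, which keeps new `$` characters from creeping into templated files.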
- Or, it seems like mustache, jinja, liquid or another template tool might be a little easier.
I guess? pandoc seems directly suited to what this project wants though - a way to convert content in a one-off way for static deployment on a website. I have no experience with jinja or liquid, but they seem to be meant for dynamic web apps (at least liquid claims that).
Maintainability: I'm somewhat concerned about a makefile based approach. We kind of backed ourselves into a corner on the Snap! frontend, and are now undoing it. In the future if we wanted to do something like pre-compile the dropdown, or even compile each "course", it seems like we might be building a lot ourselves (though, that's up from zero today...just wondering aloud if maybe there's something off the shelf we might consider).
I'm not sure I see what downside you're worried about.
Otherwise, are there any "build systems" sitegens that exist using Pandoc? I'm not against pandoc as tool -- we might even start to rely on it for conversion of pages to individual word documents.
Probably? I only learned about pandoc while writing https://browser.engineering. It seems like a pretty simple and efficient tool for this kind of job. I am very happy with it for that purpose. In fact, it has easily supported embedded HTML widgets we added recently (for markdown, but still).
Maybe this is some useful context, and where I am thinking about some potential opportunities: Right now, most sites are deployed directly to GitHub pages (this repo) and to a purely static apache server (bjc-edc/bjc-r).
It does appear that the only way to do non-fully-static builds with gh-pages right now is via Jekyll. I have never used Jekyll though, do you think that's a good solution?
Hopefully sooner than later, we'd add S3 into the mix as a deployment.
Just curious: is this for faster load times?
We now have projects which should share the same structure, one with a submodule and one (bjc-edc) which was "un-submoduled" at some point. It would be great if any of this made it easier to share/version a little more easily. Not sure what's best. Recently, a bit more development has happened over on bjc-edc/bjc-r since that's been the current focus of a research project.
What kind of sharing do you need? mixing pages from multiple sources into one site? If that's the need, can you satisfy it by making multiple sites and linking them to each other?
Hi, WDYT? I'm open to suggestions.
In almost all cases you don't actually need to run the compile, as all it does is optimize the style sheets.
The current files aren't really usable without compiling; you really do need the CSS to read the page as users see it. The workflow for a lot of the folks working on curriculum is to write a bit and view it in the browser. So, ideally something would watch files and also serve www/. (Self-containing things in a subdirectory would be good. I typically run a process one directory up, which is mildly annoying.)
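A watcher of the kind described here does not need much machinery. The following is a minimal polling sketch using only the standard library; the function names and the `rebuild` callback are hypothetical, and a real setup might prefer an off-the-shelf tool instead.

```python
import time
from pathlib import Path


def snapshot(root: Path) -> dict[str, float]:
    """Map each file under root to its last-modified time."""
    return {str(p): p.stat().st_mtime for p in root.rglob("*") if p.is_file()}


def changed(before: dict[str, float], after: dict[str, float]) -> set[str]:
    """Paths that were added, removed, or modified between two snapshots."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }


def watch(root: Path, rebuild, interval: float = 1.0) -> None:
    """Poll root forever and call rebuild(paths) whenever something changes."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        diff = changed(previous, current)
        if diff:
            rebuild(diff)
        previous = current
```

The `rebuild` callback could simply shell out to the build command, and the output directory can be served in a second terminal with `python3 -m http.server --directory www`. The watched directory should be the source tree, not the build output, or every rebuild would trigger another one.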
Plus, in a way, I think if we can assume that everything is built, then we can drop the <head> and <body> boilerplate and focus on content. (And then markdownify things)
I could add in a sed script that replaces $ with $$. And lint check that no more $s are added beyond a whitelisted set of files. Another thing to do can be to just change to a different unix prompt that doesn't use $, and not use jquery. :)
jQuery may go one day, but c'mon that's separate. :) And while I'm a fan of Paul Irish's 'you might not need jQuery' and the like, it still has its place. :) As for the prompt, my prompt is personally a 👉, but $ is the standard on most lab machines and I guess one day % will be the better choice now that zsh is the default, but still finding the vast majority of students in bash-land.
I'm not sure I see what downside you're worried about.
My experience tends to be that Makefiles get more complex over time, and the number of people who are great with them is slim. There's nothing that stands out as terribly complex right now, though I fully admit it would have taken me much longer to write the same makefile (I assume). So, this may be more preference than anything.
Probably? I only learned about pandoc while writing https://browser.engineering. It seems like a pretty simple and efficient tool for this kind of job. I am very happy with it for that purpose. In fact, it has easily supported embedded HTML widgets we added recently (for markdown, but still).
Cool! That's a great example, and keep linking to the book. It's a good reminder to read through it. Do you have any pages where you'd consider there to be substantially different layouts? (e.g. we have student-facing and "teacher-facing" pages which might be worth having a more clearly distinct layout.)
The source makes some clever use of Lua plugins, which seem like a fairly reasonable path for customization. (I could imagine things in the future like specifying width and height on images, and building some more of the HTML that's still built by jQuery)
It does appear that the only way to do non-fully-static builds with gh-pages right now is via Jekyll. I have never used Jekyll though, do you think that's a good solution?
I think after a few days, I'm more keen to try pandoc. I like the looks of the repo for the book. Just for reference https://bjc.berkeley.edu ( https://github.com/beautyjoy/beautyjoy.github.io ) is a jekyll site -- and so far it works fine for that purpose.
The main thing is that we get an auto-build, but for bjc-r we could use a GitHub Action (or other CI) that builds the site on merge and pushes to a branch.
Just curious: is this for faster load times?
S3 is primarily to get us out of the business of running a web server as much as possible. (We've got things deployed right now on a "semi-managed" Apache instance.. and it's becoming a pain to do things like debug caching, gather logs, etc)
What kind of sharing do you need? mixing pages from multiple sources into one site? If that's the need, can you satisfy it by making multiple sites and linking them to each other?
Well we also have bjc-edc/bjc-r and at this point the content is pretty separate, and there's a couple of (separate) projects which may have forked that repository...but not necessarily with the intention of upstreaming changes. I guess that's partially why llab/ was intended to work as a submodule, though that flow has its complications, too.
In almost all cases you don't actually need to run the compile, as all it does is optimize the style sheets.
The current files aren't really usable without compiling; you really do need the CSS to read the page as users see it. The workflow for a lot of the folks working on curriculum is to write a bit and view it in the browser. So, ideally something would watch files and also serve www/. (Self-containing things in a subdirectory would be good. I typically run a process one directory up, which is mildly annoying.)
loader.js will load those style sheets, so all that's needed is to inline loader.js. I agree that this prototype is slightly hacky though (also the $$ comment below..)
jQuery may go one day, but c'mon that's separate. :) And while I'm a fan of Paul Irish's 'you might not need jQuery' and the like, it still has its place. :) As for the prompt, my prompt is personally a 👉, but $ is the standard on most lab machines and I guess one day % will be the better choice now that zsh is the default, but still finding the vast majority of students in bash-land.
(The jquery comment was just a joke :))
Probably? I only learned about pandoc while writing https://browser.engineering. It seems like a pretty simple and efficient tool for this kind of job. I am very happy with it for that purpose. In fact, it has easily supported embedded HTML widgets we added recently (for markdown, but still).
Cool! That's a great example, and keep linking to the book. It's a good reminder to read through it. Do you have any pages where you'd consider there to be substantially different layouts? (e.g. we have student-facing and "teacher-facing" pages which might be worth having a more clearly distinct layout.)
Different layouts for the same markdown page, you mean?
The source makes some clever use of Lua plugins, which seem like a fairly reasonable path for customization. (I could imagine things in the future like specifying width and height on images, and building some more of the HTML that's still built by jQuery)
Yep, agreed.
It does appear that the only way to do non-fully-static builds with gh-pages right now is via Jekyll. I have never used Jekyll though, do you think that's a good solution?
I think after a few days, I'm more keen to try pandoc.
Cool. How about something like converting content incrementally over to Markdown?
Plan of record, unless I hear otherwise: I think I'll make a PR that converts exactly one HTML file to markdown, plus a Makefile similar to what is in this PR, plus a similar deploy mechanism to the www/ directory.
This PR does the following:
Add a Makefile, with one target: "site". "make site" will copy all content from bjc to a www/ directory, and inject stylesheets and scripts into the head for any files that have the right template markers. Other files will be untouched.
Modifies one example to demonstrate the concept of injecting via a template (cs10_sp21.html).
Fixes a few instances where the $ special template character appeared in existing content.
When deploying to the live site, the site should be rooted at the www/ directory.
This PR is similar to https://github.com/beautyjoy/bjc-r/pull/800, except that it uses a templated way of injecting the scripts, plus a new deployment directory, rather than modifying in-place.
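The copy-then-inject step can be pictured with a short sketch. It is not the Makefile from this PR: the `$head$` marker name, the function names, and the choice to inject a single string of head markup are all assumptions for illustration. Only the overall shape (copy everything to www/, rewrite files that carry a template marker, leave the rest untouched) comes from the description above.

```python
import shutil
from pathlib import Path

# Hypothetical template marker; the marker actually used by the PR is not shown here.
HEAD_MARKER = "$head$"


def inject_head(html: str, head_html: str) -> str:
    """Replace the marker with the shared head markup; other text is untouched."""
    return html.replace(HEAD_MARKER, head_html)


def build_site(src: Path, dest: Path, head_html: str) -> None:
    """Copy src into dest, then inject head_html into every marked .html file."""
    shutil.copytree(src, dest, dirs_exist_ok=True)
    for page in dest.rglob("*.html"):
        text = page.read_text(encoding="utf-8")
        if HEAD_MARKER in text:
            page.write_text(inject_head(text, head_html), encoding="utf-8")
```

Because only the copies under the destination directory are rewritten, the source files stay exactly as authored, which is the main difference from modifying pages in place.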
If this PR were landed, subsequent steps would be:
<head> section.
Further future changes could include migrating content to markdown gradually and introducing more pandoc templates to render them into HTML.