manubot / rootstock

Clone me to create your Manubot manuscript
https://manubot.github.io/rootstock/

Support Alternate Themes #112

Open Miserlou opened 6 years ago

Miserlou commented 6 years ago

It is difficult to read a long manuscript with the current style settings.

It might be useful to build on the work of other projects which convert Markdown into the usual academic style:

- https://github.com/ickc/markdown-latex-css
- https://github.com/thomaspark/pubcss/ (demo: https://thomaspark.co/project/pubcss/demo/acm-sig-sample-web-theme.html)
- https://gist.github.com/killercup/5917178
- etc.

dhimmel commented 6 years ago

It is difficult to read a long manuscript with the current style settings.

What exactly do you dislike?

It might be useful to build on the work of other projects which convert Markdown into the usual academic style

Our current github-pandoc.css was based off of this gist.

Regarding the alternatives:

Anyways, the frontend stuff is a bit outside my expertise, so we'd need a contributor to take the lead if we wanted alternative viewers. Another longterm option would be to use a JATS XML viewer, such as Lens Viewer (dependent on JATS export in https://github.com/greenelab/manubot-rootstock/pull/82). Then the webpage could look like this.

Note that currently the way we generate the PDF is to use WeasyPrint to essentially print the webpage to PDF. Ideally, we want nice HTML and PDF output.

dhimmel commented 6 years ago

If you're looking for the most beefy Manubot manuscripts out there, the following are good options:

If a display style works for these manuscripts, that's a good sign, as they are long and have lots of formatting and complexity.

michaelmhoffman commented 5 years ago

I am trying to make a double-spaced PDF of a manuscript for editing without changing the display on the web for others. @vincerubinetti's suggestion in #169 of setting line-height in build/themes/default.html works. I have two issues, however:

  1. There's no easy way to create e.g. a build/themes/double-spaced.html that inherits any future changes from default.html. This would be possible if the CSS were split out into a CSS file that another CSS file could import. Please correct me if I'm wrong.
  2. I can't select the theme at the command-line from build.sh as it is hardcoded.

vincerubinetti commented 5 years ago

@michaelmhoffman In the future we'll have some easy way to do this, since line spacing is a pretty common variation.

For now though, you can try this:

Create a new theme file build/themes/double-spaced.html and put * { line-height: 2 !important; } in it. Then, inside build.sh, find where the default.html theme is inserted into the output HTML document. Duplicate that line to also include double-spaced.html in the output. That should give you all of the CSS from the default theme AND the double-spaced theme.
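
A minimal sketch of those steps, written as shell because the build is driven by build.sh. It works in a scratch directory with a stand-in default.html rather than the real theme. It also assumes the rule has to be wrapped in <style> tags, since the theme files are inserted verbatim into the output. The final `cat` only mimics both files ending up in one document; it is not the actual build.sh logic.

```shell
# Work in a scratch directory so nothing in a real repository is touched.
workdir="$(mktemp -d)"
mkdir -p "$workdir/build/themes"

# Stand-in for the real default theme (assumption: the real file is far longer).
cat > "$workdir/build/themes/default.html" <<'EOF'
<style>
  body { line-height: 1.5; }
</style>
EOF

# The extra theme: only the override, wrapped in <style> tags.
cat > "$workdir/build/themes/double-spaced.html" <<'EOF'
<style>
  * { line-height: 2 !important; }
</style>
EOF

# Inserting both files, default first, keeps every default rule
# and adds the override after it.
cat "$workdir/build/themes/default.html" \
    "$workdir/build/themes/double-spaced.html" > "$workdir/combined.html"
```

Because the override comes last and uses !important, it should take precedence over the default line-height without the default file being edited.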

michaelmhoffman commented 5 years ago

Would you support moving the CSS to build/themes/default.css? I feel like that would allow modular setting of CSS themes separate from the HTML.

vincerubinetti commented 5 years ago

I'm not sure what you mean. If you look at the theme files, they really are just CSS. The only reason we give them the .html extension is because those files also have to include the surrounding <style> </style> tags, because the files are just inserted verbatim into the output html.

In other words, it already is modular. If you add another theme, it's just like adding more CSS. You'll keep the old CSS rules/classes/etc, and add the new ones.

dhimmel commented 5 years ago

I am trying to make a double-spaced PDF of a manuscript for editing without changing the display on the web for others.

@michaelmhoffman is your ideal behavior:

  1. The HTML display in-browser stays unchanged, but prints to PDF using double spaced lines. This means every PDF output going forward will use double spacing.

  2. A single PDF is created from a single HTML output with double spacing. The goal would be to allow review of one specific version, while keeping future auto-generated PDFs as single spaced?

@vincerubinetti is there a line in default.html that sets the spacing just for the print (i.e. PDF) view? If so and @michaelmhoffman wants option 1, I'd recommend just editing default.html. This will only cause a conflict going forward if the same lines in default.html are edited.

vincerubinetti commented 5 years ago

@vincerubinetti is there a line in default.html that sets the spacing just for the print (i.e. PDF) view?

Yes:

https://github.com/manubot/rootstock/blob/9f16be0807be7299424b5b9d78b609346957d933/build/themes/default.html#L502

Unfortunately because of the nature of CSS and fonts, setting line-height to 2 for example really looks more like 1.5 line spacing for most fonts. So you have to just play with that number until it looks like the spacing you want. Double spacing will probably need around line-height: 2.5.
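
The linked line of default.html is not reproduced in this thread, so the following is only a sketch of the general idea: a rule inside a standard CSS `@media print` block is meant to apply to printed output and leave the on-screen view alone. The helper name `print_spacing_theme` is hypothetical, and the value is a parameter because, as noted above, it needs tuning by eye.

```shell
# Hypothetical helper: emit a theme fragment that changes line spacing only
# for print media, leaving the in-browser display untouched.
# $1: unitless line-height value (defaults to 2.5, the rough figure above).
print_spacing_theme() {
  height="${1:-2.5}"
  printf '<style>\n'
  printf '  @media print {\n'
  printf '    * { line-height: %s !important; }\n' "$height"
  printf '  }\n'
  printf '</style>\n'
}

# Print the fragment; redirect it into a theme file to try a value.
print_spacing_theme 2.5
```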

michaelmhoffman commented 5 years ago

If the theme were in a CSS file that we could easily @import from another CSS file, it would be easy to make a drop-in replacement for default.html that inherits from it. Specifying the theme for build.sh could be done quite easily by setting a THEME_PATH variable within build.sh and replacing any use of build/themes/default.html with "$THEME_PATH". We could then set THEME_PATH either from the environment or from a command-line option to build.sh. An easy change.
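
One way the THEME_PATH idea could look, as a sketch rather than the actual build.sh code. The function name `resolve_theme_path` and the `-t` option letter are assumptions made for illustration; only the THEME_PATH name and the default path come from the proposal above.

```shell
# Hypothetical helper: decide which theme file a build should use.
# Precedence: "-t PATH" option, then the THEME_PATH environment variable,
# then the currently hardcoded default.
resolve_theme_path() {
  theme="${THEME_PATH:-build/themes/default.html}"
  OPTIND=1  # reset so the function can be called more than once
  while getopts "t:" opt; do
    case "$opt" in
      t) theme="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
  printf '%s\n' "$theme"
}

# With no option and no environment variable, this prints the default path.
resolve_theme_path "$@"
```

A reviewer could then run something like `THEME_PATH=build/themes/double-spaced.html` in front of the build command without touching the committed script.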

The only way to inherit now is to add an additional --include-after-body= in two different places. But you don't want the --include-after-body= if you aren't using a user theme. This is fine (if slightly awkward and DRY-violating) for hard-coding the additional user theme in build.sh, but it makes the code quite a bit more complex if you want the user theme to be an option a user can set transiently. I would find that worthwhile as a collaborator on a document where my reviewing preferences (double-spaced) differ from what one would want for the final document (single-spaced).
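
The conditional part can be kept in one place with a small helper, sketched below. `theme_include_args` and the USER_THEME variable are hypothetical names; `--include-after-body=` is the option named above. The helper prints one option per line, so it assumes theme paths contain no newlines.

```shell
# Hypothetical helper: print one option per line. The default theme is always
# included; the user theme only when a non-empty path is given.
# $1: default theme path, $2: optional user theme path.
theme_include_args() {
  printf '%s\n' "--include-after-body=$1"
  if [ -n "${2:-}" ]; then
    printf '%s\n' "--include-after-body=$2"
  fi
}

# With USER_THEME unset, only the default theme option is printed.
theme_include_args build/themes/default.html "${USER_THEME:-}"
```

Both places that currently pass the default theme could call the same helper, which avoids repeating the "is a user theme set?" check.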

dhimmel commented 5 years ago

@michaelmhoffman would you have the time to open a PR that does this? It would help us evaluate to see the implementation. You could even create a stub PR with the build.sh changes and then @vincerubinetti and I can do any extra changes to CSS / HTML / documentation that would be necessary.

michaelmhoffman commented 5 years ago

Stub PR would be no problem.

Nebucatnetzer commented 4 years ago

I've got a general question about the default theme. I personally find the default look of the PDF very hard to read, I know that it recently got compacted. IMO that made the readability much worse. Shouldn't the PDF be easy to read and maybe even look good? Some things from here would help a lot with this problem: https://practicaltypography.com/typography-in-ten-minutes.html https://practicaltypography.com/summary-of-key-rules.html

Maybe even a serif font would be nice, though I know that some people don't like them.

vincerubinetti commented 4 years ago

This is good feedback, with good references. I think we'll want to have several themes in the (hopefully near) future that make variations on font style and level of compactness, as people have different preferences. Personally I wasn't a fan of the compactification, but I understand why other people wanted it, for papers that could be 50 pages instead of 90.

Perhaps we could have different "modes" that are kinda separate from themes that switch things like compactness.

dhimmel commented 4 years ago

I personally find the default look of the PDF very hard to read, I know that it recently got compacted. IMO that made the readability much worse.

Thanks for weighing in. I guess different users will prefer different PDF styles. We'll work on a larger more readable option for printing.

I also wanted to note paged.js, which may be useful for converting HTML manuscripts to PDF with things like page numbers. I saw PubPub is using it.

Nebucatnetzer commented 4 years ago

Oh their page looks really nice. Reminds me a bit of this: https://edwardtufte.github.io/tufte-css/

Nebucatnetzer commented 4 years ago

Would it be possible to split the CSS into multiple files and include them in the HTML file? Maybe something like this:

michaelmhoffman commented 4 years ago

+1 for modularizing the style stuff a bit.

vincerubinetti commented 4 years ago

I think we'll definitely split the CSS up into multiple files, but as to how we make that split will have to be chosen extremely carefully to avoid causing a lot of strife down the road with rules overlapping and overriding each other unexpectedly.

As far as moving them to their own file, I'm not sure exactly what you mean. If you mean hot-linking to some other location, say on Manubot servers or repos, we want to avoid doing that because we want exported papers to be completely self contained; we want them to have everything they need even if external resources go down (with the special exception of Hypothesis).

We can consider minifying the javascript or CSS in some way to make the output html file more manageable. I agree it's quite verbose and hard to read if you need to edit its source code. As far as actual file size goes though, it's just a text file, and I don't think we necessarily need to optimize for that.

Nebucatnetzer commented 4 years ago

I think we'll definitely split the CSS up into multiple files, but as to how we make that split will have to be chosen extremely carefully to avoid causing a lot of strife down the road with rules overlapping and overriding each other unexpectedly.

I agree with this but think that splitting the CSS might actually help with that. All the options which apply to all parts go to "general.css" or to "extension-name.css" everything else goes into its specific file.

As far as moving them to their own file, I'm not sure exactly what you mean. If you mean hot-linking to some other location, say on Manubot servers or repos, we want to avoid doing that because we want exported papers to be completely self contained; we want them to have everything they need even if external resources go down (with the special exception of Hypothesis).

I mean using them just like this:

<link rel="stylesheet" type="text/css" href="general.css">
<link rel="stylesheet" type="text/css" href="extension-name.css">
<link rel="stylesheet" type="text/css" href="pdf.css">

I absolutely agree that we should avoid external dependencies as much as possible.

We can consider minifying the javascript or CSS in some way to make the output html file more manageable. I agree it's quite verbose and hard to read if you need to edit its source code. As far as actual file size goes though, it's just a text file, and I don't think we necessarily need to optimize for that.

By large I meant long and therefore hard to read and understand. Splitting it into logical parts would already be enough to help with that.
Minifying would only make it more complex to manage, wouldn't it?

vincerubinetti commented 4 years ago

I agree with this but think that splitting the CSS might actually help with that. All the options which apply to all parts go to "general.css" or to "extension-name.css" everything else goes into its specific file.

Yes we could do that, and it would help with people being able to find the CSS they're looking for. The overlap I was referring to is CSS-related. Since they're all ending up in the same html file, CSS class and id names can conflict. .button class in lightbox-plugin.css file might overwrite certain properties in a .button class defined in global.css. Whether a style gets overwritten is based on CSS specificity, which in my experience is very unpredictable and unintuitive, especially for people who have never worked with CSS. I'll have to rewrite a lot of the CSS and consider the splits extremely carefully so that when a user edits something, it works just how they intuitively expect it to.

I mean using them just like this:

Okay, so you're talking about splitting them out into separate files and keeping them locally along with the outputted .html file. If I'm building my own website, I always split things out like this, and most people do when working with html.

I originally didn't want to do this because if the user wanted to move the manuscript somewhere (like put it on their personal site or something), they'd have to remember to copy all the extra dependencies (and some people may not even know what CSS and javascript are). Whereas if it was a single bundled .html file, that's all you would need. You would still need to copy over images, but I think people would have an easier time understanding that everything is bundled except images because they're big.

Another thing was, all the other output formats (pdf and docx) are single files, and it was nice to keep the output folder clean. If they were to see a list of files like manuscript.pdf, manuscript.docx, manuscript.html, default-theme.css, table-of-contents.js, etc, they might be confused as to what those extra things are that they've never seen before (we always have to consider first time users), and what they belong to.

However maybe we can make the output folder look like this instead:

- /html
    - manuscript.html
    - /images
    - /assets
        - theme.css
        - plugin.js
- /pdf
    - manuscript.pdf
- /docx
    - manuscript.docx

Nebucatnetzer commented 4 years ago

Yes we could do that, and it would help with people being able to find the CSS they're looking for. The overlap I was referring to is CSS-related. Since they're all ending up in the same html file, CSS class and id names can conflict. .button class in lightbox-plugin.css file might overwrite certain properties in a .button class defined in global.css. Whether a style gets overwritten is based on CSS specificity, which in my experience is very unpredictable and unintuitive, especially for people who have never worked with CSS. I'll have to rewrite a lot of the CSS and consider the splits extremely carefully so that when a user edits something, it works just how they intuitively expect it to.

Are there many things specified multiple times? I hadn't yet time to fully look through the theme. If there aren't many items which get specified multiple times then it shouldn't be that much of a problem. Otherwise we have to be careful yes. Besides overwriting things could happen in the single HTML file as well. Who can keep track of 1200 lines? At least when I code in Python I often forget old imports or variables, I reckon that could happen easily as well with CSS. Anyway, I admit that my experience with CSS and design is limited so I'm by no means an expert. More someone who tries to understand how things are working in manubot :).

I originally didn't want to do this because if the user wanted to move the manuscript somewhere (like put it on their personal site or something), they'd have to remember to copy all the extra dependencies (and some people may not even know what CSS and javascript are). Whereas if it was a single bundled .html file, that's all you would need. You would still need to copy over images, but I think people would have an easier time understanding that everything is bundled except images because they're big.

Maybe we could compile one large HTML file with everything in it? AFAIK it's even possible to include images in HTML files and this way we could have the raw files split for humans to get their head around, and for releases the script builds the website into a single file.

However maybe we can make the output folder look like this instead:

- /html
    - manuscript.html
    - /images
    - /assets
        - theme.css
        - plugin.js
- /pdf
    - manuscript.pdf
- /docx
    - manuscript.docx

Would look nice as well and is easy to understand for someone with basic knowledge of web development.

vincerubinetti commented 4 years ago

Are there many things specified multiple times? I hadn't yet time to fully look through the theme.

Right now no. This would be after our attempts to make separate themes and variations on those themes, like compact print. We could make our themes totally self-complete, meaning you're only supposed to use 1 at a time. This would lead to a lot of code duplication though, as a lot of the styles from a "contemporary" serif-type theme would probably be the exact same as the current default theme. Plugin .css also makes this more complicated.

It might be not as bad as I'm imagining, I'm just saying it has to be considered. But I'll work on it and we'll see how it goes.

EDIT* I want to clarify, if I (or just one person) was the only one ever touching any CSS, it wouldn't be a problem to make sure there's no conflicts. The main issue is making it such that if a user ever writes their own CSS rules/styles, they behave as expected, as much as possible.

At least when I code in Python I often forget old imports or variables, I reckon that could happen easily as well with CSS.

Unfortunately the land of HTML and CSS is much worse than typical programming, especially with regard to the scoping of things. As the designated person to suffer through this, I'll handle it as I'm building it. But I suppose it's something regular users and other developers should at least be aware of.

AFAIK it's even possible to include images in HTML files and this way we could have the raw files split for humans to get their head around, and for releases the script builds the website into a single file.

Yeah, for those who don't know, you can embed essentially anything in an HTML document by converting it to a data URI, a long string of base64 encoded data. This would definitely make the resulting HTML file harder to read, though. But it would be 100% self-contained. Also if we strongly encourage people to make their figures as .svg, which they should be doing, the verbosity of each embedded image won't be that bad at all.
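
For anyone curious what that conversion involves, here is a small sketch using the standard `base64` command. The helper name `data_uri` is hypothetical and this is not something the build currently does. The resulting string could be used as the src of an img tag in place of a file path.

```shell
# Hypothetical helper: print a data URI for a file.
# $1: MIME type (e.g. image/svg+xml), $2: path to the file.
data_uri() {
  # tr strips the line wrapping that some base64 implementations add.
  printf 'data:%s;base64,%s\n' "$1" "$(base64 < "$2" | tr -d '\n')"
}

# Demonstration on a tiny throwaway SVG.
demo_svg="$(mktemp)"
printf '<svg xmlns="http://www.w3.org/2000/svg"/>' > "$demo_svg"
data_uri image/svg+xml "$demo_svg"
```

Base64 output is roughly a third larger than the input, which is part of why large embedded raster images would bloat the HTML file.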

Depends if we want to have a nice readable output html, or a full self-contained html. I would recommend we choose one or the other, but do it fully that way, not half-way of each. @dhimmel thoughts?

Nebucatnetzer commented 4 years ago

It just came to my mind that it would be nice if manubot had a namespace in the CSS. To test the background colours in the table I used IDs to identify the cells and later realized that I probably should be using classes for that. However it would have collided with the colours from Manubot because there were already some classes called green, yellow, etc. I reckon it would help in the long run if they would be called manubot-green, etc. This would probably help with the theming as well, at least a bit.