HTTPArchive / almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community
https://almanac.httparchive.org
Apache License 2.0

Streamline figures in markdown #1138

Closed rviscomi closed 3 years ago

rviscomi commented 4 years ago

I'd like to propose a change in the way figures are described by authors in the markdown and rendered internally.

[Image: Figure 12. Adoption of flexbox.]

In css.md, the corresponding markdown (using embedded HTML) for the figure above is:

<figure>
  <a href="/static/images/2019/css/fig12.png">
    <img src="/static/images/2019/css/fig12.png" alt="Figure 12. Adoption of flexbox." aria-labelledby="fig12-caption" aria-describedby="fig12-description" width="600" height="371" data-width="600" data-height="371" data-seamless data-frameborder="0" data-scrolling="no" data-iframe="https://docs.google.com/spreadsheets/d/e/2PACX-1vQO5CabwLwQ5Lj1_9bbEFnFM1qEqCorymaBHrcaNiMSJ7sYDKHUI5iish5VAS-SxN447UTW-1-5-OjE/pubchart?oid=2021161093&amp;format=interactive">
  </a>
  <div id="fig12-description" class="visually-hidden">Bar chart showing 49% of desktop pages and 52% of mobile pages using flexbox.</div>
  <figcaption id="fig12-caption">Figure 12. Adoption of flexbox.</figcaption>
</figure>

One thing I'd like to improve is the figure numbering. The figure ID is automatically generated, but we're still manually writing Figure 12 in the <figcaption> and managing the IDs needed for the accessible description. Can all of this be automated? I'd also like to make all figure numbers unique and prefixed by chapter number. So rather than Figure 12 this would be Figure 2.12.

The fig12.png image shouldn't be named according to its figure number. These names aren't descriptive and must be kept in sync with the figure ordering in the chapter. Instead, we should name figure images descriptively, like flexbox-adoption.png. We should use this name as the <figure id> and permalink so that reordering figures wouldn't make old links obsolete.

I would also love to see all figures accompanied by links to their metadata like the corresponding SQL file and results sheet. This would enable anyone to see how the metrics were calculated and run it themselves. This requires some design choices like how to provide these links unobtrusively and accessibly. My idea is a menu in the corner of the figure, similar to the HTTP Archive metrics:

[Image: menu in the corner of an HTTP Archive metrics chart]

Rather than generating figure numbers in the build process, I wonder if this could be done entirely in the templates. For example, could authors include a Jinja macro like this:

{%
  figure(
    # Figure ID corresponding to the `png` image and `sql` file name.
    'flexbox-adoption',
    # Figure caption.
    'Adoption of flexbox.',
    # Detailed figure description.
    'Bar chart showing 49% of desktop pages and 52% of mobile pages using flexbox.',
    # Embedded Sheets URL. (maybe we can add the base of this URL to the chapter yaml and only include the relevant IDs?)
    'https://docs.google.com/spreadsheets/d/e/2PACX-1vQO5CabwLwQ5Lj1_9bbEFnFM1qEqCorymaBHrcaNiMSJ7sYDKHUI5iish5VAS-SxN447UTW-1-5-OjE/pubchart?oid=2021161093&amp;format=interactive',
    # Tab ID in Sheets for this metric's results. For example `/edit#gid=1861654265`.
    '1861654265',
    # Optional width and height if non-standard (600x371).
    600, 371
  )
%}
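To illustrate what such a macro could expand to, here is a minimal Python sketch that derives the image path and the ARIA IDs from the descriptive figure ID. The function name `render_figure`, the `image_dir` default, and the `-caption` / `-description` ID suffixes are assumptions for illustration, not existing project code, and the `data-*` iframe attributes are left out for brevity.

```python
from html import escape


def render_figure(figure_id, number, caption, description,
                  image_dir="/static/images/2019/css", width=600, height=371):
    """Build chart-figure markup with every ID derived from figure_id."""
    # Hypothetical convention: the image file is named after the figure ID.
    src = escape(f"{image_dir}/{figure_id}.png")
    label = escape(f"Figure {number}. {caption}")
    fid = escape(figure_id)
    return "\n".join([
        f'<figure id="{fid}">',
        f'  <a href="{src}">',
        f'    <img src="{src}" alt="{label}" aria-labelledby="{fid}-caption" '
        f'aria-describedby="{fid}-description" width="{width}" height="{height}">',
        "  </a>",
        f'  <div id="{fid}-description" class="visually-hidden">{escape(description)}</div>',
        f'  <figcaption id="{fid}-caption">{label}</figcaption>',
        "</figure>",
    ])


print(render_figure(
    "flexbox-adoption", "2.12", "Adoption of flexbox.",
    "Bar chart showing 49% of desktop pages and 52% of mobile pages using flexbox.",
))
```

The author only supplies the ID, caption and description; the number and all of the ID plumbing come from the helper.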

There are other kinds of figures, like big numbers:

[Image: big number figure showing 2%]

<figure>
  <div class="big-number">2%</div>
  <figcaption>Figure 13. Percent of websites using grid.</figcaption>
</figure>

Even though the markup for this figure is much simpler, I would love to see another macro to standardize the boilerplate across all chapters so that the developers have more centralized control over how these are generated. We could have a macro like this:

{%
  figure_big_number(
    # Figure ID corresponding to the `sql` file name.
    'grid-adoption',
    # The big number.
    '2%',
    # Figure caption.
    'Percent of websites using grid.',
    # Tab ID in Sheets for this metric's results.
    '1459448594'
  )
%}

We should build some customizability into these macros. For example, we needed to adjust the appearance of the z-index "really big number" figure. Perhaps there could be an optional parameter for a classname to be added to the figure. This could be used to adjust the font size as in the z-index example, create "alternate" themes for the big numbers so that they're not all the same color, etc.
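As a rough sketch of the optional classname idea, the big number rendering could accept an extra class and append it to the standard `big-number` class. The function `render_big_number` and the `extra_class` parameter are hypothetical names chosen for this example, and where exactly the extra class lands is an assumption.

```python
from html import escape


def render_big_number(figure_id, number, value, caption, extra_class=None):
    """Build big-number figure markup with an optional extra CSS class."""
    classes = "big-number"
    if extra_class:
        # Hypothetical hook for one-off styling, e.g. a smaller font size.
        classes += f" {extra_class}"
    return "\n".join([
        f'<figure id="{escape(figure_id)}">',
        f'  <div class="{escape(classes)}">{escape(value)}</div>',
        f"  <figcaption>{escape(f'Figure {number}. {caption}')}</figcaption>",
        "</figure>",
    ])


print(render_big_number("grid-adoption", "2.13", "2%",
                        "Percent of websites using grid."))
```

Chapters that need no customization omit the last argument and get the standard markup.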


We also need to support other types of figures: tables, images, and videos. The 2019 Mobile Web chapter uses all three in the first three figures.

<figure>
<table>
  <tr>
    <th>Connection type</th>
    <td><a href="https://www.gsma.com/r/mobileeconomy/">2G or 3G</a></td>
  </tr>
  <tr>
    <th>Latency</th>
    <td>300 - 400ms</td>
  </tr>
  <tr>
    <th>Bandwidth</th>
    <td>0.4 - 1.6Mbps</td>
  </tr>
  <tr>
    <th>Phone</th>
    <td><a href="https://www.gsmarena.com/samsung_galaxy_s6-6849.php">Galaxy S6</a> — <a href="https://www.notebookcheck.net/A11-Bionic-vs-7420-Octa_9250_6662.247596.0.html">4x slower</a> than iPhone 8 (Octane V2 score)</td>
  </tr>
</table>
<figcaption>Figure 1. Technical profile of a typical mobile user.</figcaption>
</figure>

I think it's ok to include the <figure> HTML although it may be nice to use markdown tables when possible. It may be harder to get the <th> elements right for both horizontal and vertical using only markdown. But we should have a solution to abstract away the figure numbering and IDing. For example:

<figure id="technical-profile">
<table>
  <tr>
    <th>Connection type</th>
    <td><a href="https://www.gsma.com/r/mobileeconomy/">2G or 3G</a></td>
  </tr>
  <tr>
    <th>Latency</th>
    <td>300 - 400ms</td>
  </tr>
  <tr>
    <th>Bandwidth</th>
    <td>0.4 - 1.6Mbps</td>
  </tr>
  <tr>
    <th>Phone</th>
    <td><a href="https://www.gsmarena.com/samsung_galaxy_s6-6849.php">Galaxy S6</a> — <a href="https://www.notebookcheck.net/A11-Bionic-vs-7420-Octa_9250_6662.247596.0.html">4x slower</a> than iPhone 8 (Octane V2 score)</td>
  </tr>
</table>
<figcaption>{{ figure_number() }} Technical profile of a typical mobile user.</figcaption>
</figure>

I've added id="technical-profile" to the <figure> and {{ figure_number() }} to the <figcaption>. The figure_number Jinja macro could generate a monotonically increasing figure number using shared state with the other figure macros. I'd love to hear any ideas to remove even more of the figure/figcaption boilerplate.
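The shared state behind such a macro could be as small as a per-chapter counter. The Python sketch below is only an illustration: the class name `FigureCounter` is made up, and unlike the `figure_number()` call above it takes the figure ID as an argument so that the name-to-number mapping is kept for later use.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FigureCounter:
    """Hands out chapter-prefixed figure numbers in document order."""
    chapter: int
    count: int = 0
    numbers: Dict[str, str] = field(default_factory=dict)

    def figure_number(self, figure_id: str) -> str:
        # Reusing an ID would create two figures with the same anchor.
        if figure_id in self.numbers:
            raise ValueError(f"duplicate figure id: {figure_id}")
        self.count += 1
        label = f"{self.chapter}.{self.count}"
        self.numbers[figure_id] = label
        return f"Figure {label}."


counter = FigureCounter(chapter=2)
print(counter.figure_number("technical-profile"))
print(counter.figure_number("flexbox-adoption"))
```

Every figure macro in a chapter would call the same counter, so tables, charts and big numbers share one sequence.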


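Summary

- Figure markup is a dark art that should be abstracted away so that authors can focus on the content.
- Numeric figure IDs are brittle and should be replaced with descriptive, human readable IDs.
- Figure numbers should be entirely automated.
- We should include more meta resources with figures to connect the results back to the source.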

cc @HTTPArchive/developers

logicalphase commented 4 years ago

I like it, @rviscomi as described, agree that additional meta would provide better UX. I was working on a similar scheme fairly recently. I'll go back and see if there's anything that might be useful in meeting your objectives.

tunetheweb commented 4 years ago

Thanks for raising and definitely some improvements we could do here before we generate 2020 figures.

A few comments:

The figure ID is automatically generated, but we're still manually writing Figure 12 in the <figcaption> and managing the IDs needed for the accessible description. Can all of this be automated?

This can certainly be automated to bring consistency.

I'd also like to make all figure numbers unique and prefixed by chapter number. So rather than Figure 12 this would be Figure 2.12.

Yes this probably makes sense. Will think on it some more.

The fig12.png image shouldn't be named according to its figure number. These names aren't descriptive and must be kept in sync with the figure ordering in the chapter. Instead, we should name figure images descriptively, like flexbox-adoption.png.

100% agree!

We should use this name as the <figure id> and permalink so that reordering figures wouldn't make old links obsolete.

Less in agreement with this. I do want to avoid making old links obsolete (or worse, pointing to the wrong data), but I'm not convinced that moving away from numeric IDs is the answer, especially as the caption will likely still include the id, since I think the text should be able to reference the figure number (e.g. "we can see in figure XXX that...").

One thing to be aware of is that we have in the past temporarily removed figures, which changed the automatically generated figure numbers, so we need to be cautious: any automatic numbering is still likely to cause an issue here. And I think that could easily be missed these days since we automate the HTML generation, so it's not part of the pull request anymore.

Another thing to be aware of is that the 2019 Third Parties chapter doesn't have a figure 6.

So I actually think the id should be set when the figure is inserted by author or analyst and shouldn't be automatically generated. Granted this means some re-ordering if new figures are added (e.g. large figures added during copy editing to break up long pieces of text) or removed, but that should be rare after publication.

I would also love to see all figures accompanied by links to their metadata like the corresponding SQL file and results sheet. This would enable anyone to see how the metrics were calculated and run it themselves. This requires some design choices like how to provide these links unobtrusively and accessibly. My idea is a menu in the corner of the figure, similar to the HTTP Archive metrics:

I like this! It would require using ids for the SQL filename (which I don't like for same reason as we don't want to use them for image alt attributes), having a data-sql attribute (a little extra effort), or using same name for SQL and image. Think either of last two options are do-able and maybe the data-sql attribute gives us most flexibility and allows linking the same SQL to two figures which will probably be needed (e.g. 2019 CDN chapter gives charts and tables of same data).

Rather than generating figure numbers in the build process, I wonder if this could be done entirely in the templates. For example, could authors include a Jinja macro like this:

Yes this is possible, and probably more robust than using jsdom during chapter generation. I would however give the figure id explicitly for reasons discussed above (also makes the numbering easier - especially if using different macros for big figures or none at all for tables for example).

I think it's ok to include the <figure> HTML although it may be nice to use markdown tables when possible. It may be harder to get the <th> elements right for both horizontal and vertical using only markdown. But we should have a solution to abstract away the figure numbering and IDing.

Yes, markdown tables are very limited, so only use them for simple tables. Given my preference not to abstract the numbers, I'd suggest very simple macros in the language templates to allow for ease of translation:

<figure id="{{ figure_id(1) }}" >
...
<figcaption>{{ self.caption(1, "Technical profile of a typical mobile user.") }}</figcaption>

The {{ figure_id(1) }} macro may not even be needed, but it ensures consistency between fig-1 and fig1, which we've failed to do in the past.
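For illustration only, the logic behind those two macros could look like the Python sketch below. Picking `fig-1` over `fig1` as the canonical form is an assumption, as are the `figure_ids` helper name and the `-caption` / `-description` suffixes; the point is simply that every related ID comes from one place.

```python
def figure_ids(number: int) -> dict:
    """Derive every ID a numbered figure needs from one base ID."""
    base = f"fig-{number}"  # Assumed canonical form; "fig1" would work equally well.
    return {
        "figure": base,
        "caption": f"{base}-caption",
        "description": f"{base}-description",
    }


def caption(number: int, text: str) -> str:
    """Format the visible caption with an explicitly supplied number."""
    return f"Figure {number}. {text}"


print(figure_ids(1)["figure"])
print(caption(1, "Technical profile of a typical mobile user."))
```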

Or, if we do want to move away from numeric ids, then we need to do this to allow the link, but I can see that's already repetitive:

<figure id="technical-profile" >
...
<figcaption>{{ self.caption(1, "technical-profile", "Technical profile of a typical mobile user.") }}</figcaption>

Summary

Figure markup is a dark art that should be abstracted away so that authors can focus on the content.

Agreed!

Numeric figure IDs are brittle and should be replaced with descriptive, human readable IDs.

Disagree. They will still be brittle as long as we use them in descriptions and text (which I think we should!). Human readable IDs will require more effort (we could reuse image filenames, but not all figures have images) and will need to be repeated to get links working.

Figure numbers should be entirely automated.

Disagree.

We should include more meta resources with figures to connect the results back to the source.

Agreed!

rviscomi commented 4 years ago

To clarify, I'm not suggesting any changes to the way the figures appear other than prefixing with the chapter number, eg the figcaption will still say Figure 2.12. When I say "figure ID" I'm referring to the human readable name which we can reuse as the figure's anchor ID, SQL file name, and ARIA attribute plumbing.

tunetheweb commented 4 years ago

Yeah I get that. And do see benefit of a non-changing anchor id to avoid link rot, but see the following problems:

tunetheweb commented 4 years ago

Oh and “human readable name” isn’t great for translations. That's already a problem for image names admittedly, but they are often in English and aren’t really exposed to readers anyway, unlike figure ids, which would appear in deep link URLs.

rviscomi commented 4 years ago

One thing I may have communicated poorly is that I'm proposing figures' sequential number to be completely taken out of the markdown. Authors shouldn't have to count figures to determine what their number will be and hardcode that number in the document. What doesn't change is the semantic meaning of the figure, eg flexbox-adoption, so this is the best way to make the figures directly addressable.

The problems you described could be solved with templating*. For example, the template could map figure names (flexbox-adoption) to figure numbers (2.12) determined dynamically based on the order of the figures at runtime. If an author wants to direct readers to figure 2.12, they could invoke a template function like {{ figure_ref('flexbox-adoption') }}, which would render a link like this <a href="#fig-flexbox-adoption">Figure 2.12</a>. (Note I added the prefix fig- here to address your point about heading ID conflicts).

* Maybe not entirely with templating. If a figure is referenced before it's defined, I'm not sure that the template would already have its name mapped to the number. Need to think on this some more, but the compilation step may still be needed to scan through all figures first.
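One way to handle the forward-reference case is a two-pass scan, sketched here in Python. It is a rough illustration, not the build's actual code: it assumes the figure ID is the first quoted argument directly after `figure(` or `figure_big_number(`, the function names `number_figures` and `resolve_refs` are hypothetical, and a real implementation would need a sturdier parser than these regular expressions.

```python
import re

# Assumed call shapes: figure('id', ...) and figure_big_number('id', ...).
FIGURE_DEF = re.compile(r"\bfigure(?:_big_number)?\(\s*'([^']+)'")
FIGURE_REF = re.compile(r"\{\{\s*figure_ref\('([^']+)'\)\s*\}\}")


def number_figures(markdown: str, chapter: int) -> dict:
    """First pass: map each figure name to its number in document order."""
    numbers = {}
    for name in FIGURE_DEF.findall(markdown):
        if name not in numbers:
            numbers[name] = f"{chapter}.{len(numbers) + 1}"
    return numbers


def resolve_refs(markdown: str, chapter: int) -> str:
    """Second pass: replace each figure_ref call with a numbered link."""
    numbers = number_figures(markdown, chapter)

    def link(match):
        name = match.group(1)
        # An unknown name raises KeyError, which surfaces broken references.
        return f'<a href="#fig-{name}">Figure {numbers[name]}</a>'

    return FIGURE_REF.sub(link, markdown)


doc = (
    "As {{ figure_ref('grid-adoption') }} shows, grid is rare.\n"
    "{% figure('flexbox-adoption', 'Adoption of flexbox.') %}\n"
    "{% figure_big_number('grid-adoption', '2%') %}\n"
)
print(resolve_refs(doc, chapter=2))
```

Because numbering happens before any reference is resolved, a reference that appears earlier in the text than its figure still gets the right number.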

Reusing the same SQL for multiple figures is a real concern, albeit not the common case. I think the templates should assume the common case where the figure name is the same as the SQL file name. If two figures need to map back to the same SQL file, we can support optional kwargs in the template macro. For example, if the flexbox and grid figures were queried in the same file:

{%
  figure(
    # Figure ID corresponding to the `png` file name.
    'flexbox-adoption',
    # Figure caption.
    'Adoption of flexbox.',
    # ...
    # Optional: ID corresponding to the `sql` file name.
    sql_file='flexbox-grid'
  )
%}
# [...]
{%
  figure(
    # Figure ID corresponding to the `png` file name.
    'grid-adoption',
    # Figure caption.
    'Adoption of grid.',
    # ...
    # Optional: ID corresponding to the `sql` file name.
    sql_file='flexbox-grid'
  )
%}

The same could be done for translations, although I don't think this is necessary. For example, we don't translate the page names in the URL, like the chapter title, contributors, or methodology, so untranslated anchor links are not a new concern. If needed, the macro could support a translation-specific localized_id kwarg:

{%
  figure(
    # Figure ID corresponding to the `png` and `sql` file names.
    'flexbox-adoption',
    # Figure caption.
    'Adopción de flexbox.',
    # ...
    # Optional: Localized ID that overrides the figure ID in anchor names.
    localized_id='flexbox-adopción'
  )
%}

(I assume macros support kwargs, but worst case if not we'd use different macros for each scenario)
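To make the defaulting rules concrete, here is a small Python sketch of how the two optional kwargs could fall back to the figure ID. The function name `figure_metadata`, the `image_dir` default and the returned keys are assumptions for illustration; only the `sql_file` and `localized_id` names come from the examples above.

```python
def figure_metadata(figure_id, image_dir="/static/images/2019/css",
                    sql_file=None, localized_id=None):
    """Resolve a figure's anchor, image path and SQL file name."""
    return {
        # The anchor uses the localized ID when a translation supplies one.
        "anchor": f"fig-{localized_id or figure_id}",
        # The image is always named after the untranslated figure ID.
        "image": f"{image_dir}/{figure_id}.png",
        # Common case: the SQL file shares the figure's name.
        "sql": f"{sql_file or figure_id}.sql",
    }


print(figure_metadata("flexbox-adoption"))
print(figure_metadata("grid-adoption", sql_file="flexbox-grid"))
print(figure_metadata("flexbox-adoption", localized_id="flexbox-adopción"))
```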

@bazzadp WDYT?

tunetheweb commented 4 years ago

Yes if we could solve the way of referencing the figures then I agree the numbering is less important and then there is real merit in getting rid of them completely.

Not sure how to do this technically though but will have a play. May have to be a combination of jsdom code at generation and Jinja templating.

Also like your idea of prefixing their id with fig- to avoid name clashes.

Localisation could be solved with the id being set in the markdown as you say.

@bazzadp WDYT

You’re starting to win me round with all these counter arguments! Now just need to think about how to do it! 😀

ibnesayeed commented 3 years ago

Moving discussion from https://github.com/HTTPArchive/almanac.httparchive.org/pull/1589#issuecomment-735402716.

I would suggest we make the Show description of Figure N.M less prominent. Instead of giving this repetitive phrase a line after each figure, we could just add an icon next to the figure caption for the curious ones. However, if the description is something we would like to show everyone (not just for the sake of accessibility), then it can be expanded by default on large screens.

tunetheweb commented 3 years ago

I would suggest we make the Show description of Figure N.M less prominent. Instead of giving this repetitive phrase a line after each figure, we could just add an icon next to the figure caption for the curious ones. However, if the description is something we would like to show everyone (not just for the sake of accessibility), then it can be expanded by default on large screens.

My vote would be to keep the button as is. The point of this is to make it available for visually impaired readers (not all of whom will be using screen readers with access to aria-describedby which links to this text). In fact in #854 we were asked to make this description more available to everyone which is what led to the button as it appears now. I worry that hiding it under an icon, or under the menu we hope to implement as part of this makes it less accessible.

Happy to hear others' thoughts on this.

rviscomi commented 3 years ago

I'd like to explore ways to keep the "Show description" functionality accessible and discoverable while tidying up the UI. I do wonder if it would be a better fit in the figure menu options with the other metadata. I've reread #854 and my understanding is that it was important to make long-form descriptions of figures available to everyone, but the prominence of the show/hide functionality wasn't as critical. I think as long as we make the show/hide functionality available to everyone, the original intent of the feature is preserved.

📟 paging @juliemoynat in case they have opinions on this.

juliemoynat commented 3 years ago

Hi, the ticket is closed but I just would like to add a link about why you should not rely on icon-only buttons: https://axesslab.com/icons-ruining-interfaces/

Icons without label are not clear for everyone. Moreover, it can make it more difficult for people navigating with voice control to tell which button they want to click.

This is just for your information because nothing has finally changed in the interface so, thank you ;-)

rviscomi commented 3 years ago

Thanks @juliemoynat that's a good point. I wonder to what extent we can make assumptions about our users' familiarity with this kind of UI [anti]pattern, for example can we assume that they know that a three-dot icon is a menu? Or if there's a chance that anyone would be unfamiliar, does that make the question moot and necessitate a more descriptive label?

I suppose we could add a "More" or "Options" label to the button, which may also require an adjustment to the layout.

juliemoynat commented 3 years ago

@rviscomi It depends :-)

For example, for the Web Almanac website, I think we can reasonably assume that users know what the three-bar icon in the top-right corner of the mobile view is. For other websites, I personally used to add the word "Menu" beside the icon.

For a three-dot icon in the Web Almanac website, I think it can depend on where it is and on the graphic design. This kind of icon is often not perceivable because it is a really thin icon. I would prefer to have a visible label because it's more accessible anyway (more understandable, more accessible with voice control, more visible). But this is not an obligation as long as you put at least a visually hidden label inside the button (like this : <button type="button"><span class="sr-only">Options for [whatever it does]</span></button>).

tunetheweb commented 3 years ago

But this is not an obligation as long as you put at least a visually hidden label inside the button (like this : <button type="button"><span class="sr-only">Options for [whatever it does]</span></button>).

We have that already!

<div class="figure-dropdown nav-dropdown">
    <button class="nav-dropdown-btn" aria-expanded="false" title="Figure options…">
      <span class="visually-hidden">Explore the results</span>
      <svg aria-hidden="true" width="1em" height="1em" viewBox="0 0 16 16" fill="currentColor" xmlns="http://www.w3.org/2000/svg">
        <path fill-rule="evenodd" d="M9.5 13a1.5 1.5 0 1 1-3 0 1.5 1.5 0 0 1 3 0zm0-5a1.5 1.5 0 1 1-3 0 1.5 1.5 0 0 1 3 0zm0-5a1.5 1.5 0 1 1-3 0 1.5 1.5 0 0 1 3 0z"></path>
      </svg>
    </button>

Though just spotted that cheeky title which is hardcoded and not language dependent so changed that to same as the "Explore the results" text.

I still think the Show Description doesn't belong in this menu and so have kept it outside for now where it has a more prominent, text-labelled button.

juliemoynat commented 3 years ago

@bazzadp Ah! I hadn't seen that icon-button. I see it now. I fully agree with you: the "Show description" element is better placed below the image, with a clearly visible labelled button. This is easier to find (you don't have to guess that there is a long description hidden behind a small icon-button). And it is more in the spirit of WCAG (see the G74 technique).