`internalSpeechTitles` for SVG output doesn't seem to work

nedredmond commented 7 months ago

Issue Summary

internalSpeechTitles, which defaults to true, is an SVG output option that promises SRE-generated speech in the title attribute. See the docs.

However, no speech appears in the DOM.

It appears to be invoked here. The handler then makes the following check:

    const speech = (attributes.get('aria-label') || attributes.get('data-semantic-speech')) as string;
    if (speech) {
        ...

If I'm understanding that right, it will only add speech if there is already an aria-label attribute present? There isn't one by default, so how do I add it to get this benefit?

At present, everything is aria-hidden, which isn't ideal. For our use case, we need to extract the svg from its HTML wrappers to nest it in another svg, so the hidden MathML is useless to us.

Steps to Reproduce:

Look at the TeX-to-SVG demo. The setting is on by default, yet there is no generated speech on any DOM element.

Technical details:

MathJax Version: 3.2
Client OS: Mac OS X 13.6.1
Browser: (e.g., Chrome 121.0.6167.184)

We are creating a custom component, but it doesn't appear in the default configuration either per the demo, which should have internalSpeechTitles set to true.

Supporting information:

Linked in the description above. Per the code, there should be generated speech, but only if the element already has an aria-label-- see the default output in the demo page. How can we make this automatic speech work as described?

dpvc commented 7 months ago

internalSpeechTitles, which defaults to true, is an SVG output option that promises SRE-generated speech in the title attribute.

If you look at the description of this option, you will see that it says

This tells the SVG output jax whether to put speech text into <title> elements within the SVG (when set to 'true'), or to use an aria-label attribute instead. Neither of these control whether speech strings are generated (that is handled by the Semantic-Enrich Extension Options settings); this setting only tells what to do with a speech string when it has been generated or included as an attribute on the root MathML element.

I have bolded the key statement, which is that this option does not control whether speech strings are generated, only what to do with them if they are. The generation of speech strings is controlled by the Speech-Rule-Engine (SRE) options.

If I'm understanding that right, it will only add speech if there is already an aria-label attribute present?

No, it will add speech if either an aria-label is already present or data-semantic-speech is present. The latter is what SRE generates, so if SRE's speech generation is enabled, it should provide the speech string in data-semantic-speech, which will in turn be placed in the title element.
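
To make that fallback concrete, here is a minimal sketch of the lookup. It is not MathJax's handler: `pickSpeech` is a hypothetical helper, and a plain `Map` stands in for the MathML node's attribute list.

```javascript
// Hypothetical helper: a plain Map stands in for the MathML node's attribute list.
function pickSpeech(attributes) {
  // An explicit aria-label wins; otherwise fall back to the string SRE generates.
  return attributes.get('aria-label') || attributes.get('data-semantic-speech') || '';
}

// No speech attributes at all (enrichment not enabled): nothing to put in <title>.
const plain = pickSpeech(new Map());

// SRE enabled: the generated string is found even though there is no aria-label.
const enriched = pickSpeech(new Map([['data-semantic-speech', 'x squared']]));
```

The first case is the situation in the original report: with no enrichment, neither attribute exists, so no speech is added.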

It is quite possible that this is broken, however, as SRE has evolved since this was first written, and this code might not have kept up. I do know that SRE didn't always place a data-semantic-speech attribute on the math item, and that might be part of the problem here. All of this is being reworked for v4, and the speech is being added later in the pipeline, so the internalSpeechTitles option may change or be removed in the future.

Are you doing this in the browser, or as a node application? Once I know that, I can try to work out a solution for you in v3.

At present, everything is aria-hidden, which isn't ideal. For our use case, we need to extract the svg from its HTML wrappers to nest it in another svg, so the hidden MathML is useless to us.

You can disable the hidden MathML so it is not generated (it will be off by default in v4). The aria-label and aria-braillelabel attributes are on the mjx-container element, which is why its children are aria-hidden. You could transfer the aria-label from the container to the svg element and then remove the aria-hidden attribute from it.
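
A rough browser-side sketch of that transfer, using only standard DOM calls. The function name `promoteLabelToSvg` is hypothetical, and `container` is assumed to be the mjx-container element that MathJax produced.

```javascript
// Sketch only: `container` is assumed to be the <mjx-container> element that
// holds the aria-label and has the rendered <svg> as a descendant.
function promoteLabelToSvg(container) {
  const svg = container.querySelector('svg');
  if (!svg) return null;
  // Copy each accessibility label that is present on the container.
  for (const name of ['aria-label', 'aria-braillelabel']) {
    const value = container.getAttribute(name);
    if (value !== null) svg.setAttribute(name, value);
  }
  // The svg now carries its own label, so it should no longer be hidden.
  svg.removeAttribute('aria-hidden');
  return svg;
}
```

The returned svg element can then be detached from its wrapper and nested elsewhere with its label intact.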

nedredmond commented 7 months ago

Thank you, @dpvc! My apologies-- I didn't see that part of the option description.

This is in the browser, using a custom implementation with MathJax internals.

The SVG part looks a bit like this (stripped down, approximated):

    getTexToSvgConverter = () => {
        if (!windowGlobalExists()) {
            throw new Error("Can't render TeX without a DOM environment");
        }

        return mathjax.document("", {
            InputJax: new MathJaxTexInput({...texInputConfig}),
            OutputJax: new SVG({fontURL: this.fontURL}),
        });
    };

    svgDom = this.getTexToSvgConverter().convert(
        tex, {internalSpeechTitles: true},
    );

    return svgDom.firstElementChild;

What are the appropriate SRE settings to activate this behavior?

nedredmond commented 7 months ago

I tried adding the rule here, to no avail:

    return mathjax.document("", {
        InputJax: new MathJaxTexInput({...texInputConfig}),
        OutputJax: new SVG({fontURL: this.fontURL}),
        sre: {
            speech: "shallow",
        }
    });

dpvc commented 7 months ago

A couple of questions:

Is MathJaxTexInput a subclass of the TeX input jax?
What Mathjax imports are you doing?
Have you added the EnrichHandler when you register the HTML handler?

dpvc commented 7 months ago

You need more than just adding the sre.speech option, as you will need to import the semantic-enrichment and speech generation modules (since you are loading modules directly rather than using the components framework).

Here is an example of a node script that I think does what you want:

//
//  Load the packages needed for MathJax
//
import {mathjax} from 'mathjax-full/js/mathjax.js';
import {MathML} from 'mathjax-full/js/input/mathml.js';
import {TeX} from 'mathjax-full/js/input/tex.js';
import {SVG} from 'mathjax-full/js/output/svg.js';
import {RegisterHTMLHandler} from 'mathjax-full/js/handlers/html.js';
import {liteAdaptor} from 'mathjax-full/js/adaptors/liteAdaptor.js';
import {STATE} from 'mathjax-full/js/core/MathItem.js';
import {EnrichHandler} from 'mathjax-full/js/a11y/semantic-enrich.js';

//
// The SRE language files are loaded dynamically at run time.
// If you want to webpack this so that the SRE mapping files are contained
//   in the component, then you can try uncommenting the following imports
//    and the MathMaps calls that follow.  You can comment out the global.SREfeature
//    assignment in the middle.
//
//import base from 'speech-rule-engine/lib/mathmaps/base.json' assert { type: 'json' };
//import en from 'speech-rule-engine/lib/mathmaps/en.json' assert {type: 'json' };
//import MathMaps from 'mathjax-full/js/a11y/mathmaps.js';

global.SREfeature = {json: './node_modules/speech-rule-engine/lib/mathmaps/'};
//MathMaps.default.set('base', base);
//MathMaps.default.set('en', en);

//
// Minimal CSS to make SVG self-contained
//
const CSS = [
  'svg a{fill:blue;stroke:blue}',
  '[data-mml-node="merror"]>g{fill:red;stroke:red}',
  '[data-mml-node="merror"]>rect[data-background]{fill:yellow;stroke:none}',
  '[data-frame],[data-line]{stroke-width:70px;fill:none}',
  '.mjx-dashed{stroke-dasharray:140}',
  '.mjx-dotted{stroke-linecap:round;stroke-dasharray:0,140}',
  'use[data-c]{stroke-width:3px}'
].join('');

//
// Register the adaptor and handlers
//
const adaptor = liteAdaptor();
EnrichHandler(RegisterHTMLHandler(adaptor), new MathML());

//
// The TeX and SVG options
//
const texOptions = { /* your options here */ };
const svgOptions = {fontCache: 'local'};

//
// Create the MathDocument to use
//   (You can reuse this for multiple expressions, though the labels and definitions
//    will carry over from one to the next)
//
const html = mathjax.document('', {
  InputJax: new TeX(texOptions),
  OutputJax: new SVG(svgOptions),
  enableEnrichment: true,
  sre: {
    speech: 'shallow'
  },
  renderActions: {
    //
    //  This action removes the data-semantic attributes from all the MathML nodes
    //    and adds the data-semantic-speech attribute to the math node.
    //
    speechAdjust: [
      STATE.ENRICHED + 1,
      () => {},
      (math, doc) => {
        const speech = math.getSpeech(math.root);
        math.root.walkTree((node) => {
          const attributes = node.attributes?.getAllAttributes() || {};
          for (const name of Object.keys(attributes)) {
            if (name.substring(0, 13) === 'data-semantic') {
              delete attributes[name];
            }
          }
        });
        math.root.attributes.set('data-semantic-speech', speech);
      }
    ],

    attachSpeech: []  // remove original attachSpeech action
  }
});
//
// Uncomment the following if you want internal <title> nodes rather than the aria-label
// for the speech string
//
//html.options.internalSpeechTitles = true;

//
// Add a post-filter to the SVG output jax that either moves the aria-labeledby attribute
// from the math node to the svg node, or moves the speech string to the SVG aria-label.
// Then we remove the aria-hidden attribute.  Finally, we add the minimal CSS
// needed to make the SVG be self-contained.  (You can put that CSS into a <style>
// tag or stylesheet instead, and only use it once, if you prefer.)
//
html.outputJax.postFilters.add(({math, data: root}) => {
  const svg = adaptor.tags(root, 'svg')[0];
  if (html.options.internalSpeechTitles) {
    const node = adaptor.parent(adaptor.tags(root, 'title')[0]);
    adaptor.setAttribute(svg, 'aria-labeledby', adaptor.getAttribute(node, 'aria-labeledby'));
    adaptor.removeAttribute(node, 'aria-labeledby');
    adaptor.removeAttribute(node, 'data-semantic-speech');
  } else {
    adaptor.setAttribute(svg, 'aria-label', math.root.attributes.get('data-semantic-speech'));
  }
  adaptor.removeAttribute(svg, 'aria-hidden');
  const defs = adaptor.tags(svg, 'defs')[0];
  adaptor.append(defs, adaptor.node('style', {}, [adaptor.text(CSS)]));
});

//
// Convert the expression, with proper handling of retries (needed for SRE to initialize itself).
// Currently there is no way to avoid the initial retry for SRE, so there needs to be one
//   promise-based call to html.convert(), but after that, you should be able to work synchronously.
//   Because you are using direct imports of the MathJax modules rather than the component based
//   calls, you won't be able to use \require or auto-loaded packages.  So be sure to import any TeX
//   packages you will need and add them to the tex packages configuration option.
//
const container = await mathjax.handleRetriesFor(() => {
  return html.convert(process.argv[2] || '', {
    display: true,
    em: 16,
    ex: 8
  });
});

//
// Print the serialized SVG element
//
console.log(adaptor.outerHTML(adaptor.tags(container, 'svg')[0]));

For browser use, you will want to change the liteAdaptor to the browserAdaptor, but the rest should be able to be incorporated into your workflow. Do read the comments, as they give information about some things you need to take into account. The SVG post-filter could be made into another renderAction that runs after the typeset action (use STATE.TYPESET + 10 or something like that).
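
A hedged sketch of what such a renderAction might look like. It covers only the aria-label branch of the post-filter (not the internal title handling or the CSS), the name `svgLabelAction` is hypothetical, and it assumes that `math.typesetRoot` holds the typeset container by the time the action runs.

```javascript
// Sketch only: builds a renderActions entry that does the labelling part of the
// post-filter. `adaptor` is the DOM adaptor in use, and `priority` would be
// something like STATE.TYPESET + 10.
function svgLabelAction(adaptor, priority) {
  return [
    priority,
    // Document-level step left empty, as in the speechAdjust action in the script.
    () => {},
    // Per-item step, which is what convert() runs.
    (math, doc) => {
      // Assumption: math.typesetRoot is the typeset container holding the <svg>.
      const svg = adaptor.tags(math.typesetRoot, 'svg')[0];
      if (!svg) return;
      const speech = math.root.attributes.get('data-semantic-speech');
      if (speech) adaptor.setAttribute(svg, 'aria-label', speech);
      adaptor.removeAttribute(svg, 'aria-hidden');
    }
  ];
}
```

It would be added next to speechAdjust in the renderActions block, for example as `svgLabel: svgLabelAction(adaptor, STATE.TYPESET + 10)`.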

dpvc commented 7 months ago

PS, you might need to set aria-hidden on the top-level children of the resulting svg element. I didn't do that here.
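
Something along these lines might do it. This is a sketch: `hideSvgChildren` is a hypothetical helper, `adaptor` is the same DOM adaptor used in the script, and `svg` is the top-level svg node.

```javascript
// Sketch only: mark each element child of the top-level <svg> as aria-hidden.
function hideSvgChildren(adaptor, svg) {
  for (const child of adaptor.childNodes(svg)) {
    // Skip text and comment nodes, whose kind starts with '#'.
    if (adaptor.kind(child).charAt(0) === '#') continue;
    adaptor.setAttribute(child, 'aria-hidden', 'true');
  }
}
```

It could be called at the end of the post-filter, after the aria-hidden attribute has been removed from the svg itself. Whether hiding the children is right for your markup is worth checking with a screen reader.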

nedredmond commented 7 months ago

@dpvc Sorry, MathJaxTexInput is just an alias for TeX, the input jax.
