NSoiffer / MathCAT

MathCAT: Math Capable Assistive Technology for generating speech, braille, and navigation.
MIT License

Read binomial coefficients notated as 2*1 matrices in the form "n choose k" #76

Open bhavyashah opened 1 year ago

bhavyashah commented 1 year ago

As seen in the attached sample, some people notate binomial coefficients (nCk, or "n choose k") as a 2×1 column matrix, or like a fraction with a numerator and denominator. While I usually want a speech synthesizer to read things as is, that philosophy doesn't carry over to math as much for me, and certainly not in this case. In particular, it was far from ideal to hear about tons of nonexistent matrices throughout my probability class. Especially in denser math expressions, verbosity can make comprehension more challenging.

All of that to say: I'd like to request that binomial coefficients, no matter the notation used, be read consistently as "n choose k" or "n c k." Alternatively, consider adding a speech style where different notations for the same mathematical expression are read identically (I can't think of specific examples other than binomial coefficients at the moment, but I'll add a comment when some come to mind).

Implementation-wise, I admit that I am unsure how MathCAT would tell a binomial coefficient apart from a genuine 2×1 matrix.

Side note: GitHub does not seem to allow .html attachments, so I renamed the actual .html file to .txt to get it across; please rename it back to view it in a web browser (Edge in my case). Most of my issue reproduction samples will be HTML files; please let me know if there is a better method of sharing them.

Attachment: Choose Read as Column Matrix.txt

NSoiffer commented 1 year ago

With intent and MathML, this would be easy. Of course that doesn't help now and won't help in the future with legacy MathML.

As you noted, there doesn't seem to be any good way to distinguish which is meant. In some cases, maybe you can tell that the value should be a scalar and not a matrix, but in this case, you can't.

If you are using ClearSpeak, there is an option to control this. Under the ClearSpeak preference list, there is Matrix. If you set this to Combinatorics, you will get the behavior you want. This is not a preference that is currently in the MathCAT settings dialog. Instead, look for prefs.yaml in

  1. %AppData%\Roaming\MathCAT
  2. %AppData%\Roaming\nvda\addons\MathCAT\globalPlugins\MathCAT\Rules

If the first file doesn't exist, make the change to the second file.
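The lookup order described above (use the first location if it has a prefs.yaml, otherwise fall back to the second) can be sketched in a few lines of Python. This is only an illustration of the rule, not MathCAT's own code: the function name `pick_prefs_file` is hypothetical, and the caller is assumed to pass in the two directories listed above in that order.

```python
from pathlib import Path
from typing import Optional, Sequence


def pick_prefs_file(candidate_dirs: Sequence[Path]) -> Optional[Path]:
    """Return the prefs.yaml in the first candidate directory that has one.

    The directories are checked in the order given, so the caller should
    list the overriding (user-level) location first. Returns None when no
    candidate directory contains a prefs.yaml file.
    """
    for directory in candidate_dirs:
        prefs = directory / "prefs.yaml"
        if prefs.is_file():
            return prefs
    return None
```

On Windows, the two directory strings from the list above could be passed through `os.path.expandvars` and wrapped in `Path` before calling the function; the returned path is the file to edit.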

I think MathCAT will detect the change to the prefs.yaml file. If it doesn't change the speech immediately, restart MathCAT or NVDA.

Please let me know if this solves your problem. Also whether you needed to restart MathCAT.

bhavyashah commented 1 year ago

In the ClearSpeak block in the prefs.yaml file, the only matrix-related line I found read "Matrix: Auto", which I changed to "Combinatorics: Auto." I then restarted NVDA and the expression was read exactly as before. What step am I missing?

NSoiffer commented 1 year ago

Change the value of Matrix to Combinatorics. That is:

Matrix: Combinatorics

I just tried it and it worked for me. No reload was needed. It will only affect 2x1 matrices.
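For anyone who would rather script the edit than make it by hand, here is a minimal Python sketch that changes the value of the Matrix key and leaves the key name alone. It works on the file's text line by line and does not parse YAML. The helper name `set_matrix_pref` is hypothetical, and the sketch assumes the first line starting with `Matrix:` is the ClearSpeak one, which may not hold for every prefs.yaml.

```python
import re


def set_matrix_pref(prefs_text: str, value: str = "Combinatorics") -> str:
    """Replace the value of the first 'Matrix:' key in prefs.yaml text.

    Indentation and anything after the old value (such as a trailing
    comment) are preserved. Raises ValueError if there is no 'Matrix:'
    key, which is what happens if the key itself was renamed by mistake.
    """
    # Group 1 is the indentation plus the key; the old value is the next
    # run of non-whitespace characters on the same line.
    pattern = re.compile(r"^([ \t]*Matrix:[ \t]*)(\S+)", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + value, prefs_text, count=1)
    if count == 0:
        raise ValueError("no 'Matrix:' key found in the preferences text")
    return new_text
```

Reading the file, passing its contents through this function, and writing the result back gives the same `Matrix: Combinatorics` line as the manual edit.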

Make sure you make the change in %AppData%\Roaming\MathCAT\prefs.yaml if it exists since it overrides the other file's values.

bhavyashah commented 1 year ago

I interpreted your instruction incorrectly earlier. I made the change and it took effect without the need for a restart. It reads it perfectly now. I am pretty happy with this workaround. Maybe MathCAT should include in its documentation a brief explanation of the prefs.yaml file and links to, for instance, the ClearSpeak documentation for further information on its settings?

bhavyashah commented 1 year ago

I am coming across other examples of common notation being misread by MathCAT due to a lack of context-awareness. Consider the following: "Let $G=(V, E)$ be a general directed graph." The pair of vertex and edge sets is misunderstood as an open interval and is verbosely and misleadingly read as "clickable g is equal to the interval from v to e not including v or e".

I now have to switch back from Combinatorics to Auto because my current class depicts some points in R^2 as 2×1 matrices, and it would be nice not to need to go back and forth manually.

I think models like GPT and PaLM can probably judge context reliably and give us the information we need to choose which reading to go with. Here's how I'm conceiving of this: (i) we build a list of all the kinds of expressions where context is necessary, (ii) we list out for each kind of expression the set of possible contexts (e.g. a 2×1 matrix is either a column matrix or a binomial coefficient), (iii) when we encounter such an expression, we give the LLM that expression instance, some surrounding text (how much is TBD), and the context set, (iv) we use the model's selection from the context set to determine how to read the expression.

This may seem like overkill for a minor issue, but context-awareness seems like the only way to read math correctly in a decent number of scenarios, and LLMs seem like a good bet for acquiring that context-awareness. I don't know how computationally expensive this will be, whether it can be done in real time with reasonable efficiency, or whether document pre-processing might make more sense.
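To make steps (i) through (iv) concrete, here is a rough Python sketch of the selection logic. Everything in it is hypothetical: the `CONTEXT_SETS` table, the function names, the prompt wording, and the `max_chars` cutoff are illustrative assumptions, and the model is just a callable that takes a prompt string and returns a string, so no particular LLM API is implied. None of this exists in MathCAT.

```python
from typing import Callable, Dict, Tuple

# Steps (i) and (ii): kinds of ambiguous expressions and their possible
# readings. Hypothetical table; the first reading is used as the fallback.
CONTEXT_SETS: Dict[str, Tuple[str, ...]] = {
    "2x1-array": ("column matrix", "binomial coefficient"),
    "parenthesized-pair": ("open interval", "ordered pair"),
}


def build_prompt(kind: str, expression: str, surrounding_text: str,
                 max_chars: int = 200) -> str:
    """Step (iii): combine the expression, some context, and the options."""
    options = ", ".join(CONTEXT_SETS[kind])
    context = surrounding_text[:max_chars]  # how much text to send is TBD
    return (
        f"Expression: {expression}\n"
        f"Surrounding text: {context}\n"
        f"Answer with exactly one of: {options}"
    )


def choose_reading(kind: str, expression: str, surrounding_text: str,
                   model: Callable[[str], str]) -> str:
    """Step (iv): map the model's answer back onto the context set."""
    options = CONTEXT_SETS[kind]
    answer = model(build_prompt(kind, expression, surrounding_text)).strip().lower()
    for option in options:
        if option in answer:
            return option
    # The model said something unexpected: fall back to the first reading.
    return options[0]
```

Keeping the model behind a plain callable means the same selection code could run in real time or in a document pre-processing pass, and it can be exercised with a stub function before any real model is wired in.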