NSoiffer / MathCAT

MathCAT: Math Capable Assistive Technology for generating speech, braille, and navigation.
MIT License

Excess pause bug #208

Open bhavyashah opened 10 months ago

bhavyashah commented 10 months ago

Consider $f(x ; \theta)=\theta^{-2} x \exp (-x / \theta), \quad x>0$. There is an unusually long pause before the comma. If this pause is extraneous and was an error, feel free to fix it. If, however, the pause length is intentional, note that I'm using ESpeak-NG at 35% with rate boost enabled, relative speech rate at 60, and pause factor at 50, and the first couple of times I moved on to the next line before hearing the ",x>0" ending.

NSoiffer commented 10 months ago

Transferring this to the MathCAT repo.

FYI: MathCAT generates the speech and braille and handles the internals of navigation. MathCATForPython deals with specifics relating to NVDA such as voices, language switching, the MathCAT dialog, and interactions such as "copy".

NSoiffer commented 10 months ago

MathCAT generates a long pause after the complicated first part. That is intentional. However, the pause should probably come after the "comma", not before it. That should be true for any punctuation, not just commas. Do you feel the pause is wrongly placed, or that the pause is unwarranted, or that the pause is too long?

I didn't exactly match your settings, but here is what is sent to the speech engine (I added some line breaks for readability -- the second-to-last line is the long break):

<prosody rate="56%">
<say-as interpret-as="characters"><voice xml:lang="en-us">f</say-as>  of    <break time="118ms" /> 
 open paren    <say-as interpret-as="characters">x</say-as>  semicolon  theta    <break time="118ms" />  close paren    <break time="236ms" /> 
 is equal to    <break time="236ms" />  theta to the negative 2 power    <break time="118ms" />      <say-as interpret-as="characters">x</say-as>      <break time="118ms" />  
exponential  of    <break time="118ms" />
  open paren    <break time="118ms" />  negative    <say-as interpret-as="characters">x</say-as>  divided by  theta    <break time="118ms" />  close paren  
<break time="827ms" />  comma  
<say-as interpret-as="characters">x</say-as>  is greater than  0    <break time="1ms" /></prosody><mark name="1110" /></voice>

The pausing is scaled to the speech rate so if the relative rate had not been set to 56%, the pause would have been shorter. However, in looking at the code, it appears that "rate boost" is not part of the result when MathCAT asks for the current speech rate. So for the above, all of the breaks should be reduced by a factor of 3.
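To make the arithmetic concrete, here is a minimal sketch of what reducing every break by a factor of 3 would do to the SSML above. It is plain text post-processing for illustration only, not how MathCAT or MathCATForPython computes pauses; the function name `scale_breaks` and the 1 ms floor are assumptions.

```python
import re

# Matches break tags of the form seen in the dump above, e.g. <break time="118ms" />
BREAK_RE = re.compile(r'<break time="(\d+)ms"\s*/>')


def scale_breaks(ssml: str, factor: float) -> str:
    """Divide every millisecond break in `ssml` by `factor` (hypothetical helper).

    Durations are rounded to whole milliseconds and never drop below 1 ms.
    """
    def _scale(match: re.Match) -> str:
        ms = max(1, round(int(match.group(1)) / factor))
        return f'<break time="{ms}ms" />'

    return BREAK_RE.sub(_scale, ssml)


sample = 'close paren <break time="827ms" /> comma <break time="118ms" />'
print(scale_breaks(sample, 3.0))
# close paren <break time="276ms" /> comma <break time="39ms" />
```

With a factor of 3, the 827 ms break before "comma" drops to roughly 276 ms and the common 118 ms breaks to roughly 39 ms.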

NSoiffer commented 10 months ago

I added a check for rate boost (https://github.com/NSoiffer/MathCATForPython/commit/1acae903ea23d925133c1d176648352a417aac5f), so the pause is much shorter than before. I'm about to do a release, so the fix will be in that release.

I need to see if switching the order of punctuation and pause is simple or complicated. If complicated, it will have to wait.
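For illustration, the intended result of the reordering can be sketched as a text transformation on the generated SSML. This is only a sketch of the desired output: the real change would presumably belong where MathCAT generates the speech, and the function name `pause_after_punctuation` and the word list are assumptions, not MathCAT names.

```python
import re

# Spoken punctuation names to handle; an illustrative list, not MathCAT's own.
PUNCTUATION_WORDS = ("comma", "semicolon", "colon", "period")

# A break tag, the whitespace after it, then a spoken punctuation word.
PAUSE_THEN_PUNCT = re.compile(
    r'(<break time="\d+ms"\s*/>)(\s+)(' + "|".join(PUNCTUATION_WORDS) + r")\b"
)


def pause_after_punctuation(ssml: str) -> str:
    """Move a break that directly precedes a punctuation word to just after it."""
    return PAUSE_THEN_PUNCT.sub(r"\3\2\1", ssml)


before = 'close paren <break time="827ms" />  comma  x is greater than 0'
print(pause_after_punctuation(before))
# close paren comma  <break time="827ms" />  x is greater than 0
```

Breaks that precede other words, such as the ones before "close paren", are left where they are.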

bhavyashah commented 10 months ago

Sorry for reporting in the wrong repo - will try to be more accurate in the future. Adding a rate boost check to make the pause appropriately short sounds great to me. Changing the order of pause and punctuation sounds nice, but I don't think it is vital for me.