davidacm / NVDA-IBMTTS-Driver

This project is aimed at developing and maintaining the NVDA IBMTTS driver. IBMTTS is a synthesizer similar to Eloquence. Please send your ideas and contributions here!
GNU General Public License v2.0

Chunk large portions of incoming text to avoid hangs? #110

Open ultrasound1372 opened 6 months ago

ultrasound1372 commented 6 months ago

I'm not entirely sure where all the latency is coming from, and some of it might very well be NVDA. But if you send large blobs of text, thousands of characters long, a delay is introduced. Under 6k characters or so it simply manifests as a short delay before speech, which, while undesirable, isn't that big of a deal. But going beyond that gets into downright hang territory rather quickly. I've seen this hang truly lock up NVDA as a whole, not just the speech thread, and in some circumstances it has actually hung the text editor that the large single-line blob came from. How much of it is due to the rather expensive transforms we're performing on the data and how much is the synthesizer itself, I don't know. Would it be possible to institute some kind of chunking algorithm, similar to what Say All does, for strings beyond a few thousand characters? And then perhaps only apply the complex transforms to the pieces, which cuts down on the search space?
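To make the chunking idea concrete, here is a minimal sketch. It is not the driver's code and not how Say All is implemented; `chunk_text`, `MAX_CHUNK`, and the 2000-character threshold are hypothetical names and values chosen only for illustration. It splits at the last sentence boundary that fits, falls back to the last whitespace, and hard-splits only when a chunk contains neither.

```python
import re
from typing import Iterator

# Hypothetical threshold; the thread only says "a few thousand characters".
MAX_CHUNK = 2000

# Sentence-like boundary: closing punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"[.!?;:]\s+")


def chunk_text(text: str, max_len: int = MAX_CHUNK) -> Iterator[str]:
    """Yield consecutive pieces of text, each at most max_len characters.

    Joining the pieces reproduces the input exactly, so no text is lost.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    start = 0
    total = len(text)
    while total - start > max_len:
        window = text[start:start + max_len]
        cut = -1
        # Prefer the last sentence boundary inside the window.
        for match in _SENTENCE_END.finditer(window):
            cut = match.end()
        if cut <= 0:
            # Otherwise cut just after the last whitespace character.
            ws = max(window.rfind(" "), window.rfind("\t"), window.rfind("\n"))
            # No usable whitespace: hard split at max_len.
            cut = ws + 1 if ws > 0 else max_len
        yield text[start:start + cut]
        start += cut
    if start < total:
        yield text[start:]
```

Under this sketch, the expensive transforms would run on each yielded piece instead of on the whole string, and the pieces would be queued to the synthesizer in order. Whether that actually removes the hang is an assumption that would need measuring.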

Neurrone commented 5 months ago

I just ran into this exact issue in https://github.com/nvaccess/nvda/issues/16307

I didn't realize that Eloquence was capable of causing a severe reaction like hanging NVDA.

akash07k commented 5 months ago

Yep, sadly I am also impacted by this issue and have been seeing it with IBM TTS for quite some time now. I hope @davidacm can look into this in his free time. Thanks in advance.

amirsol81 commented 1 month ago

@ultrasound1372 @davidacm Any chance of doing something about this? I encounter this issue more and more. Interestingly, eSpeak-NG isn't affected by it.

titet11 commented 1 month ago

@Neurone @akash07k @amirsol81 @davidacm This may be a processor issue and it is quite possible that even without having NVDA running, the same issue is occurring.

On the other hand, NVDA may try to capture the text on the page each time the arrows are pressed. In this matter, an exception could be implemented so that pressing the arrows does not retrieve the text of the page.

Considering the idea of splitting text, this could cause delays if you scroll through different paragraphs in a large text and would therefore significantly impact performance.

I think the most appropriate thing in these cases would be to look into the NVDA source code to understand how elements and texts are captured.

This happens not only in web browsers, but also in interfaces such as File Explorer. For example, when you have more than 1000 files in a folder and scroll through the items, you will notice that scrolling becomes quite slow.

My workaround to avoid delays is to go to the folder properties and set all the files to read-only.

Perhaps if NVDA ran large texts in read-only mode without the write attribute, it would noticeably improve performance. It could be a simple and quite effective solution.

So my conclusion is that, when capturing text in any interface, NVDA could apply only read-only mode and leave out the execute and write attributes.

Perhaps this solution can be applied in this plugin, although it would be most appropriate for NVDA itself to implement it.

Neurone commented 1 month ago

Tagging my “cousin” @Neurrone :D

ultrasound1372 commented 1 month ago

@titet11 your comment makes barely any sense, and appears to be conflating multiple issues. Files in File Explorer should have absolutely nothing to do with it, unless the file names are thousands of characters long. The rest of the delay is caused by Explorer and NVDA rather than the synthesizer. As for web browsers, once the page is actually loaded the virtual buffer does some kind of line splitting, such that arrowing can never trigger this, to my knowledge. At least with screen layout on; I'm unsure about off. This is specifically related to sending bare large strings directly at speech, e.g. lines in a text editor that are super long with no line wrapping in effect in the editor, or arbitrary-length speech messages from a game communicating directly with NVDA.
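For anyone trying to reproduce the text-editor case, a small script along these lines can generate a file containing one very long unwrapped line. This is only a sketch: `make_long_line` and `write_test_file` are hypothetical helper names, and the sizes are illustrative values around the roughly 6k-character mark mentioned earlier in the thread.

```python
import tempfile
from pathlib import Path
from typing import Optional


def make_long_line(n_chars: int, word: str = "lorem") -> str:
    """Return one line of exactly n_chars characters with no newlines."""
    if n_chars < 0:
        raise ValueError("n_chars must be non-negative")
    unit = word + " "
    repeats = n_chars // len(unit) + 1
    return (unit * repeats)[:n_chars]


def write_test_file(n_chars: int, directory: Optional[str] = None) -> Path:
    """Write the long line to a text file and return its path."""
    folder = Path(directory or tempfile.gettempdir())
    path = folder / f"long_line_{n_chars}.txt"
    path.write_text(make_long_line(n_chars), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Illustrative sizes below, at, and above the ~6k mark.
    for size in (3000, 6000, 12000):
        print(write_test_file(size))
```

Opening one of these files in an editor with word wrap disabled and reading the line with IBMTTS active should exercise the path described above, assuming the hang depends mainly on string length rather than on the content.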

titet11 commented 4 weeks ago

@ultrasound1372 okay, I'm sorry, I'm not a programmer.

davidacm commented 3 weeks ago

I will look into this problem, but it is hard to replicate. I tried opening the link mentioned in the related issue, and although NVDA froze the first time, it didn't the second time I opened the link, indicating that it was not a synthesizer problem. For those who had good results opening that link with eSpeak, perhaps it was because they had previously opened it with IBMTTS or another synth.

Can you send me ways to replicate this issue?