joelpurra / talkie

Text-to-speech browser extension button. Select text on any web page, and have the computer read it out loud for you by simply clicking the Talkie button.
https://joelpurra.com/projects/talkie/
GNU General Public License v3.0
70 stars, 17 forks

[FR] read an entire web page or article #14

Open fazlerabbi37 opened 4 years ago

fazlerabbi37 commented 4 years ago

I was searching for an open source text-to-speech reader that would allow me to read an entire web page or article without user interaction. I came across Talkie, which is an awesome tool, so thanks from a grateful user. The only problem I am facing is that to have an entire article read, I need to select a part of that article, right-click, and select Talkie, again and again. It would be great if Talkie could automatically select and highlight the text of the article and play the sound.

The closest match to my description is a closed-source product named Natural Reader Text to Speech.

Thanks in advance!

joelpurra commented 4 years ago

@fazlerabbi37: happy that you like Talkie! Automatically selecting the relevant text to read has been in the idea box for a long time.

The problem: I imagine auto-reading a page, and the extracted text starts with "Facebook Twitter Instagram Medium Email Share Please share this article if you like it! Author John Doe john.doe@example.test Contact John Follow John Facebook Twitter ..." and only then the real page content. On top of that, the content is interrupted by text remnants such as "Ads provided by AmazingAdNetwork Responsible advertising See Privacy policy" and more social sharing buttons. (The punctuation is missing because on screen these are separate graphical elements; in the extracted text, though, elements that are only separated visually with CSS are hard to tell apart from the rest of the markup.) When this happens, I think users will complain a lot, and that is why Talkie doesn't have auto-reading (yet). Does the explanation make sense to you?
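To make the problem concrete, here is a minimal sketch of naive text extraction. It is not Talkie's code; Python's standard `html.parser` is used only to keep the example short, and `NaiveTextExtractor`, `naive_extract` and `SAMPLE_PAGE` are made-up names for illustration. A browser extension would do the equivalent against the live DOM.

```python
from html.parser import HTMLParser


class NaiveTextExtractor(HTMLParser):
    """Collects every text node, ignoring which element it came from."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def naive_extract(html):
    """Return all text on the page, joined with single spaces."""
    parser = NaiveTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page: share links and an ad surround the real content.
SAMPLE_PAGE = """
<body>
  <nav><a href="#">Facebook</a><a href="#">Twitter</a><a href="#">Email</a></nav>
  <article><p>The real article text starts here.</p></article>
  <aside>Ads provided by AmazingAdNetwork</aside>
</body>
"""

print(naive_extract(SAMPLE_PAGE))
# Facebook Twitter Email The real article text starts here. Ads provided by AmazingAdNetwork
```

The share links and the ad end up in the same unpunctuated stream as the article, which is exactly what a speech engine would then read aloud.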

A generic, automatic "text extraction" which works great in 5% of the cases might be easy to implement -- if those 5% are relatively simple but well-implemented websites. Unfortunately, I think it's not easy at all to implement something which works all the time. Even getting above 10% might be hard, and everything beyond that probably requires a fairly advanced (although perhaps repetitive, with minor differences) set of rules -- pretty much one for each website, as so many are customized or have little quirks. (Some blog systems/services, such as WordPress, might have themes which are effectively the same, though.)
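As a rough illustration of what such a "simple but well-implemented websites" heuristic might look like, the sketch below skips elements that rarely hold article text and prefers text inside `<article>` or `<main>` when the page has them. This is an assumption-laden sketch, not Talkie's implementation: the tag lists, `HeuristicExtractor` and `extract_readable_text` are invented for this example, and it only works on pages that use semantic HTML in the first place.

```python
from html.parser import HTMLParser

# Assumption: these elements rarely contain the article body.
SKIPPED_TAGS = {"nav", "aside", "footer", "header", "script", "style", "form"}
# Assumption: well-built pages wrap the main content in one of these.
PREFERRED_TAGS = {"article", "main"}


class HeuristicExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0       # > 0 while inside a skipped element
        self.preferred_depth = 0  # > 0 while inside <article> or <main>
        self.all_text = []        # text found outside skipped elements
        self.preferred_text = []  # subset found inside <article>/<main>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif tag in PREFERRED_TAGS:
            self.preferred_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in PREFERRED_TAGS and self.preferred_depth > 0:
            self.preferred_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text or self.skip_depth > 0:
            return
        self.all_text.append(text)
        if self.preferred_depth > 0:
            self.preferred_text.append(text)


def extract_readable_text(html):
    """Prefer <article>/<main> text; otherwise fall back to all unskipped text."""
    parser = HeuristicExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.preferred_text or parser.all_text)


# Hypothetical page using semantic elements.
SEMANTIC_PAGE = (
    '<body><nav><a href="#">Facebook</a> <a href="#">Twitter</a></nav>'
    "<article><h1>Title</h1><p>Real content.</p>"
    "<aside>Ads provided by AmazingAdNetwork</aside></article>"
    "<footer>Privacy policy</footer></body>"
)

print(extract_readable_text(SEMANTIC_PAGE))  # Title Real content.
```

A page that builds its share buttons and ads out of plain `<div>` elements defeats this completely, which is where the per-website rules would have to come in.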

I'd rather have a discussion about possible solutions than a quick "5% hack". Any ideas?

fazlerabbi37 commented 4 years ago

Thank you @joelpurra for the detailed explanation! It does make a lot of sense. I also agree that it shouldn't be a quick hack; however, I have little to no experience with text extraction. Perhaps we could take a look at the existing solutions? The one I mentioned, Natural Reader Text to Speech, does it fairly well. I am sure they are using some kind of proprietary code to do that, but maybe it could give you a direction and help us understand the situation better.


Off-topic: How do I get the premium version? The Chrome store says pricing is not available. Can you point me to somewhere I can learn more or discuss it? I don't want to clutter the original topic with it.

joelpurra commented 4 years ago

Best not to look at proprietary code...


If pricing is unavailable, you can email me or contact Google. I doubt I can do anything though; I have already enabled sales to all currencies/countries. You can also build and load Talkie Premium in developer mode, or email me for the "official" build.

fazlerabbi37 commented 4 years ago

> Best not to look at proprietary code...

Well, I was asking you to take a look at how it works; my remark about the proprietary code was about us not being able to use it for this project. :stuck_out_tongue: