silverstripe / silverstripe-framework

Silverstripe Framework, the MVC framework that powers Silverstripe CMS
https://www.silverstripe.org
BSD 3-Clause "New" or "Revised" License
721 stars 822 forks source link

Remove spellchecker from default setup: Uses discontinued/unofficial GoogleSpell API #2213

Closed chillu closed 11 years ago

chillu commented 11 years ago

The spellchecker plugin uses an internal, undocumented Google service. It has been discontinued many years ago, and was never supposed to be used as an API. At the moment, I can't get any response from it (404), so did other people back in April. TinyMCE just reports "no misspellings found".

http://www.tinymce.com/forum/viewtopic.php?id=30779 https://tracker.moodle.org/browse/MDL-38867 http://www.tinymce.com/wiki.php/Plugin3x:spellchecker

curl -sd '<?xml version="1.0" encoding="utf-8" ?><spellrequest textalreadyclipped="0" ignoredups="0" ignoredigits="1" ignoreallcaps="1"><text>spel</text></spellrequest>' https://www.google.com/tbproxy/spell

I would suggest that we remove this as a default option from our TinyMCE configuration. This makes for a worse user experience, but not as bad as relying on a service which might go down any minute, right? In terms of CWP, we could use PSpell on our servers?

dhensby commented 11 years ago

Doesn't TinyMCE let you choose to use the browser spellcheck now? So this should be fine of UX as long as that's enabled...

chillu commented 11 years ago

Nice, didn't know that! http://www.tinymce.com/wiki.php/Configuration:browser_spellcheck Do you have any info on which browsers are supported? I guess there's edge cases where you're editing a language which isn't a supported one on your operating system or the browser default, but it might be a good start.

dhensby commented 11 years ago

It's just whatever browsers have spellchecks. They use HTML attribute spellcheck='false' on the field to enable/disable browser spellchecks

dhensby commented 11 years ago

The alternative, of course, is to include a server-side spellcheck as part of the framework - hehe

dhensby commented 11 years ago

Also, I imagine those edge cases exist with any spell checking service

hafriedlander commented 11 years ago

We need some sort of spellchecking - from a quick Google around it looks like IE didn't have built in checking until IE10. I'm happy with replacing the Google API with a local system like PSpell though - even if it worked, sending content to Google is a potential privacy issue.

mateusz commented 11 years ago

+1 for PSpell, it also allows you to hook up custom dictionaries with extra wordlists. I think we'd need to document that pspell needs to be installed on the server for this to work.

chillu commented 11 years ago

OK, so browser spellcheck sounds like a good baseline solution enabled by default then? Dan, maybe you have time to do dome testing with browser_spellcheck? Ideally we'd only show the icon if the browser supports it via feature detection. Maybe trawl through the plugin code first? https://github.com/tinymce/tinymce/tree/master/js/tinymce/plugins/spellchecker/classes Bonus points for figuring out if the browser language matches the content language, although that seems to be done by browser heuristics. Remember that we could have the CMS UI in english, but the content language in Maori/Te Reo or something.

Then, we can test and document PSpell setup, at least pointing to the right SS config setting and an external PSpell setup guide.

dhensby commented 11 years ago

Hmmm, looking through their code is gooooooood fun!

Well, I looked through the TinyMCE source that's currently in the 3.1 branch of framework and there's not that many references to the browser_spellcheck. It seems to set a property on the document body (not sure if this is of the parent page or the body of the TinyMCE Editor... I assume the latter.

I don't have masses of time to check spellchecking, but we do have a multi-lingual site under construction now, so I'll mention it to the boys.

Oh, and I found this about feature detection http://blog.whatwg.org/the-road-to-html-5-spellchecking#detection

icecaster commented 11 years ago

you can switch to use pspell by simply apt-get install php5-pspell and changing the config in framework/thirdparty/tinymce-spellchecker/config.php: (comment googlespell, enable pspell)

//$config['general.engine'] = 'GoogleSpell'; $config['general.engine'] = 'PSpell';

Spellchecker works as expected then... Tested on ubuntu desktop & server.

Shall I send a pull request?

Cheers

chillu commented 11 years ago

Hey @icecaster, can you check that it uses the correct language for pspell (configured through i18n::$default_locale)? And that it complains somehow if this locale can't be found in the installed pspell? I don't think a pull request to change this config makes much sense, its more a matter of disabling the spellchecker, relying on the browser if available, and documenting how to enable the spellchecker with the $config line you mention.

colymba commented 11 years ago

Doing some research on this subject I came across this thread: http://www.tinymce.com/forum/viewtopic.php?id=30779

Apparently there is a new Google API.. also undocumented.... and they seem to have released a .Net version of the plugin. Looking at the source of the new plugin I did some test with an updated GoogleSpell class: https://gist.github.com/colymba/6130162 . This actually works and spelling seems to be checked properly, returning suggestions to TinyMCE is still buggy though...

'browser_spellcheck' => true work fine in FF and Chrome but giving me issues with IE10 (probably just works on input tags, so no good? http://ie.microsoft.com/testdrive/Browser/SpellChecking/ ). There is also a 'gecko_spellcheck' => true that seems to do the same....

Am personally in favour of using the browser spell check even if it is not very cross browser. But at least avoid depending on changing API.

chillu commented 11 years ago

Submitted a solution in https://github.com/silverstripe/silverstripe-framework/pull/2288.

It looks like Chrome auto-detects the language you're writing in, as long as the dictionary is installed. So I can write an English paragraph followed by a German paragraph in the same text area and it'll magically work. Firefox needs explicit switching (docs). Neither of these browsers seem to take hints from the "lang=" attribute, so no action required by us. We add <body lang> already, but that's based on the users profile setting, which might not correlate with the actually edited language (Te Reo being a good example).

I agree with @colymba that even if there's a new way of accessing Google's internal APIs, its nothing we should continue relying on as a default setup. Devs are free to decide that for themselves if they want to use Google. Once the new hacks make their way into TinyMCE (and subsequently our inclusion), they can simply enable it through the config.php setting.