shama / grunt-spell

:boar::abc: A Grunt plugin for spellchecking
MIT License

Exclude code example blocks #7

Open pavelbinar opened 10 years ago

pavelbinar commented 10 years ago

I am checking a bunch of Markdown files and I would like to exclude all of the code blocks:

```js
var myVar = require('plugin');
```

Otherwise I get suggestions for almost every word: var, myVar, ...

Any idea?

Great plugin, thanks!

shama commented 10 years ago

At the moment there isn't a way, other than using another task to first parse the Markdown and extract the parts you want to spell check. Marking as an enhancement, as I agree it would be a neat feature.
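
For anyone wanting to try the "another task first" route, here is a minimal sketch of the extraction step. It is not part of grunt-spell: `stripMarkdownCode` is a hypothetical helper name, and it only handles fenced blocks and single-backtick inline code (indented code blocks and HTML are left alone).

```js
// Sketch only: stripMarkdownCode is a hypothetical helper, not a grunt-spell API.
// It drops fenced code blocks and blanks out inline code spans so that only
// prose is handed to the spell checker.
function stripMarkdownCode(markdown) {
  const kept = [];
  let fence = null; // marker that opened the current fenced block, or null

  for (const line of markdown.split(/\r?\n/)) {
    // A fence is 3+ backticks or tildes, indented by at most 3 spaces.
    const match = /^\s{0,3}(`{3,}|~{3,})/.exec(line);

    if (fence === null) {
      if (match) {
        fence = match[1]; // entering a fenced block: skip the fence line
        continue;
      }
      // Outside a block: keep the line, replacing inline code with a space.
      kept.push(line.replace(/`[^`]*`/g, ' '));
    } else if (
      match &&
      match[1][0] === fence[0] &&          // same fence character
      match[1].length >= fence.length &&   // at least as long as the opener
      line.trim() === match[1]             // nothing else on the closing line
    ) {
      fence = null; // leaving the fenced block: skip the fence line
    }
    // Any other line inside a fenced block is dropped.
  }
  return kept.join('\n');
}

// Example input, built without literal fences so it can sit inside this block.
const fenceMarker = '`'.repeat(3);
const sample = [
  'Some prose to check.',
  fenceMarker + 'js',
  "var myVar = require('plugin');",
  fenceMarker,
  'More prose with `inlineCode` in it.',
].join('\n');

console.log(stripMarkdownCode(sample));
```

In a Gruntfile this could be wrapped in a small custom task that reads each Markdown file with `grunt.file.read`, writes the stripped text to a temporary directory with `grunt.file.write`, and then points the spell task at that directory.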

pavelbinar commented 10 years ago

All right. What is the best input format for the spellcheck? I tried to spellcheck an .html file but got this error:

(! 602)-> grunt spell:all
Running "spell:all" (spell) task
>> Checking tmp/all-docs.html...
>> Error: Unexpected close tag Line: 5 Column: 7 Char: >

Done, without errors.

shama commented 10 years ago

Plain text I suppose. It doesn't know how to read any specific file format; it can only spell check text. So any html, markdown, textile, javascript, css, etc file you throw at it, it will attempt to spell check everything within the file, including the file format's syntax.

pavelbinar commented 10 years ago

Understood, thank you for the explanation. A "nice to have": the plugin could recognize the file type and skip that format's own syntax, like ### and so on.

shama commented 10 years ago

I want to leave this issue open as a reminder. I think it would be nice to have.

But rather than having users add new syntax, it would be cool to make a library that parses out the syntax of existing formats. Then we could just use that lib here.

pavelbinar commented 10 years ago

:+1:

huntie commented 10 years ago

+1

For the time being it would be nice to be able to just add a list of strings to ignore.
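
Until something like that exists, the same effect can be approximated in a preprocessing step. This is only a sketch under my own assumptions: `removeIgnored` is a hypothetical helper, not a grunt-spell option, and it matches whole identifier-like words case-sensitively.

```js
// Sketch only: removeIgnored is a hypothetical helper, not a grunt-spell option.
// It blanks out whole words that appear in the ignore list before the text
// is passed on to the spell checker.
function removeIgnored(text, ignoreList) {
  const ignored = new Set(ignoreList);
  // Match identifier-like words so that "myVar" is treated as one token.
  return text.replace(/[A-Za-z_$][\w$]*/g, (word) =>
    ignored.has(word) ? ' ' : word
  );
}

console.log(removeIgnored('Assign myVar with var first.', ['var', 'myVar']));
```

Matching is exact, so `Var` or `myvar` would still be reported; lowercasing both sides would make it case-insensitive if that is preferred.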

shama commented 10 years ago

Related to #8. It should actually parse HTML files but is failing because of the bug shown in #8.

optikalefx commented 10 years ago

Plus one here. It really sucks that Sublime can check this, but my build process can't. I have 300 view files and checking them manually is not why we became programmers.

shama commented 10 years ago

Patches welcome ;)

rkoberg commented 9 years ago

You can run an XSL transformation on the markup to extract just the text. If your markup is not well-formed, you can use a parser like TagSoup, which reads messy HTML and hands it to the transform as well-formed XML:

<xsl:stylesheet
  version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:html="http://www.w3.org/1999/xhtml"
  xpath-default-namespace="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="html">

  <!--
  Command line transform:

  $ export CLASSPATH=$CLASSPATH:saxon9he.jar:tagsoup-1.2.1.jar
  $ java net.sf.saxon.Transform -o:build/text-only.txt -s:local.xml -xsl:src/text-only.xsl -x:org.ccil.cowan.tagsoup.Parser

  Requires both saxon and tagsoup in the classpath
  -->

  <xsl:strip-space elements="*"/>
  <xsl:output indent="no" method="text"/>

  <xsl:template match="/conf">
    <xsl:apply-templates select="collection('../app/ela?select=*.html&amp;recurse=yes')/*"/>
  </xsl:template>

  <xsl:template match="/html">
    <xsl:variable name="doc-uri" select="document-uri(/)"/>
    <xsl:variable name="out-path" select="concat('text-only', substring-after($doc-uri, 'app/ela'))"/>
    <xsl:value-of select="$out-path"/>
    <xsl:text>
</xsl:text>
    <xsl:result-document href="{$out-path}.txt">
      <xsl:apply-templates/>
    </xsl:result-document>
  </xsl:template>

  <xsl:template match="script" priority="10"/>

  <xsl:template match="text()" priority="10">
    <xsl:value-of select="."/>
    <xsl:text> </xsl:text>
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

</xsl:stylesheet>