redpen-cc / redpen

RedPen is an open source proofreading tool to check if your technical documents meet the writing standard. RedPen supports various markup text formats (Markdown, Textile, AsciiDoc, Re:VIEW, reStructuredText and LaTeX).
https://redpen.cc
Apache License 2.0

Support summarizing input documents in the redpen command #773

Closed takahi-i closed 7 years ago

takahi-i commented 7 years ago

I would like the redpen command to support summarizing the statistics of input documents.

Option

Users specify the -s option to get the statistics of the input documents.

$ redpen -s -f markdown input.md
{
    "number_of_sentence": 103,
    "mean_of_sentence_length": 89.9,
    "number_of_characters": 8999,
    "longest_sentence": {
        "sentence": "this is a long long long .... sentence",
        "position": [8, 7],
        "file": "input.md"
    }
}
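
As a rough illustration of how such a summary could be computed, here is a minimal Python sketch (RedPen itself is written in Java, so this is not the project's code). The function name `summarize`, the naive sentence splitter, the spelling `sentence` for the key, and the reading of `position` as [line, offset] are all assumptions made for the example.

```python
import json
import re


def summarize(text, file_name):
    """Build a stats dictionary keyed like the proposed -s output."""
    sentences = []
    # Naive splitter (assumption): a sentence ends at '.', '!' or '?'.
    # Trailing text without a terminator is ignored.
    for match in re.finditer(r"[^.!?]+[.!?]", text):
        raw = match.group()
        # Skip leading whitespace so the position points at the first word.
        start = match.start() + len(raw) - len(raw.lstrip())
        line = text.count("\n", 0, start) + 1              # 1-based line number
        column = start - (text.rfind("\n", 0, start) + 1)  # 0-based offset in line
        sentences.append((raw.strip(), line, column))

    lengths = [len(s) for s, _, _ in sentences]
    stats = {
        "number_of_sentence": len(sentences),
        "mean_of_sentence_length": round(sum(lengths) / len(lengths), 1) if lengths else 0.0,
        # Assumption: characters are counted inside sentences only.
        "number_of_characters": sum(lengths),
        "longest_sentence": None,
    }
    if sentences:
        longest, line, column = max(sentences, key=lambda item: len(item[0]))
        stats["longest_sentence"] = {
            "sentence": longest,
            "position": [line, column],  # assumed to mean [line, offset]
            "file": file_name,
        }
    return stats


if __name__ == "__main__":
    sample = "Short one. This is a much longer sentence.\nThird!"
    print(json.dumps(summarize(sample, "input.md"), indent=4))
```

Serializing with `json.dumps` guarantees that the output is valid JSON, including quoted file names.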

Configuration

<redpen-conf lang="en">
    <validators>
        <validator name="LongestSentence"/>
        <validator name="ShortestSentence"/>
        <validator name="MeanSentenceLength"/>
        <validator name="SectionLength"/>
    </validators>
    <symbols>
         <symbol name="EXCLAMATION_MARK" value="!" invalid-chars="!" after-space="true" />
         <symbol name="LEFT_QUOTATION_MARK" value="\'"  invalid-chars="“" before-space="true" />
    </symbols>
</redpen-conf>
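
A quick way to check that a configuration like this is well-formed is to parse it and list the entries in the validators block. The following Python sketch does that with the standard library; the helper name `validator_names` and the trimmed-down `CONFIG` string are made up for the example and are not part of RedPen.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed configuration (validators only).
CONFIG = """<redpen-conf lang="en">
    <validators>
        <validator name="LongestSentence"/>
        <validator name="ShortestSentence"/>
        <validator name="MeanSentenceLength"/>
        <validator name="SectionLength"/>
    </validators>
</redpen-conf>"""


def validator_names(config_xml):
    """Return the name attribute of every <validator> in the config."""
    # ET.fromstring raises ParseError if the XML is not well-formed,
    # for example when a tag is written as "/ >" instead of "/>".
    root = ET.fromstring(config_xml)
    return [v.get("name") for v in root.findall("./validators/validator")]


if __name__ == "__main__":
    print(validator_names(CONFIG))
```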

Output format

FILE_NAME:LINE_NUM: Extraction[EXTRACTION_TYPE], MESSAGE at line: SENTENCE
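
For illustration, a line in this format could be assembled as in the Python sketch below. The function name `format_extraction` and the sample message text are hypothetical; only the layout of the line comes from the format above.

```python
def format_extraction(file_name, line_num, extraction_type, message, sentence):
    """Render one result as FILE_NAME:LINE_NUM: Extraction[TYPE], MESSAGE at line: SENTENCE."""
    return (
        f"{file_name}:{line_num}: Extraction[{extraction_type}], "
        f"{message} at line: {sentence}"
    )


if __name__ == "__main__":
    print(format_extraction(
        "input.md", 8, "LongestSentence",
        "longest sentence in the document",  # hypothetical message text
        "this is a long sentence",
    ))
```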

Implementation

Define an abstract Extractor class and implement each extractor as a subclass of it. The RedPen class provides an extract(Document) method that runs all the registered extractors.
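
The shape of this design can be sketched in a few lines. The version below is in Python for brevity (the real implementation would be Java) and treats a document as a plain list of sentence strings, which is a simplification. Apart from `Extractor`, `RedPen` and `extract`, every name and return type here is an assumption.

```python
from abc import ABC, abstractmethod


class Extractor(ABC):
    """Base class: each subclass computes one statistic from a document."""

    @abstractmethod
    def extract(self, sentences):
        """Return a (name, value) pair for the given list of sentences."""


class LongestSentence(Extractor):
    def extract(self, sentences):
        # default="" keeps max() from raising on an empty document.
        return "LongestSentence", max(sentences, key=len, default="")


class MeanSentenceLength(Extractor):
    def extract(self, sentences):
        if not sentences:
            return "MeanSentenceLength", 0.0
        return "MeanSentenceLength", sum(map(len, sentences)) / len(sentences)


class RedPen:
    """Stand-in for the RedPen class: holds extractors and runs them all."""

    def __init__(self, extractors):
        self.extractors = list(extractors)

    def extract(self, sentences):
        # Run every registered extractor and collect results by name.
        return dict(e.extract(sentences) for e in self.extractors)


if __name__ == "__main__":
    redpen = RedPen([LongestSentence(), MeanSentenceLength()])
    print(redpen.extract(["Short one.", "A somewhat longer sentence."]))
```

Keeping each statistic in its own subclass means a new one can be added without touching the code that runs them.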

takahi-i commented 7 years ago

I will take this.

takahi-i commented 7 years ago

Merged with #504.