erdomke / RtfPipe

Library for processing rich-text format (RTF) streams
MIT License

RtfPipe.Converter.Text.RtfTextConverter #12

Closed 4g0st1n0 closed 6 years ago

4g0st1n0 commented 6 years ago

I'm using RtfPipe.Converter.Text.RtfTextConverter to convert RTF to plain text.

With the latest release, the converters have been removed. How can I convert RTF to plain text?

Thank you.

erdomke commented 6 years ago

The Rtf.ToHtml method has an overload where you can pass in an XmlWriter to which the HTML is written. This leaves you with a couple of options:

  1. A separate open source project of mine, BracketPipe, contains XmlWriter classes designed to convert HTML to Markdown, Textile, and "plain text". (The Markdown writer should be more complete, while the Textile writer has a few remaining issues.) Your code might therefore look something like:
     ```csharp
     using (var writer = new System.IO.StringWriter())
     using (var markdown = new BracketPipe.MarkdownWriter(writer))
     {
       Rtf.ToHtml(source, markdown);
       markdown.Flush();
       return writer.ToString();
     }
     ```
  2. Create your own subclass of XmlWriter and use code similar to the above. In general, any calls to WriteString that do not fall between calls to WriteStartAttribute and WriteEndAttribute should contain the plain-text content of the HTML (and therefore of the RTF). If you need additional formatting (e.g. separating paragraphs with newlines), you can override WriteStartElement and WriteEndElement to detect <br> and <p> tags. Again, the PlainTextWriter code in BracketPipe gives you a decent (if perhaps slightly complicated) example of what this might look like.
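
As a rough illustration of the second option, here is a minimal sketch of such an XmlWriter subclass. The class name `SimpleTextWriter` is hypothetical and is not part of RtfPipe or BracketPipe. It assumes the HTML arrives through the standard `XmlWriter` calls, treats `<br>` and the end of `<p>` as line breaks, and ignores raw markup and entity references. It does not filter out text inside elements such as `<style>` or `<title>`, so a real implementation would probably need more care.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical example class: forwards element text to a TextWriter
// and drops attribute values and markup.
public sealed class SimpleTextWriter : XmlWriter
{
  private readonly TextWriter _writer;
  private readonly Stack<string> _elements = new Stack<string>();
  private bool _inAttribute;

  public SimpleTextWriter(TextWriter writer)
  {
    _writer = writer ?? throw new ArgumentNullException(nameof(writer));
  }

  // Always report "Content"; this sketch does not track document state.
  public override WriteState WriteState => WriteState.Content;

  public override void WriteStartElement(string prefix, string localName, string ns)
  {
    _elements.Push(localName);
    // A <br> becomes a line break as soon as it is opened.
    if (IsTag(localName, "br")) _writer.Write('\n');
  }

  public override void WriteEndElement()
  {
    if (_elements.Count == 0) return;
    // Closing a <p> ends the paragraph with a line break.
    if (IsTag(_elements.Pop(), "p")) _writer.Write('\n');
  }

  public override void WriteFullEndElement() => WriteEndElement();

  // Anything written between these two calls is an attribute value.
  public override void WriteStartAttribute(string prefix, string localName, string ns) => _inAttribute = true;
  public override void WriteEndAttribute() => _inAttribute = false;

  public override void WriteString(string text)
  {
    if (!_inAttribute) _writer.Write(text);
  }

  // Other ways character data can arrive; route them through WriteString.
  public override void WriteChars(char[] buffer, int index, int count) => WriteString(new string(buffer, index, count));
  public override void WriteWhitespace(string ws) => WriteString(ws);
  public override void WriteCData(string text) => WriteString(text);
  public override void WriteCharEntity(char ch) => WriteString(ch.ToString());
  public override void WriteSurrogateCharEntity(char lowChar, char highChar) => WriteString(new string(new[] { highChar, lowChar }));

  // Everything below is markup or metadata that this sketch discards.
  public override void WriteEntityRef(string name) { }
  public override void WriteRaw(char[] buffer, int index, int count) { }
  public override void WriteRaw(string data) { }
  public override void WriteBase64(byte[] buffer, int index, int count) { }
  public override void WriteComment(string text) { }
  public override void WriteProcessingInstruction(string name, string text) { }
  public override void WriteDocType(string name, string pubid, string sysid, string subset) { }
  public override void WriteStartDocument() { }
  public override void WriteStartDocument(bool standalone) { }
  public override void WriteEndDocument() { }

  public override string LookupPrefix(string ns) => null;
  public override void Flush() => _writer.Flush();
  public override void Close() { }

  private static bool IsTag(string actual, string expected) =>
    string.Equals(actual, expected, StringComparison.OrdinalIgnoreCase);
}
```

An instance of this class could be passed to Rtf.ToHtml in place of the MarkdownWriter in the snippet from the first option, wrapping the same StringWriter.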