GroupDocs.Parser for .NET is a document parser API that searches and extracts both raw and formatted text from documents in more than 50 supported file formats.
Directory | Description |
---|---|
Demos | Source code for live demos hosted at https://products.groupdocs.app/parser/family. |
Examples | C# examples and sample files that will help you learn how to use product features. |
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF, TXT\ Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, CSV, XLA, XLAM, NUMBERS\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP\ Portable: PDF
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF, TXT\ Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, CSV, XLA, XLAM, NUMBERS\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP\ Email: EML, EMLX, MSG\ Markup: HTML, XHTML, MHTML, MD, XML\ eBooks: CHM, EPUB, FB2\ Portable: PDF\ Notes: ONE\ Databases: supported via ADO.NET. To work with a given database, install its data provider.
Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, XLA, XLAM\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM\ Portable: PDF
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF\ Spreadsheet: XLS, XLT, XLSX, XLSM, XLTX, XLTM, XLA, XLAM\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP\ Email: EML, EMLX, MSG\ Markup: MD (formatted text is not supported)\ eBooks: CHM, EPUB, FB2
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF\ Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, XLA, XLAM, NUMBERS\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP\ Portable: PDF
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF\ Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, XLA, XLAM\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP\ Email: EML, EMLX, MSG\ eBooks: EPUB, FB2\ Portable: PDF
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF\ Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, XLA, XLAM, NUMBERS\ Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP\ Email: EML, EMLX, MSG\ Portable: PDF\ Archive: ZIP
Email: PST, OST, EML, EMLX, MSG\ Portable: PDF\ Archive: ZIP
Portable: PDF
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF\ eBooks: CHM, EPUB\ Portable: PDF\ Databases: supported via ADO.NET. To work with a given database, install its data provider.
Microsoft Windows: Microsoft Windows Desktop & Server (x86, x64), Windows Azure\ macOS: Mac OS X\ Linux: Ubuntu, openSUSE, CentOS, and others\ Development Environments: Microsoft Visual Studio, Xamarin.Android, Xamarin.iOS, Xamarin.Mac, MonoDevelop\ Supported Frameworks: .NET Standard 2.0, .NET Framework 2.0 or higher, .NET Core 2.0 or higher, Mono Framework 1.2 or higher
Ready to give GroupDocs.Parser for .NET a try? Run `Install-Package GroupDocs.Parser` from the Package Manager Console in Visual Studio to fetch and reference the GroupDocs.Parser assembly in your project. If you already have GroupDocs.Parser for .NET and want to upgrade, run `Update-Package GroupDocs.Parser` to get the latest version.
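Once the package is referenced, a first extraction might look like the sketch below. It is a minimal illustration, not an official sample: the `sample.docx` path and the `Program` class are placeholders assumed here, and the null check relies on `GetText()` returning `null` when the format does not support text extraction.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;

class Program
{
    static void Main()
    {
        // "sample.docx" is a placeholder path used only for illustration
        using (Parser parser = new Parser("sample.docx"))
        // GetText() is expected to return null when text extraction isn't supported
        using (TextReader reader = parser.GetText())
        {
            Console.WriteLine(reader == null
                ? "Text extraction isn't supported."
                : reader.ReadToEnd());
        }
    }
}
```

Checking for `null` before reading keeps the sketch safe across the formats listed above, since not every format supports every feature.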

    // Requires: using System; using System.Collections.Generic; using System.IO;
    //           using GroupDocs.Parser; using GroupDocs.Parser.Data; using GroupDocs.Parser.Options;

    // Build the connection string for a SQLite database file
    string connectionString = string.Format("Provider=System.Data.Sqlite;Data Source={0};Version=3;", "database.db");
    // Create a Parser instance to extract tables from the database.
    // The connection string is passed in place of a file path; LoadOptions selects the Database file format.
    using (Parser parser = new Parser(connectionString, new LoadOptions(FileFormat.Database)))
    {
        // Check that text extraction is supported
        if (!parser.Features.Text)
        {
            Console.WriteLine("Text extraction isn't supported.");
            return;
        }
        // Check that table-of-contents (TOC) extraction is supported
        if (!parser.Features.Toc)
        {
            Console.WriteLine("Toc extraction isn't supported.");
            return;
        }
        // Get the list of tables; each table is exposed as a TOC item
        IEnumerable<TocItem> toc = parser.GetToc();
        // Iterate over the tables
        foreach (TocItem i in toc)
        {
            // Print the table name
            Console.WriteLine(i.Text);
            // Extract the table content as text
            using (TextReader reader = parser.GetText(i.PageIndex.Value))
            {
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }

    // Requires: using System; using System.Collections.Generic;
    //           using GroupDocs.Parser; using GroupDocs.Parser.Data; using GroupDocs.Parser.Options;

    // Create a Parser instance; Constants.SampleZip is a sample-file path constant from the examples
    using (Parser parser = new Parser(Constants.SampleZip))
    {
        // Extract images from the document
        IEnumerable<PageImageArea> images = parser.GetImages();
        // A null result means image extraction isn't supported for this format
        if (images == null)
        {
            Console.WriteLine("Image extraction isn't supported.");
            return;
        }
        // Create the options used to save images in PNG format
        ImageOptions options = new ImageOptions(ImageFormat.Png);
        int imageNumber = 0;
        // Iterate over the extracted images
        foreach (PageImageArea image in images)
        {
            // Save each image as a sequentially numbered PNG file (0.png, 1.png, ...)
            image.Save(imageNumber.ToString() + ".png", options);
            imageNumber++;
        }
    }