nleroy917 / textractor

A simple text extractor for various file types. Includes core functionality for extracting text from files, a command-line interface, a RESTful API, and Python bindings.

Add HTML parsing #2

Open nleroy917 opened 2 months ago

nleroy917 commented 2 months ago

This would be useful for extracting raw (and disorganized) text from web pages.
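
To make the request concrete, here is a rough sketch of what "raw text from a web page" could mean, using only Python's standard-library `html.parser`. This is not textractor's API: the names `TextOnlyParser` and `extract_text_from_html` are hypothetical, and skipping `<script>` and `<style>` contents is an assumption about what callers would want.

```python
from html.parser import HTMLParser


class TextOnlyParser(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script>/<style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []      # text fragments in document order
        self._skip_depth = 0   # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # Join fragments with single spaces; layout is intentionally discarded.
        return " ".join(self._chunks)


def extract_text_from_html(html: str) -> str:
    """Return the unstructured text content of an HTML string."""
    parser = TextOnlyParser()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()


if __name__ == "__main__":
    page = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    print(extract_text_from_html(page))
```

The output is deliberately flat: headings, paragraphs, and inline elements all collapse into one space-separated string, which matches the "raw (and disorganized)" goal rather than trying to preserve page structure.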