html-parse Search Results

StarlightSearch/EmbedAnything #92

Split HTML parsing from website parsing

It seems like this library is capable of parsing HTML, but right now it'll only allow you to do so through a URL. It'd be cool if HTML parsing was something we could do separately, instead of requi…

boswelja updated 5 hours ago

golang/go #70275

proposal: x/net/html: parser hooks

### Proposal Details The following is not something I _need_ right now, but my use case made me think that there is a valid reason for the code calling `html.Parse` to be able to influence the pars…

stroiman updated 3 days ago

unclecode/crawl4ai #253

Support for Direct HTML Parsing in crawl4ai

I have a specific use case where Cloudflare blocks often prevent successful crawling, making it challenging to bypass with `crawl4ai`. To handle this, we tried using [flare-bypasser](https://github.co…

crelocks updated 1 day ago

robotframework/robotframework #5256

Add `Parse HTML` keyword to XML Library

I would like to suggest adding a `Parse HTML` keyword to XML Library. **Why:** - I need to test the HTML output from an application that outputs an HTML file to the local file system - I f…

damies13 updated 3 days ago

docwire/docwire #156

Table caption HTML tag stops parsing (Exception thrown conve…

The following HTML file causes an exception to be thrown in 2024.10.15 plain_text_writer.cpp, line 400: throw_if (table.empty(), "Cell content inside table without rows"); [Dataset Overview…

efieleke-tausight updated 3 weeks ago

Unstructured-IO/unstructured #3697

bug/certain htmls cannot be parsed

**Describe the bug** Certain HTML files scraped from GCP docs, such as those at the following URLs, return empty elements or elements containing only newline characters when using `partition_html`. **To Reproduce…

AraiYuno updated 2 weeks ago

EmranMR/tree-sitter-blade #78

Blade comment overridden by HTML parser

I followed every guide to set up injections and highlights in the discussion at #19, but when I comment with the shortcut in Neovim, it inserts an HTML comment rather than a Blade one. ![Screenshot from 2024…

ahmadmuqri0 updated 2 weeks ago

agiacalone/jargonfile #3

HTML to text parser

Create a source HTML-to-text parser so we can easily create text versions of the File after every update. Very useful for Gopher, Gemini, and terminal reading.

agiacalone updated 3 days ago

my8100/scrapydweb #247

HTML Parsing Failure on Jobs Page when Using Python 3.11.9

Description: When running Scrapyd with Python 3.11.9, the format of the HTML returned on the Jobs page appears to be standardized in a way that prevents successful parsing of job data. This issue doe…

ybbzbb updated 1 hour ago

lutaml/lutaml-model #154

Parsing HTML entities in XML using Nokogiri as adapter

The **Nokogiri** gem doesn’t handle **HTML** entities other than `&amp;`, `&lt;`, `&gt;`, `&quot;`, and `&apos;`; the rest of the entities are ignored/replaced, but they are valid input in **MathML**. Issue faced while MathML…

suleman-uzair updated 1 week ago

1000+ results for html-parse
