Status: closed by mehrvarz 4 months ago
@mehrvarz can you post a reproducible example?
Are these headings inside links? If so, here is the reason:
A heading inside a link does not work. The # heading is a block element, while the [link](href) is an inline element, and it is invalid to have block elements inside inline elements. https://html-to-markdown.com/docs/heading-in-link
The best alternative is rendering the heading as bold text instead (see source).
Thank you for your response. There are two anchor elements right in front of the <h3> element, but they are closed. Can they still influence the <h3> element in such a way? (I have added some newlines to this third-party HTML for clarity.)
<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<link href="content.78.css" rel="stylesheet" type="text/css"/>
<title>Some title</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body style="background-color: #ffffff;">
<div>
<a id="d15e12847"/>
<a id="navpoint.d15e11188"/>
<h3 class="p_l-h3">Header text</h3>
Edit: This markup comes in a file that has .xhtml in its file name.
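For what it is worth, the two parsing models disagree about this markup. Under XML rules, `<a id="..."/>` is a complete, empty element, so the `<h3>` is a sibling of the anchors. An HTML parser ignores the trailing slash on non-void elements such as `<a>`, so the anchor stays open and the heading ends up inside it. The sketch below only demonstrates the XML side, using Go's standard `encoding/xml`. The helper name `parentOf` is made up for illustration, and the input is a shortened, well-formed version of the snippet above.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// parentOf returns the local name of the parent of the first element named
// target, as seen by an XML parser. Hypothetical helper for illustration.
func parentOf(doc, target string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var stack []string // local names of the currently open elements
	for {
		tok, err := dec.Token()
		if err != nil {
			return "", err // includes io.EOF when target is never found
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == target {
				if len(stack) == 0 {
					return "", nil // target is the root element
				}
				return stack[len(stack)-1], nil
			}
			stack = append(stack, t.Name.Local)
		case xml.EndElement:
			// The decoder expands "<a/>" into a start and an end token,
			// so the anchor is popped before the <h3> is seen.
			stack = stack[:len(stack)-1]
		}
	}
}

func main() {
	doc := `<div><a id="d15e12847"/><a id="navpoint.d15e11188"/><h3 class="p_l-h3">Header text</h3></div>`
	parent, err := parentOf(doc, "h3")
	fmt.Println(parent, err) // prints: div <nil>
}
```

So an XML parser reports `div` as the parent of the heading, which matches the intuition that the anchors are closed. The bold output suggests the document is being read with HTML rules instead.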
Holla! I can fix the issue by dynamically converting all <a id="..."/> to <a id="..."></a> in a preprocessing step. It feels a little expensive, but it may be cheaper than a full Parse and Render. Need to think about it...
Ideally, your code would act differently based on xmlns. Right?
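A minimal sketch of what such a check could look like on the caller's side, assuming the input is well-formed enough for Go's `encoding/xml` to reach the root element. The function name `isXHTML` is hypothetical and not part of any library. It only looks at the namespace of the first element.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

const xhtmlNS = "http://www.w3.org/1999/xhtml"

// isXHTML reports whether the root element of doc declares the XHTML
// namespace. Hypothetical helper; not part of any library.
func isXHTML(doc string) bool {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := dec.Token()
		if err != nil {
			return false // not parseable as XML, or no element found
		}
		// Skip the XML declaration, comments and whitespace; decide on
		// the first start tag, whose namespace the decoder has resolved.
		if start, ok := tok.(xml.StartElement); ok {
			return start.Name.Space == xhtmlNS
		}
	}
}

func main() {
	fmt.Println(isXHTML(`<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>`)) // true
	fmt.Println(isXHTML(`<html><body></body></html>`))                                // false
}
```

A caller could run the anchor rewrite only when this returns true, so plain HTML input is left untouched.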
For now, I can live with this. Thank you very much!
// Convert self-closing "<a .../>" to "<a ...></a>".
// Assumes no attribute value contains a literal ">".
idxAll := 0
idxAnchor := strings.Index(htmlStr, "<a ")
for idxAnchor >= 0 {
	idxAll += idxAnchor // idxAll now points at the "<" of this anchor
	idxCloseAnchor := strings.Index(htmlStr[idxAll+3:], ">")
	if idxCloseAnchor < 0 {
		break // unterminated tag; stop instead of looping forever
	}
	idxAll += 3 + idxCloseAnchor // idxAll now points at the closing ">"
	if htmlStr[idxAll-1] == '/' {
		// Replace "/>" with "></a>".
		htmlStr = htmlStr[:idxAll-1] + "></a>" + htmlStr[idxAll+1:]
		idxAll += 3 // the string grew by 3 bytes; move to the end of "</a>"
	}
	idxAnchor = strings.Index(htmlStr[idxAll:], "<a ")
}
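The same preprocessing step could also be written with a regular expression, which avoids the manual index arithmetic. This is only a sketch under a few assumptions: the tag name is lowercase, attribute values contain no `<` or `>`, and the function name `expandSelfClosingAnchors` is made up here.

```go
package main

import (
	"fmt"
	"regexp"
)

// selfClosingAnchor matches "<a .../>" whose attributes contain no "<" or ">".
// The attributes are captured so they can be copied into the replacement.
var selfClosingAnchor = regexp.MustCompile(`<a(\s[^<>]*?)\s*/>`)

// expandSelfClosingAnchors rewrites every "<a .../>" as "<a ...></a>".
// Anchors that already have a separate closing tag are left unchanged.
func expandSelfClosingAnchors(htmlStr string) string {
	return selfClosingAnchor.ReplaceAllString(htmlStr, "<a${1}></a>")
}

func main() {
	in := `<a id="d15e12847"/><a id="navpoint.d15e11188"/><h3 class="p_l-h3">Header text</h3>`
	fmt.Println(expandSelfClosingAnchors(in))
}
```

Compiling the pattern once at package level keeps the per-document cost to a single pass over the string.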
HTML input
<h3>Heading</h3>
should generate ### Heading.
And it usually does. But sometimes I see
**Heading**
being generated instead. What could be causing this?