elementor / wp2static

WordPress static site generator for security, performance and cost benefits
https://wp2static.com
The Unlicense

Messed up DOM after exporting #264

Closed: azdanov closed this issue 5 years ago

azdanov commented 5 years ago

After exporting, the blockquote HTML tag gets messed up. I am using the Netlify export.

Here's the live website: https://wordpress-first.netlify.com/markup-html-tags-and-formatting/

Here are some screenshots:

Local:

screenshot 2019-01-24 at 10 00 56

Exported:

screenshot 2019-01-24 at 10 01 34
azdanov commented 5 years ago

Just a guess, but maybe the HTML parser thinks that the text `The HTML <blockquote> Element...` contains real tags.
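The guess above can be illustrated with a minimal sketch. It uses Python's standard-library `html.parser` purely as a stand-in (the plugin's own parser is not shown in this thread), and the `TagCollector` class and `collect` helper are hypothetical names made up for the illustration. It compares how a parser tokenizes a literal `<blockquote>` in running text versus the entity-encoded form.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag and every piece of text the parser reports."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def collect(markup):
    """Parse markup and return (start tags seen, concatenated text)."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags, "".join(parser.text)


# Literal angle brackets in prose: the parser sees a real element.
raw_tags, raw_text = collect("<p>The HTML <blockquote> Element</p>")

# Entity-encoded brackets: the parser sees plain text.
escaped_tags, escaped_text = collect("<p>The HTML &lt;blockquote&gt; Element</p>")

print(raw_tags, repr(raw_text))
print(escaped_tags, repr(escaped_text))
```

In the first case the word "blockquote" disappears from the text and shows up as an element instead, which resembles the broken output in the exported page.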

leonstafford commented 5 years ago

Many thanks for finding this, @azdanov!

I have to tidy up some unit tests for the HTML parser today, so I will try to get this fixed at the same time.

leonstafford commented 5 years ago

@azdanov, having started to look into this issue, I think this is more of an HTML validity issue, which DOM parsing will not cope with.

Suggested solution is to use HTML entities like &lt; BLOCKQUOTE &gt; to achieve < BLOCKQUOTE >

Please let me know if that's not working and I will re-open.
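As a rough illustration of the entity suggestion above, the sketch below encodes a literal tag so that it survives as text. It uses Python's `html.escape` as a stand-in for whatever tool produces the post content; `escape_literal_tag` is a hypothetical helper name, not part of the plugin or of WordPress.

```python
import html


def escape_literal_tag(text):
    """Encode &, < and > so the text is displayed rather than parsed as markup."""
    return html.escape(text, quote=False)


literal = "< BLOCKQUOTE >"
encoded = escape_literal_tag(literal)

print(encoded)                 # what goes into the page source
print(html.unescape(encoded))  # what a browser would display
```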

azdanov commented 5 years ago

Thanks for checking this out.

> Suggested solution is to use HTML entities like &lt; BLOCKQUOTE &gt; to achieve < BLOCKQUOTE >

Will do!

But it's still odd. I was using Theme_Unit_Test to generate the content, so other users will probably run into similar issues in the future. Maybe it's worth mentioning in a Troubleshooting section of the readme?

azdanov commented 5 years ago

Okay, still no luck.

screenshot 2019-01-26 at 17 55 57
leonstafford commented 5 years ago

@azdanov - willing to look into it further, but what's the current issue with that one?

I'd need the original source code from the WP site, the published code from the static output, and ideally the 2 screenshots again. I'm not quite understanding the issue, sorry.

azdanov commented 5 years ago

Original - Pastebin
Static - Pastebin

Original

```html
Markup: HTML Tags and Formatting – First

Markup: HTML Tags and Formatting

By

Headings

Header one

Header two

Header three

Header four

Header five
Header six

Blockquotes

Single line blockquote:

Stay hungry. Stay foolish.

Multi line blockquote with a cite reference:

The HTML <blockquote> Element (or HTML Block Quotation Element) indicates that the enclosed text is an extended quotation. Usually, this is rendered visually by indentation (see Notes for how to change it). A URL for the source of the quotation may be given using the cite attribute, while a text representation of the source can be given using the <cite> element.

multiple contributors – MDN HTML element reference – blockquote

Tables

Employee Salary
John Doe $1 Because that’s all Steve Jobs needed for a salary.
Jane Doe $100K For all the blogging she does.
Fred Bloggs $100M Pictures are worth a thousand words, right? So Jane x 1,000.
Jane Bloggs $100B With hair like that?! Enough said…

Definition Lists

Definition List Title
Definition list division.
Startup
A startup company or startup is a company or temporary organization designed to search for a repeatable and scalable business model.
#dowork
Coined by Rob Dyrdek and his personal body guard Christopher “Big Black” Boykins, “Do Work” works as a self motivator, to motivating your friends.
Do It Live
I’ll let Bill O’Reilly will explain this one.

Unordered Lists (Nested)

  • List item one
    • List item one
      • List item one
      • List item two
      • List item three
      • List item four
    • List item two
    • List item three
    • List item four
  • List item two
  • List item three
  • List item four

Ordered List (Nested)

  1. List item one -start at 8
    1. List item one
      1. List item one -reversed attribute
      2. List item two
      3. List item three
      4. List item four
    2. List item two
    3. List item three
    4. List item four
  2. List item two
  3. List item three
  4. List item four

HTML Tags

These supported tags come from the WordPress.com code FAQ.

Address Tag

1 Infinite Loop
Cupertino, CA 95014
United States

Anchor Tag (aka. Link)

This is an example of a link.

Abbreviation Tag

The abbreviation srsly stands for “seriously”.

Acronym Tag (deprecated in HTML5)

The acronym ftw stands for “for the win”.

Big Tag (deprecated in HTML5)

These tests are a big deal, but this tag is no longer supported in HTML5.

Cite Tag

“Code is poetry.” —Automattic

Code Tag

This tag styles blocks of code.
.post-title {
margin: 0 0 5px;
font-weight: bold;
font-size: 38px;
line-height: 1.2;
and here's a line of some really, really, really, really long text, just to see how it is handled and to find out how it overflows;
}

You will learn later on in these tests that word-wrap: break-word;will be your best friend.

Delete Tag

This tag will let you strike out text, but this tag is recommended supported in HTML5 (use the <s> instead).

Emphasize Tag

The emphasize tag should italicize text.

Horizontal Rule Tag


This sentence is following a <hr /> tag.

Insert Tag

This tag should denote inserted text.

Keyboard Tag

This scarcely known tag emulates keyboard text, which is usually styled like the <code> tag.

Preformatted Tag

This tag is for preserving whitespace as typed, such as in poetry or ASCII art.

The Road Not Taken

Robert Frost Two roads diverged in a yellow wood, And sorry I could not travel both (\_/) And be one traveler, long I stood (=’.’=) And looked down one as far as I could (“)_(“) To where it bent in the undergrowth; Then took the other, as just as fair, And having perhaps the better claim, |\_/| Because it was grassy and wanted wear; / @ @ \ Though as for that the passing there ( > º < ) Had worn them really about the same, `>>x<<´ / O \ And both that morning equally lay In leaves no step had trodden black. Oh, I kept the first for another day! Yet knowing how way leads on to way, I doubted if I should ever come back. I shall be telling this with a sigh Somewhere ages and ages hence: Two roads diverged in a wood, and I— I took the one less traveled by, And that has made all the difference. and here’s a line of some really, really, really, really long text, just to see how it is handled and to find out how it overflows;

Quote Tag for short, inline quotes

Developers, developers, developers… –Steve Ballmer

Strike Tag (deprecated in HTML5) and S Tag

This tag shows strike-through text.

Small Tag

This tag shows smaller text.

Strong Tag

This tag shows bold text.

Subscript Tag

Getting our science styling on with H2O, which should push the “2” down.

Superscript Tag

Still sticking with science and Albert Einstein’s E = MC2, which should lift the 2 up.

Teletype Tag (obsolete in HTML5)

This rarely used tag emulates teletype text, which is usually styled like the <code> tag.

Underline Tag deprecated in HTML 4, re-introduced in HTML5 with other semantics

This tag shows underlined text.

Variable Tag

This allows you to denote variables.

```
Static

```html
Markup: HTML Tags and Formatting – First

Markup: HTML Tags and Formatting

By

Headings

Header one

Header two

Header three

Header four

Header five
Header six

Blockquotes

Single line blockquote:

Stay hungry. Stay foolish.

Multi line blockquote with a cite reference:

The HTML

Element (or HTML Block Quotation Element) indicates that the enclosed text is an extended quotation. Usually, this is rendered visually by indentation (see Notes for how to change it). A URL for the source of the quotation may be given using the cite attribute, while a text representation of the source can be given using the element.

multiple contributors – MDN HTML element reference – blockquote

Tables

Employee Salary
John Doe $1 Because that’s all Steve Jobs needed for a salary.
Jane Doe $100K For all the blogging she does.
Fred Bloggs $100M Pictures are worth a thousand words, right? So Jane x 1,000.
Jane Bloggs $100B With hair like that?! Enough said…

Definition Lists

Definition List Title
Definition list division.
Startup
A startup company or startup is a company or temporary organization designed to search for a repeatable and scalable business model.
#dowork
Coined by Rob Dyrdek and his personal body guard Christopher “Big Black” Boykins, “Do Work” works as a self motivator, to motivating your friends.
Do It Live
I’ll let Bill O’Reilly will explain this one.

Unordered Lists (Nested)

  • List item one
    • List item one
      • List item one
      • List item two
      • List item three
      • List item four
    • List item two
    • List item three
    • List item four
  • List item two
  • List item three
  • List item four

Ordered List (Nested)

  1. List item one -start at 8
    1. List item one
      1. List item one -reversed attribute
      2. List item two
      3. List item three
      4. List item four
    2. List item two
    3. List item three
    4. List item four
  2. List item two
  3. List item three
  4. List item four

HTML Tags

These supported tags come from the WordPress.com code FAQ.

Address Tag

1 Infinite Loop
Cupertino, CA 95014
United States

Anchor Tag (aka. Link)

This is an example of a link.

Abbreviation Tag

The abbreviation srsly stands for “seriously”.

Acronym Tag (deprecated in HTML5)

The acronym ftw stands for “for the win”.

Big Tag (deprecated in HTML5)

These tests are a big deal, but this tag is no longer supported in HTML5.

Cite Tag

“Code is poetry.” —Automattic

Code Tag

This tag styles blocks of code.
.post-title {
margin: 0 0 5px;
font-weight: bold;
font-size: 38px;
line-height: 1.2;
and here's a line of some really, really, really, really long text, just to see how it is handled and to find out how it overflows;
}

You will learn later on in these tests that word-wrap: break-word;will be your best friend.

Delete Tag

This tag will let you strike out text, but this tag is recommended supported in HTML5 (use the instead).

Emphasize Tag

The emphasize tag should italicize text.

Horizontal Rule Tag


This sentence is following a


tag.

Insert Tag

This tag should denote inserted text.

Keyboard Tag

This scarcely known tag emulates keyboard text, which is usually styled like the tag.

Preformatted Tag

This tag is for preserving whitespace as typed, such as in poetry or ASCII art.

The Road Not Taken

Robert Frost Two roads diverged in a yellow wood, And sorry I could not travel both (\_/) And be one traveler, long I stood (=’.’=) And looked down one as far as I could (“)_(“) To where it bent in the undergrowth; Then took the other, as just as fair, And having perhaps the better claim, |\_/| Because it was grassy and wanted wear; / @ @ \ Though as for that the passing there ( > º < ) Had worn them really about the same, `>>x<<´ / O \ And both that morning equally lay In leaves no step had trodden black. Oh, I kept the first for another day! Yet knowing how way leads on to way, I doubted if I should ever come back. I shall be telling this with a sigh Somewhere ages and ages hence: Two roads diverged in a wood, and I— I took the one less traveled by, And that has made all the difference. and here’s a line of some really, really, really, really long text, just to see how it is handled and to find out how it overflows;

Quote Tag for short, inline quotes

Developers, developers, developers… –Steve Ballmer

Strike Tag (deprecated in HTML5) and S Tag

This tag shows strike-through text.

Small Tag

This tag shows smaller text.

Strong Tag

This tag shows bold text.

Subscript Tag

Getting our science styling on with H2O, which should push the “2” down.

Superscript Tag

Still sticking with science and Albert Einstein’s E = MC2, which should lift the 2 up.

Teletype Tag (obsolete in HTML5)

This rarely used tag emulates teletype text, which is usually styled like the tag.

Underline Tag deprecated in HTML 4, re-introduced in HTML5 with other semantics

This tag shows underlined text.

Variable Tag

This allows you to denote variables.

```

In the original, line 108 is `<code>&lt;blockquote&gt;</code>`.

screenshot 2019-01-26 at 20 20 57

In the static output, line 51 is `<code><blockquote></code>`.

screenshot 2019-01-26 at 20 22 15

Also, I've noticed that entities (`&lt;`) get transformed into characters (`<`). Maybe this is somewhat responsible?

As for screenshots, are any others needed?
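The entity-to-character observation above can be reproduced in a minimal sketch. This is an assumption about the general mechanism, not the plugin's actual code: a parser decodes entities into characters, and if the text is written back without re-escaping, `&lt;blockquote&gt;` becomes a real tag. The `Reserializer` class and `reserialize` helper are hypothetical, and Python's `html.parser` stands in for the real DOM parser.

```python
import html
from html.parser import HTMLParser


class Reserializer(HTMLParser):
    """Rebuilds markup from parser events, optionally re-escaping text nodes."""

    def __init__(self, escape_text):
        super().__init__(convert_charrefs=True)  # entities arrive already decoded
        self.escape_text = escape_text
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(f"<{tag}>")  # attributes are ignored for brevity

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.escape_text:
            data = html.escape(data, quote=False)
        self.parts.append(data)


def reserialize(markup, escape_text):
    """Parse markup and write it back out as a string."""
    parser = Reserializer(escape_text)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


source = "<code>&lt;blockquote&gt;</code>"  # line 108 of the original
naive = reserialize(source, escape_text=False)
safe = reserialize(source, escape_text=True)

print(naive)
print(safe)
```

The naive round trip yields the same string as line 51 of the static output, while re-escaping text nodes on output reproduces the original line. If the export pipeline behaves like the naive variant, encoding the entities in the post content would not help, which would be consistent with the "still no luck" report.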