lexborisov / myhtml

Fast C/C++ HTML 5 Parser. Using threads.
GNU Lesser General Public License v2.1

fix serialization segfault #151

Closed Azq2 closed 6 years ago

Azq2 commented 6 years ago

Fix a segfault in serialization when the doctype has no attributes.

Azq2 commented 6 years ago

@lexborisov I think it's ready to merge.

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
name: 'html'
publicId: '-//W3C//DTD XHTML 1.0 Strict//EN'
systemId: 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'

IN:  '<!DOCTYPE html>'
OUT: '<!DOCTYPE html>'
name: 'html'
publicId: ''
systemId: ''

IN:  '<!DOCTYPE html PUBLIC>'
OUT: '<!DOCTYPE html>'
name: 'html'
publicId: ''
systemId: ''

IN:  '<!DOCTYPE html SYSTEM>'
OUT: '<!DOCTYPE html>'
name: 'html'
publicId: ''
systemId: ''

IN:  '<!DOCTYPE html allala>'
OUT: '<!DOCTYPE html>'
name: 'html'
publicId: ''
systemId: ''

IN:  '<!DOCTYPE html "allala">'
OUT: '<!DOCTYPE html>'
name: 'html'
publicId: ''
systemId: ''

IN:  '<!doctype HTML system "about:legacy-compat">'
OUT: '<!DOCTYPE html SYSTEM "about:legacy-compat">'
name: 'html'
publicId: ''
systemId: 'about:legacy-compat'

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN">'
name: 'html'
publicId: '-//W3C//DTD HTML 4.0//EN'
systemId: ''

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">'
name: 'html'
publicId: '-//W3C//DTD HTML 4.0//EN'
systemId: 'http://www.w3.org/TR/REC-html40/strict.dtd'

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">'
name: 'html'
publicId: '-//W3C//DTD HTML 4.01//EN'
systemId: ''

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">'
name: 'html'
publicId: '-//W3C//DTD HTML 4.01//EN'
systemId: 'http://www.w3.org/TR/html4/strict.dtd'

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
name: 'html'
publicId: '-//W3C//DTD XHTML 1.0 Strict//EN'
systemId: 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'

IN:  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
OUT: '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
name: 'html'
publicId: '-//W3C//DTD XHTML 1.1//EN'
systemId: 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'

IN:  '<!DOCTYPE html SYSTEM "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
OUT: '<!DOCTYPE html SYSTEM "-//W3C//DTD XHTML 1.1//EN">'
name: 'html'
publicId: ''
systemId: '-//W3C//DTD XHTML 1.1//EN'

IN:  '<!DOCTYPE OlOlLo>'
OUT: '<!DOCTYPE olollo>'
name: 'olollo'
publicId: ''
systemId: ''

IN:  '<!DOCTYPE html PUBLIC "" "xxx">'
OUT: '<!DOCTYPE html SYSTEM "xxx">'
name: 'html'
publicId: ''
systemId: 'xxx'

IN:  '<!DOCTYPE svg:svg PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">'
OUT: '<!DOCTYPE svg:svg PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN" "http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd">'
name: 'svg:svg'
publicId: '-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN'
systemId: 'http://www.w3.org/2002/04/xhtml-math-svg/xhtml-math-svg.dtd'
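
The cases above can be summarized as a small null-safe serializer. This is only a minimal sketch of the output rules visible in the listing, not the code in this patch: `doctype_t`, `has_text` and `doctype_serialize` are hypothetical names and are not part of the myhtml API. It assumes a missing identifier may be either NULL or an empty string, and that an empty public ID with a non-empty system ID falls back to `SYSTEM`, as in the `PUBLIC "" "xxx"` case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical doctype record (NOT myhtml's real struct).
 * Any field may be NULL when the attribute is absent. */
typedef struct {
    const char *name;
    const char *public_id;
    const char *system_id;
} doctype_t;

/* Treat NULL and "" alike so a missing attribute is never dereferenced. */
static int has_text(const char *s)
{
    return s != NULL && strlen(s) > 0;
}

/* Write the serialized doctype into buf and return snprintf's result
 * (the length the full string needs, excluding the terminator). */
int doctype_serialize(const doctype_t *dt, char *buf, size_t size)
{
    assert(dt != NULL && buf != NULL && size > 0);

    const char *name = has_text(dt->name) ? dt->name : "";

    if (has_text(dt->public_id)) {
        if (has_text(dt->system_id))
            return snprintf(buf, size, "<!DOCTYPE %s PUBLIC \"%s\" \"%s\">",
                            name, dt->public_id, dt->system_id);
        return snprintf(buf, size, "<!DOCTYPE %s PUBLIC \"%s\">",
                        name, dt->public_id);
    }

    /* No public ID: emit SYSTEM only when a system ID is present. */
    if (has_text(dt->system_id))
        return snprintf(buf, size, "<!DOCTYPE %s SYSTEM \"%s\">",
                        name, dt->system_id);

    return snprintf(buf, size, "<!DOCTYPE %s>", name);
}
```

Calling `doctype_serialize` with a record whose `public_id` and `system_id` are both NULL takes the last branch and never reads either pointer, which is the situation the fix is about.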

lexborisov commented 6 years ago

Thanks!