5j9 / wikitextparser

A Python library to parse MediaWiki WikiText
GNU General Public License v3.0

fix(is_header): cells starting with | are not header-cells #77

Closed TrueBrain closed 4 years ago

TrueBrain commented 4 years ago
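import wikitextparser

# A one-row table: a header cell ("!") followed by a data cell ("|").
wtp = wikitextparser.parse("{|\n! Header\n| Not a header\n|}")
# Inspect the second cell of the first row; it should not be a header.
print(wtp.get_tables()[0].cells()[0][1].is_header)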

This prints True, but the second cell is not a header cell (see https://www.mediawiki.org/wiki/Help:Tables#Accessibility_of_table_header_cells).

The problem seems to be that the first cell in a row currently decides whether the whole row is treated as a header (if I understand the code correctly). That is incorrect, since a row can mix header cells and data cells. I changed the regex so that header status is determined per cell.
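To illustrate the per-cell idea, here is a minimal standalone sketch using only the standard library. It is not wikitextparser's implementation or its regex; `cell_kinds` is a hypothetical helper, and it assumes a simplified table syntax (no cell attributes, no multi-line cells, no nested tables). Each cell's header status comes from the marker that starts its own line, not from the first cell of the row.

```python
import re


def cell_kinds(table_text):
    """Return rows of (cell_text, is_header) tuples.

    Hypothetical helper for illustration only; it handles a simplified
    subset of MediaWiki table syntax.
    """
    rows = [[]]
    for line in table_text.splitlines():
        line = line.strip()
        # Skip table start, table end, and caption lines.
        if line.startswith(('{|', '|}', '|+')):
            continue
        # A row separator starts a new row (unless the current one is empty).
        if line.startswith('|-'):
            if rows[-1]:
                rows.append([])
            continue
        # Anything else that is not a cell line is ignored in this sketch.
        if not line.startswith(('!', '|')):
            continue
        # The marker on THIS line decides the kind of the cells on it.
        is_header = line[0] == '!'
        # Header lines may separate cells with "!!" or "||"; data lines only "||".
        separator = r'!!|\|\|' if is_header else r'\|\|'
        for text in re.split(separator, line[1:]):
            rows[-1].append((text.strip(), is_header))
    return rows


rows = cell_kinds("{|\n! Header\n| Not a header\n|}")
print(rows[0][1])  # ('Not a header', False)
```

With the table from the example above, the second cell of the first row is reported as a data cell even though the row begins with a header cell.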

There may already be code at https://github.com/5j9/wikitextparser/blob/master/wikitextparser/_cell.py#L186 that is supposed to do this, but that code is never executed in the example above. So please review this PR with care :)

codecov[bot] commented 4 years ago

Codecov Report

Merging #77 into master will not change coverage. The diff coverage is 100.00%.


@@            Coverage Diff            @@
##            master       #77   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           31        31           
  Lines         4260      4262    +2     
=========================================
+ Hits          4260      4262    +2     
Impacted Files             Coverage Δ
wikitextparser/_cell.py    100.00% <ø> (ø)
tests/test_table.py        100.00% <100.00%> (ø)
wikitextparser/_table.py   100.00% <100.00%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update a595b29...deeb7c6.