reubano / meza

A Python toolkit for processing tabular data
MIT License

test failure in test_excel_html_export with io.read_html #44

Open nieder opened 2 years ago

nieder commented 2 years ago

When testing meza-0.46.0, I get this failure on every Python version I tried (py38-py310):

Test for reading an html table exported from excel ... FAIL

======================================================================
FAIL: Test for reading an html table exported from excel
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/sw/lib/python3.9/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/sw/build.build/meza-py39-0.46.0-1/meza-0.46.0/tests/test_io.py", line 354, in test_excel_html_export
    nt.assert_equal(expected, next(records))
AssertionError: {'sparse_data': 'Iñtërnâtiônàližætiøn', 'so[61 chars]dam'} != {'13_width_75_some_date': '13 class=xl24 al[123 chars]dam'}
- {'some_date': '05/04/82',
+ {'13_width_75_some_date': '13 class=xl24 align=right>05/04/82',
+  '2_width_150_unicode_test': 'Ādam',
-  'some_value': '234',
+  '75_some_value': 'right>234',
?   +++              ++++++

-  'sparse_data': 'Iñtërnâtiônàližætiøn',
?                                       ^

+  '75_sparse_data': 'Iñtërnâtiônàližætiøn'}
?   +++                                    ^

-  'unicode_test': 'Ādam'}

----------------------------------------------------------------------

The actual record in the AssertionError line is mangled: fragments of the HTML attributes from the table elements (for example "class=xl24 align=right>") leak into both the keys and the values. If I remove the HTML attributes from the table in data/test/test.htm, the test passes. I notice that io.read_html uses BeautifulSoup; I have beautifulsoup-4.10.0 and soupsieve-2.3.1 installed.
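
To help narrow down where the mangling happens, here is a minimal sketch that parses an Excel-style table with unquoted attributes using only the standard library's html.parser. It does not call meza or BeautifulSoup. The SAMPLE markup, the TableParser class, and the parse_table helper are all hypothetical stand-ins written for this check, not the contents of data/test/test.htm or meza's implementation. If this prints clean keys and values on an affected interpreter, the stdlib parser probably handles such attributes correctly and the problem is more likely in how the real file is fed to the parser.

```python
from html.parser import HTMLParser

# Hypothetical sample imitating an Excel HTML export (unquoted attributes).
# This is NOT the real data/test/test.htm from the meza repository.
SAMPLE = """
<table border=0 cellpadding=0>
 <tr height=13>
  <td height=13 width=75>some_date</td>
  <td width=75>some_value</td>
 </tr>
 <tr height=13>
  <td height=13 class=xl24 align=right>05/04/82</td>
  <td align=right>234</td>
 </tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, one list per row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text chunks of the cell currently being read

    def handle_starttag(self, tag, attrs):
        # Attributes are parsed by HTMLParser and deliberately ignored here.
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = parse_table(SAMPLE)
# Pair the header row with the first data row, as a record-style dict.
record = dict(zip(rows[0], rows[1]))
print(record)
```

If the printed record contains attribute fragments such as "align=right>" in its keys or values, that would point at the parser itself rather than at meza.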