pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License
42.62k stars 17.57k forks

Avoid empty lines with spaces to be transformed to empty string #59155

Open ritwizsinha opened 3 days ago

ritwizsinha commented 3 days ago
Aloqeely commented 3 days ago

Thanks for the PR! I'm not sure if this fixes the problem in the linked issue. Can you write a test that asserts the result of read_html on '<table><tr><td> </td></tr></table>' is not an empty list?

ritwizsinha commented 1 day ago

@Aloqeely I addressed your comments and removed the new named argument.

Aloqeely commented 12 hours ago

Are there any implications of passing skip_blank_lines=False as the default now? I'm sure that would break some existing code. To be quite frank, I'm not very familiar with the read_html code. Ping @mroeschke

ritwizsinha commented 4 hours ago

> Are there any implications of passing skip_blank_lines=False as the default now? I'm sure that would break some existing code. To be quite frank, I'm not very familiar with the read_html code. Ping @mroeschke

The difference now is that every line containing only spaces would be included as a new row in the DataFrame.
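
To make the difference concrete, here is a minimal standard-library sketch of the two behaviours on the table from the review comment above. It is not the read_html implementation: `TableCellCollector`, `extract_rows` and the `skip_blank_rows` flag are hypothetical names used only for illustration, and the sketch only mimics the idea of dropping versus keeping whitespace-only rows.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect the raw text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration; pandas does not use this class.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Whitespace is kept verbatim so blank cells stay visible.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html, skip_blank_rows):
    """Return table rows, optionally dropping rows whose cells are all blank."""
    parser = TableCellCollector()
    parser.feed(html)
    parser.close()
    if skip_blank_rows:
        return [row for row in parser.rows if any(cell.strip() for cell in row)]
    return parser.rows


html = "<table><tr><td> </td></tr></table>"
rows_skipped = extract_rows(html, skip_blank_rows=True)   # blank row dropped
rows_kept = extract_rows(html, skip_blank_rows=False)     # blank row kept
print(rows_skipped, rows_kept)
```

With skipping enabled the whitespace-only row disappears and nothing is left to build a table from; with skipping disabled the row survives as a single cell holding one space. Whether existing callers rely on the old behaviour is a separate question that this sketch does not answer.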