ycphs / openxlsx

openxlsx - a fast way to read and write complex xlsx files
https://ycphs.github.io/openxlsx/

Saving HTML tables (rvest) as Excel files #450

Open Mkranj opened 9 months ago

Mkranj commented 9 months ago

I'm downloading an HTML table using the rvest package. Currently, I convert it to a regular data frame and then save it as .xlsx. However, the table in question has a lot of merged cells. When it is converted to a data frame, every position that a merged cell spans gets filled with that cell's text, which produces many duplicates. Is there a way to save an HTML table directly as an Excel file? Since both Excel and openxlsx support merged cells, this would give output that is true to the original. I believe this would be a very useful feature :) From what I can tell, the table returned by rvest is an xml_node object.
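To make the request concrete, here is a rough sketch of how this might be done today with existing functions. It is not an openxlsx feature: `html_table_to_xlsx()` is a hypothetical helper name. It assumes the table has no nested tables and that every `rowspan`/`colspan` is a positive integer. Each cell is written once with `writeData()` and its span is merged with `mergeCells()`.

```r
library(xml2)
library(openxlsx)

# Hypothetical helper (not part of openxlsx or rvest): writes an HTML <table>
# node to an .xlsx file, merging cells according to rowspan/colspan.
html_table_to_xlsx <- function(table_node, file, sheet = "Sheet1") {
  wb <- createWorkbook()
  addWorksheet(wb, sheet)

  occupied <- character(0)  # "row col" keys of grid positions already taken
  rows <- xml_find_all(table_node, ".//tr")

  for (r in seq_along(rows)) {
    cells <- xml_find_all(rows[[r]], "./th | ./td")
    col <- 1L
    for (i in seq_along(cells)) {
      cell <- cells[[i]]
      # Skip positions already covered by a span from an earlier cell
      while (paste(r, col) %in% occupied) col <- col + 1L

      rs <- as.integer(xml_attr(cell, "rowspan", default = "1"))
      cs <- as.integer(xml_attr(cell, "colspan", default = "1"))
      row_idx <- r:(r + rs - 1L)
      col_idx <- col:(col + cs - 1L)

      # Write the text once, in the top-left position of the span
      writeData(wb, sheet, xml_text(cell, trim = TRUE),
                startRow = r, startCol = col)
      if (rs > 1L || cs > 1L) {
        mergeCells(wb, sheet, cols = col_idx, rows = row_idx)
      }

      occupied <- c(occupied, as.vector(outer(row_idx, col_idx, paste)))
      col <- col + cs
    }
  }

  saveWorkbook(wb, file, overwrite = TRUE)
  invisible(wb)
}

# Example with an inline table; with rvest, the node would come from
# something like html_element(page, "table") instead.
html <- "<table>
  <tr><th colspan='2'>Group</th></tr>
  <tr><td>a</td><td>b</td></tr>
</table>"
node <- xml_find_first(read_html(html), ".//table")
html_table_to_xlsx(node, file.path(tempdir(), "table.xlsx"))
```

All values are written as text here, so numeric columns would still need converting, and cell styling from the HTML is not carried over.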

Thanks for the great work!