mgirlich opened this issue 1 year ago
I work with XML files constantly and ran into this exact issue earlier this year as well. xml2 takes roughly a minute to extract data from a ~350 KB to 1.5 MB XML file into a data frame. For comparison, in the same amount of time I can process 600 files by reading each file as a single-column table with `fread()`, reformatting each row with stringr, flattening the table into a JSON string, parsing that JSON back into a table, and then running a series of `unnest_wider()` and `unnest_longer()` operations to populate parent data down to the child nodes.
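For what it's worth, here is a minimal sketch of that kind of pipeline. It assumes a flat layout with one element per line; the file name (`items.xml`), element name (`item`), and attribute names (`id`, `value`) are purely illustrative and not taken from the files described above.

```r
library(data.table)
library(stringr)
library(jsonlite)
library(tidyr)
library(tibble)

# Read the whole file as a single character column (no separator, no quoting),
# similar to readLines() but via fread().
lines <- fread("items.xml", sep = "", header = FALSE,
               quote = "", col.names = "raw")$raw

# Keep the element lines and rewrite each one as a JSON object with a regex.
items <- str_subset(lines, "<item ")
json_rows <- str_replace(
  items,
  '.*<item id="([^"]*)" value="([^"]*)".*',
  '{"id": "\\1", "value": "\\2"}'
)

# Collapse everything into one JSON array, parse it, and rectangle it.
parsed <- fromJSON(
  str_c("[", str_c(json_rows, collapse = ","), "]"),
  simplifyVector = FALSE
)

# Nested child nodes would additionally need unnest_longer() here.
result <- tibble(record = parsed) |> unnest_wider(record)
```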
I use the paws package to work with S3, e.g. to list the objects in a bucket. As this took quite a lot of time I did some profiling and noticed that most of the time is spent parsing the XML response (it uses/used `as_list()`). I created a PR (paws-r/paws#621) that improves the performance quite a bit, but it is still really slow (roughly 90% of the time is spent in parsing). To improve performance further without trying to use/abuse XPath even more, it is probably easier to improve the performance of xml2 in general.
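To illustrate where the time goes, here is a small, self-contained comparison of `as_list()` against targeted XPath extraction on a synthetic document modelled loosely on an S3 ListObjects response. The element names and document size are assumptions made for the benchmark, not taken from paws.

```r
library(xml2)

# Build a synthetic response with 5000 <Contents> entries.
keys <- sprintf(
  "<Contents><Key>file-%d</Key><Size>%d</Size></Contents>",
  seq_len(5000), seq_len(5000)
)
doc <- read_xml(paste0(
  "<ListBucketResult>", paste(keys, collapse = ""), "</ListBucketResult>"
))

# Full conversion of the whole tree into nested lists.
system.time(as_list(doc))

# Pulling only the fields that are actually needed via XPath.
system.time(xml_text(xml_find_all(doc, ".//Key")))
```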