A GUI for XML
One frustration of using the XML package to scrape web data from the command line is the tedious trial-and-error process, which arises because we have no intuitive overview of the HTML page. For example, we may know there are a few tables in the page, but we cannot quickly see which one we actually want; usually we have to dig deep into the long HTML code and compare it with the R output.
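To make the trial-and-error loop concrete, here is a minimal sketch of the command-line workflow. The inline HTML string is a made-up stand-in for a real page, and the "pick the widest table" rule is only an illustrative heuristic, not something the XML package does for you.

```r
library(XML)

# Hypothetical page with two tables; a real session would parse a URL or file.
html <- '<html><body>
<table><tr><th>a</th></tr><tr><td>1</td></tr></table>
<table><tr><th>name</th><th>score</th></tr>
<tr><td>x</td><td>10</td></tr><tr><td>y</td><td>20</td></tr></table>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Read every <table> in the page, then inspect each result by hand.
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
length(tables)

# Column count per table; readHTMLTable() can return NULL for some tables.
n_cols <- vapply(tables,
                 function(x) if (is.null(x)) 0L else ncol(x),
                 integer(1))

# Illustrative guess: the widest table is often the data table we want.
candidate <- tables[[which.max(n_cols)]]
print(candidate)
```

On a real page with many layout tables, this inspection usually has to be repeated several times, which is the tedium described above.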
A GUI wrapper can alleviate the problem. The recursive structure of an HTML document is easily visualized as a tree; for example, below is the tree for Dr Hofmann's homepage:
Such a tree lets us find the elements we want quickly.
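A plain-text approximation of such a tree can be produced with a short recursive function. This is only a sketch of the idea, not the GUI itself: `tree_lines()` is a hypothetical helper, and the inline HTML is a made-up example page.

```r
library(XML)

# Hypothetical helper: returns one indented line per element node.
tree_lines <- function(node, depth = 0) {
  label <- paste0(strrep("  ", depth), xmlName(node))
  id <- xmlGetAttr(node, "id")
  if (!is.null(id)) label <- paste0(label, " #", id)
  # "./*" selects element children only, skipping text and comment nodes.
  children <- getNodeSet(node, "./*")
  c(label, unlist(lapply(children, tree_lines, depth = depth + 1)))
}

# Made-up page standing in for a real one.
html <- paste0(
  '<html><head><title>Demo</title></head><body>',
  '<h1>Title</h1>',
  '<table id="scores"><tr><td>1</td></tr></table>',
  '</body></html>'
)

doc <- htmlParse(html, asText = TRUE)
lines <- tree_lines(xmlRoot(doc))
writeLines(lines)
```

Showing the `id` attribute next to the tag name is a design choice made here so that a table of interest can be spotted in the outline and then addressed directly with XPath.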
The other useful interface is the table extractor using readHTMLTable().
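Once the wanted table has been located, extracting it with readHTMLTable() takes one call. The sketch below shows two ways to do so, by position with the `which` argument and by passing a single table node found with XPath; the HTML and the `scores` id are invented for illustration.

```r
library(XML)

# Made-up page: the second table carries a hypothetical id, "scores".
html <- '<html><body>
<table><tr><th>a</th></tr><tr><td>1</td></tr></table>
<table id="scores"><tr><th>name</th><th>score</th></tr>
<tr><td>x</td><td>10</td></tr><tr><td>y</td><td>20</td></tr></table>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Option 1: select the table by its position in the document.
by_position <- readHTMLTable(doc, which = 2, stringsAsFactors = FALSE)

# Option 2: locate the <table> node with XPath, then read just that node.
node <- getNodeSet(doc, "//table[@id='scores']")[[1]]
by_id <- readHTMLTable(node, stringsAsFactors = FALSE)

print(by_id)
```

Selecting by id is less fragile than selecting by position when the page layout changes, but it only works if the page author gave the table an id.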