yihui / ideas

personal activities
http://yihui.name

Stat585X project #27

Closed yihui closed 10 years ago

yihui commented 12 years ago

A GUI for XML

One frustration of using the XML package to scrape web data from the command line is the tedious trial-and-error process, which arises because we have no intuitive overview of the HTML page. For example, we may know there are a few tables in the page, but we cannot quickly see which one we really want; usually we have to dig deep into the long HTML source and compare it with the R output.

A GUI wrapper can alleviate the problem. The recursive structure of the page can be easily visualized as a tree, e.g. below is Dr Hofmann's homepage:

Such a tree may enable us to locate the elements we want quickly.
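
As a rough, text-only sketch of the tree idea (not the proposed GUI itself), the element hierarchy can be walked recursively with the XML package. The inline page and the helper name `tree_lines()` are assumptions made up for illustration; a real page would be parsed from a file or URL instead.

```r
library(XML)

# A small inline page stands in for a real site (hypothetical content).
html <- '<html><head><title>Demo</title></head>
<body><h1>Heading</h1><table><tr><td>1</td></tr></table></body></html>'

doc <- htmlParse(html, asText = TRUE)

# tree_lines() is a hypothetical helper: it returns one indented line
# per element node, descending recursively into child elements.
tree_lines <- function(node, depth = 0) {
  line <- paste0(strrep("  ", depth), xmlName(node))
  kids <- getNodeSet(node, "./*")  # child elements only, no text nodes
  c(line, unlist(lapply(kids, tree_lines, depth = depth + 1),
                 use.names = FALSE))
}

lines <- tree_lines(xmlRoot(doc))
cat(lines, sep = "\n")
```

A GUI tree widget could be filled from the same recursion, with each node's depth deciding where it is attached.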

Another useful interface would be a table extractor built on readHTMLTable().
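
A minimal sketch of what such an extractor might do behind the scenes: read every table in the page with readHTMLTable() and summarise the dimensions of each, so that the right one can be picked by index. The inline HTML and the `overview` data frame are assumptions for illustration only.

```r
library(XML)

# Two small tables in a made-up page (hypothetical content).
html <- '<html><body>
<table><tr><th>name</th><th>score</th></tr>
<tr><td>a</td><td>1</td></tr><tr><td>b</td><td>2</td></tr></table>
<table><tr><th>x</th></tr><tr><td>9</td></tr></table>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# readHTMLTable() on a parsed document returns a list of data frames,
# one per <table> element found.
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)

# One row per table: its position in the page and its dimensions.
overview <- data.frame(
  index = seq_along(tables),
  rows  = vapply(tables, nrow, integer(1), USE.NAMES = FALSE),
  cols  = vapply(tables, ncol, integer(1), USE.NAMES = FALSE)
)
print(overview)
```

After glancing at the overview, the table of interest can be pulled out with, say, `tables[[1]]`.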

Interactive Tile/Hexagon Plots