matthewmueller / x-ray

The next web scraper. See through the <html> noise.
MIT License
5.87k stars, 349 forks

How to scrape using html files if the site did not declare any "class" #272

Closed: jhnferraris closed this issue 7 years ago

jhnferraris commented 7 years ago

Hello,

I'm trying to brush up on my JavaScript skills and would like to try out this neat scraper. I have this static website: http://www.phivolcs.dost.gov.ph/html/update_SOEPD/EQLatest.html, and I'm trying to scrape the 2017 table.

Unlike the Hacker News website, my target site doesn't have any CSS classes I can use to target the text to scrape.

See image:

[Image: screen shot 2017-09-01 at 3 49 32 pm]

I saw in your README that I can target tags manually, but I'm not sure how to target the specific table (the highlighted one in the image).

Thanks for the assist!

jhnferraris commented 7 years ago

Got it now. I just used a CSS selector:

table:nth-of-type(3)
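
For readers landing here later, a rough sketch of what that selector does: `:nth-of-type(3)` counts only siblings with the same tag name under one parent, so it picks the third `table` among its siblings, not the third table in the whole document. The helper `nthOfType` and the `children` list below are hypothetical and only model the counting rule; they are not part of x-ray and do not reflect the real page's markup.

```javascript
// Minimal model of how :nth-of-type(n) picks an element.
// `siblings` is a hypothetical list of tag names for the children of ONE parent,
// in document order. Elements with other tag names are skipped when counting.
function nthOfType(siblings, tag, n) {
  let seen = 0;
  for (let i = 0; i < siblings.length; i++) {
    if (siblings[i] === tag) {
      seen += 1;
      if (seen === n) return i; // index of the matching child
    }
  }
  return -1; // fewer than n siblings of that type
}

// Hypothetical parent with four tables mixed in with other elements.
const children = ['table', 'p', 'table', 'div', 'table', 'table'];

// Position of the child that `table:nth-of-type(3)` would match.
const thirdTable = nthOfType(children, 'table', 3);
console.log(thirdTable);
```

If the tables on a page sit under different parents, each parent gets its own count, so the selector can match more than one table or none at all. According to the x-ray README, a selector like this can be supplied as the scope argument, as in `x(url, 'table:nth-of-type(3)', ...)`, with the inner selectors then resolved relative to that table.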