rubycdp / ferrum

Headless Chrome Ruby API
https://ferrum.rubycdp.com
MIT License

Browser Isn't Loading Images Nor Autoload Records on :page_down #252

Closed daBee closed 2 years ago

daBee commented 2 years ago

I have a browser that is misbehaving.

pagelimit = 5
pagelimit.times do
  a.send_keys :page_down
  sleep 6
end

a.close



I've done this many times and inserted timers/delays, but the page simply won't load. I'm not sure what's going on or how to troubleshoot it. It seems as if it's just not responding.

Just tried it again and the records on that page didn't load at all.

Mifrill commented 2 years ago

I don't see any reference to Ferrum in the issue description at all.

In this particular case, Watir uses Selenium to drive Chrome, so nothing here involves Ferrum.

The actual problem with this site is crawler protection. To put it simply, when we try to parse the webpage, the site shuts us out and blocks any further requests:

Access Denied
You don't have permission to access "http://www.justdial.com/ca/NS-Halifax/Grocery-Stores" on this server.
Reference #...

I'd recommend parsing the site's API instead. For example, here is the link: https://www.justdial.com/ca/data/result/getdata?uri=result&city=Halifax&search=Grocery-Stores&sortBy=&state=NS&page=page-4&v=9.23. We can paginate it with the page=page-4 query param to fetch all the needed data.
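As a rough sketch of that pagination, the page URLs could be built like this. The endpoint and query values are copied from the link above; the helper name `justdial_page_url`, the assumption that pages start at `page-1`, and the page range are hypothetical. The response format isn't shown in this thread, so the sketch only builds the URLs.

```ruby
require "uri"

# Endpoint copied from the link in this thread.
BASE_URL = "https://www.justdial.com/ca/data/result/getdata"

# Hypothetical helper: builds the URL for one results page.
# The default values mirror the query string of the link above.
def justdial_page_url(page, city: "Halifax", state: "NS", search: "Grocery-Stores")
  query = URI.encode_www_form(
    uri: "result",
    city: city,
    search: search,
    sortBy: "",
    state: state,
    page: "page-#{page}", # the param we paginate on
    v: "9.23"
  )
  "#{BASE_URL}?#{query}"
end

# Assumption: pages are numbered from 1; the upper bound is arbitrary.
urls = (1..5).map { |n| justdial_page_url(n) }
```

Each URL can then be requested with whatever HTTP client you prefer, ideally with a pause between requests, since the site has crawler protection.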

So I'll close this one, as there is no issue here.

daBee commented 2 years ago

Sorry about that.

daBee commented 2 years ago

How did you find that? There's no mention of any API.

Mifrill commented 2 years ago

@daBee

How did you find that?

[Screen recording: Peek 2022-04-04 21-28]

daBee commented 2 years ago

OK, I thought something more was up. That URL later failed with the same error, so I think they're counting hits. And this was during testing as well.