How to parse real-time chromium pages

sarperavci / CloudflareBypassForScraping

A Cloudflare verification bypass script for web scraping


How to parse real-time chromium pages #2

Closed Kghostpassby closed 5 months ago

Kghostpassby commented 5 months ago

I've used your program and it did help me bypass Cloudflare's verification panel, which I must say is amazing. But I'm having a problem now: I want to use the program to parse the HTML of the site, but it seems that many third-party libraries don't support parsing Chromium pages. If you have a solution to this problem, please let me know.

P.S. My English is not very good, so this post was translated with Google. Sorry if I have offended you!

sarperavci commented 5 months ago

Hi, you can get the page content with the html attribute.

html_content = driver.html

Then you can process the HTML with the BeautifulSoup library.
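As a rough sketch of that second step, the snippet below parses an HTML string with BeautifulSoup and pulls out the page title and links. The sample markup and the extract_links helper are made up for illustration; in a real run, html_content would come from driver.html as shown above, and the tags you look for would depend on the site.

```python
from bs4 import BeautifulSoup

# Stand-in for `html_content = driver.html`; this sample markup is hypothetical.
html_content = """
<html>
  <head><title>Example page</title></head>
  <body>
    <a href="/first">First</a>
    <a href="/second">Second</a>
  </body>
</html>
"""


def extract_links(html):
    """Hypothetical helper: return (text, href) pairs for every <a> with an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


# Parse once to read the <title>, then collect the links.
soup = BeautifulSoup(html_content, "html.parser")
page_title = soup.title.string
links = extract_links(html_content)

print(page_title)
print(links)
```

The built-in "html.parser" backend is used here so nothing beyond beautifulsoup4 needs to be installed; another parser such as lxml could be swapped in if it is available.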

For further information, you can always visit the library's very detailed documentation page (use a translator if needed).

Kghostpassby commented 5 months ago

Wow. You've done me a huge favor. Thank you.

sarperavci commented 5 months ago

Seems like the issue is solved. I'm closing the thread.