hupili / python-for-data-and-media-communication-gitbook

An open source book on Python tailored for communication students with zero background

how to jump to next page in centaline property #142

Open ivywze opened 5 years ago

ivywze commented 5 years ago

Troubleshooting

Describe your environment

Describe your question

I cannot jump to the next page on Centaline Property. A screenshot of the paging source is attached.

[Image: screenshot 2019-02-26 at 13 14 10]

Describe the efforts you have spent on this issue

The URL does not change between pages. I also tried treating '下一頁' ("next page") as a button and clicking it, but that failed.

click as a button

python:

data = []

while True:
    try:
        loadMoreButton = browser.find_element_by_class_name("pagingNext pagingA")
        time.sleep(2)
        loadMoreButton.click()
        time.sleep(5)
    except Exception as e:
        print(e)
        break

data.extend(get_articles_from_browser())

print("Complete")
# time.sleep(10)
browser.quit()
hupili commented 5 years ago

Can you print the “loadMoreButton”? See if the element is correctly located and selected.

hupili commented 5 years ago

More details would help, e.g. the content of the element, its location, etc. Also, try the find_elements (plural) version of the methods to see whether more than one element matches the CSS selector. Sometimes the selector matches more than you expect, so the script clicks some other element.
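The inspection suggested above could be sketched roughly as follows. `describe_matches` and `print_matches` are hypothetical helper names, and `browser` is assumed to be the Selenium webdriver from the original script. The string "css selector" is the value behind Selenium's `By.CSS_SELECTOR`, so the sketch needs no Selenium import of its own.

```python
def describe_matches(browser, css_selector):
    """Return one summary dict per element matched by css_selector."""
    # find_elements (plural) returns a list, which may be empty,
    # instead of raising when nothing matches.
    elements = browser.find_elements("css selector", css_selector)
    summaries = []
    for index, element in enumerate(elements):
        summaries.append({
            "index": index,
            "tag": element.tag_name,
            "text": element.text,
            "displayed": element.is_displayed(),
            "location": element.location,
        })
    return summaries


def print_matches(browser, css_selector):
    """Print how many elements matched and what each one looks like."""
    matches = describe_matches(browser, css_selector)
    print(f"{len(matches)} element(s) matched {css_selector!r}")
    for match in matches:
        print(match)
    return matches
```

Calling `print_matches(browser, ".pagingNext.pagingA")` before clicking shows whether the selector hits exactly one visible element or several.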

ivywze commented 5 years ago

> Can you print the “loadMoreButton”? See if the element is correctly located and selected.

It yields "invalid selector: Compound class names not permitted". I followed the trick here, which solved this problem.
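The error arises because "pagingNext pagingA" is two class names, while a class-name lookup accepts only one. One common workaround, not necessarily the trick linked above, is to turn the class attribute into a CSS selector. A minimal sketch, with `class_names_to_css` and `find_by_classes` as hypothetical helper names:

```python
def class_names_to_css(class_attribute):
    """Turn a class attribute like 'pagingNext pagingA' into '.pagingNext.pagingA'."""
    names = class_attribute.split()
    if not names:
        raise ValueError("expected at least one class name")
    # Chaining ".a.b" with no space selects elements carrying both classes.
    return "".join("." + name for name in names)


def find_by_classes(browser, class_attribute):
    """Find every element that carries all of the given class names."""
    selector = class_names_to_css(class_attribute)
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    return browser.find_elements("css selector", selector)
```

For the paging link this gives `find_by_classes(browser, "pagingNext pagingA")`, which returns a list that can be checked for length before clicking its first item.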

ivywze commented 5 years ago

But another error occurred: NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10d9766a0>: Failed to establish a new connection: [Errno 61] Connection refused').

I am wondering: instead of trying to click the button, is there another way to go to the next page?
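If the click approach is kept, note that the original loop calls `data.extend(...)` only after the loop ends, so at most the last page would be collected. A connection-refused error is also typical when the browser object is used after `browser.quit()`, though that cause is not confirmed for this case. A minimal sketch of a loop that scrapes each page before clicking, where `scrape_all_pages` and its parameters are hypothetical names:

```python
import time


def scrape_all_pages(scrape_page, find_next_button, pause_seconds=2.0, max_pages=100):
    """Collect items page by page, clicking 'next' until no button is found."""
    data = []
    for _ in range(max_pages):
        # Scrape the page currently displayed before moving on.
        data.extend(scrape_page())
        # find_next_button should return None when there is no next page.
        button = find_next_button()
        if button is None:
            break
        button.click()
        # Give the page time to load the next batch of results.
        time.sleep(pause_seconds)
    return data
```

Here `scrape_page` could be the `get_articles_from_browser` function from the original script, and `browser.quit()` would be called once, after `scrape_all_pages` returns.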

hupili commented 5 years ago

You can also try network trace analysis. There are some cases in the “advanced scraping” chapter of our open book.

hupili commented 5 years ago

@ivyWANG958, this is the section on network trace analysis: https://github.com/hupili/python-for-data-and-media-communication-gitbook/blob/master/notes-week-08.md#analyse-network-traces
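The idea behind network trace analysis is to find, in the browser's Network tab, the request the page sends when '下一頁' is clicked, and then replay that request with a different page number. A minimal sketch under stated assumptions: `LISTING_URL` and `PAGE_PARAM` are placeholders, not the site's real endpoint or parameter, and the real request may be a POST with form data rather than a GET.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical values: replace them with what the Network tab actually shows.
LISTING_URL = "https://example.com/listing"
PAGE_PARAM = "page"


def build_page_url(page_number, base_url=LISTING_URL, page_param=PAGE_PARAM, extra_params=None):
    """Build the URL for one results page from the traced request."""
    params = dict(extra_params or {})
    params[page_param] = page_number
    return base_url + "?" + urlencode(params)


def fetch_text(url, timeout=10):
    """Download one page and return its body as text (UTF-8 is assumed)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def collect_pages(parse_items, fetch=fetch_text, first_page=1, max_pages=50):
    """Fetch consecutive pages until one yields no items."""
    items = []
    for page_number in range(first_page, first_page + max_pages):
        page_items = parse_items(fetch(build_page_url(page_number)))
        if not page_items:
            break
        items.extend(page_items)
    return items
```

`parse_items` is supplied by the caller and turns one response body into a list of records, for example with BeautifulSoup or `json.loads`, depending on what the traced request returns.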