brutalsavage / facebook-post-scraper

Facebook Post Scraper 🕵️🖱️
GNU General Public License v3.0
324 stars 116 forks

Scraping comments doesn't work correctly #14

Open shaklev opened 4 years ago

shaklev commented 4 years ago

Looking at the following code segment:

        cmmBtn = browser.find_elements_by_xpath('//a[@class="_3hg- _42ft"]')
        for btn in cmmBtn:
            try:
                btn.click()
            except:
                pass
        time.sleep(1)
        moreCmm= browser.find_elements_by_xpath('//a[@class="_4sxc _42ft"]')
        for moreCmmBtn in moreCmm:
            try:
                moreCmmBtn.click()
            except:
                pass
        moreComments = browser.find_elements_by_xpath('//a[@class="_6w8_"]')

The line cmmBtn = browser.find_elements_by_xpath('//a[@class="_3hg- _42ft"]') collects every "X comments" button, and the for loop that follows clicks them all. But if a post already lists a few comments on page load, before its "X comments" button is clicked (see the picture), then clicking the button with class _3hg- _42ft hides all the comments of that post.

An additional check is needed: if an element with class _4sxc _42ft already exists within the post div (meaning the "View more comments" button is shown), the _3hg- _42ft button doesn't need to be clicked.
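A minimal sketch of that check, not the repository's code. It assumes each post container is already available as a Selenium WebElement; the function name expand_collapsed_comments and the way the posts are located are hypothetical, since the thread doesn't give a selector for the post div. The class strings are the ones from the snippet above and may no longer match Facebook's markup.

        # Class strings copied from the snippet above; they may be outdated.
        EXPAND_COMMENTS_XPATH = './/a[@class="_3hg- _42ft"]'  # "X comments" toggle
        VIEW_MORE_XPATH = './/a[@class="_4sxc _42ft"]'        # "View more comments"


        def expand_collapsed_comments(posts):
            """Click the "X comments" toggle only on posts that show no comments yet.

            `posts` is an iterable of post-container WebElements (how they are
            located is an assumption left to the caller).
            Returns the number of toggles clicked.
            """
            clicked = 0
            for post in posts:
                # "xpath" is the string value of selenium's By.XPATH.
                if post.find_elements("xpath", VIEW_MORE_XPATH):
                    # Comments are already listed; clicking would hide them.
                    continue
                for btn in post.find_elements("xpath", EXPAND_COMMENTS_XPATH):
                    try:
                        btn.click()
                        clicked += 1
                    except Exception:
                        # Skip buttons that can't be clicked, as the original loop does.
                        pass
            return clicked

The leading .// in both XPaths keeps the search inside a single post, so a "View more comments" link in one post doesn't affect the others. It would be called once with the list of post elements, in place of the first loop over cmmBtn.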

Image

MatteoSerafino commented 3 years ago

Is this bug fixed?

shaklev commented 3 years ago

> Is this bug fixed?

> There needs to be added additional checking to see if there already exists a _4sxc _42ft class within the post div ( meaning the view more comments button is shown = the _3hg- _42ft button doesn't need to be clicked )

I wrote this as a simple solution back then; you can test it (the principle should be the same now).