Open pvita opened 4 years ago
Hi pvita—
You're probably trying to scrape posts that have an external link but no link title. I suggest you wrap line #47 in a try / except pass statement to solve the issue like so:
try:
    a_data['title'] = link_box.find_element_by_class_name('_52jh').text
except Exception:  # no title element on this post; leave 'title' unset
    pass
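A slightly narrower variant of the same idea, as a rough sketch rather than the script's actual code: catch only the "element not found" case and fall back to a default, so other failures still surface. `FakeLinkBox`, `FakeElement`, `MissingElement`, and `text_or_default` are hypothetical names used only so the snippet runs without a browser. In the real scraper, `link_box` would be the Selenium element and `missing` would be `selenium.common.exceptions.NoSuchElementException`.

```python
class MissingElement(Exception):
    """Hypothetical stand-in for selenium's NoSuchElementException."""


class FakeElement:
    """Hypothetical stand-in for a WebElement; only exposes .text."""

    def __init__(self, text):
        self.text = text


class FakeLinkBox:
    """Hypothetical stand-in for link_box, holding child texts by class name."""

    def __init__(self, children):
        self._children = children

    def find_element_by_class_name(self, name):
        # Mimic Selenium: raise when no child with that class exists.
        if name not in self._children:
            raise MissingElement(name)
        return FakeElement(self._children[name])


def text_or_default(link_box, class_name, default="", missing=MissingElement):
    """Return the child's text, or `default` when the child is absent."""
    try:
        return link_box.find_element_by_class_name(class_name).text
    except missing:
        return default


a_data = {}
# A link with no title: the lookup fails and the default is stored instead.
a_data["title"] = text_or_default(FakeLinkBox({}), "_52jh")
```

Storing a default instead of skipping the key also means every post ends up with a `title` entry, which keeps the later parsing step uniform.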
What is the URL of your target page? It's possible Facebook uses different page setups for different groups, which would be a bigger limitation of my algo.
Thank you Jz-fitz, I'm running a test on this link. I'll let you know whether your suggestion works when the process finishes :)
Best
Bad news, same error :(
I ran another test on a different group and got a different error:
additional content loaded: 950 total posts
scraping post data...
data scraped: 0 posts
quitting driver.
parsing...
Traceback (most recent call last):
  File "C:\Users\x\Desktop\gru.py", line 99, in
    data = load_parse_save() # trigger scrape function
  File "C:\Users\x\Desktop\gru.py", line 69, in load_parse_save
    data['post_text'] = data.post_text.str.replace('JTM', '').apply(
  File "C:\Users\x\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\generic.py", line 5274, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'post_text'
This error, 'DataFrame' object has no attribute 'post_text', is a result of the scraper picking up 0 posts (data scraped: 0 posts). Please note that my script is specifically designed to scrape only posts that include some kind of external link (example). If a post contains no external link, the scraper skips it. You can change this by editing the section that checks for a link_box, AKA the gray box Facebook generates for link thumbnails.
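To make the zero-posts case fail more gracefully, one option is to build the DataFrame with explicit columns and return early when it is empty. This is only a sketch under assumptions: `build_frame` and `EXPECTED_COLUMNS` are hypothetical names, the column list is a guess based on the fields mentioned in this thread, and the real `load_parse_save` does more than this.

```python
import pandas as pd

# Assumed column names, taken from fields mentioned in this thread.
EXPECTED_COLUMNS = ["post_text", "title"]


def build_frame(posts):
    """Turn a list of scraped post dicts into a DataFrame, tolerating zero posts."""
    # Explicit columns keep 'post_text' present even when `posts` is empty,
    # so attribute access no longer raises AttributeError.
    frame = pd.DataFrame(posts, columns=EXPECTED_COLUMNS)
    if frame.empty:
        print("data scraped: 0 posts; nothing to parse")
        return frame
    # Same cleanup step that appears in the traceback (strip the 'JTM' marker).
    frame["post_text"] = frame["post_text"].str.replace("JTM", "", regex=False)
    return frame


no_posts = build_frame([])
one_post = build_frame([{"post_text": "JTMhello", "title": "A link title"}])
```

With this guard, a run that finds no link posts ends with an empty but well-formed table instead of a traceback, which makes it easier to tell "nothing matched" apart from a real bug.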
I recommend copying the code from my function into an interactive Python environment such as Jupyter so you can test the process step-by-step and adapt it to your needs. If you start running the code line-by-line in an interactive environment, let me know where you get stuck and I can offer more help from there. Also see my post on Medium for a clear breakdown of the script and a glossary of attribute names and their corresponding element types.
Keep me updated on your progress and I will be happy to help along the way. Good luck!
Looking at both your links, it appears Facebook uses the same attribute glossary for pages in Italian, so you should be able to use the glossary I developed in my Medium post to suit your needs!
Hi there, I'm trying to adopt your script, but in the "scraping post data" phase I get these errors. Thank you for your work.