Closed cjdanbjorg closed 3 years ago
Hi @cjdanbjorg,
Yes, Facebook truncates very long posts behind a "Read more" link. I had the same problem and solved it by passing cookies. See https://github.com/kevinzg/facebook-scraper/issues/28#issuecomment-793066983 for how to implement it.
What version are you using? Both of your examples worked fine for me with 0.2.26:
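from facebook_scraper import get_posts

# Request the two posts from the report directly by URL
urls = ["AlexLiberalAlliance/posts/1645625405645276", "AlexLiberalAlliance/posts/1627179514156532"]
posts = get_posts(post_urls=urls)
# Print the length of each post's text
print([len(post["text"]) for post in posts])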
outputs
[2285, 4496]
I was on 0.2.24 (the default for pip install), so I upgraded to 0.2.26 and had no issue - exactly as you pointed out - thanks :-)
I've noted the possible issue with cookies and assume that using the cookies option might also be a solution. However, that seems to demand a little tweaking on my side as well, so I will continue with 0.2.26 for now.
Thanks again
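Since the fix here was an upgrade from 0.2.24 to 0.2.26, a quick check of the installed version can rule this out before debugging further. This is only a minimal sketch: `installed_version` and `is_at_least` are hypothetical helper names, `facebook-scraper` is assumed to be the installed distribution name, and the comparison assumes plain numeric versions such as 0.2.26.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="facebook-scraper"):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_at_least(found, minimum):
    """Compare dotted numeric versions such as '0.2.24' and '0.2.26'."""
    def as_tuple(v):
        return tuple(int(part) for part in v.split("."))
    return as_tuple(found) >= as_tuple(minimum)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("facebook-scraper is not installed")
    elif not is_at_least(found, "0.2.26"):
        print(f"Installed {found}; consider: pip install --upgrade facebook-scraper")
    else:
        print(f"Installed {found}")
```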
I have been scraping some 180+ pages for months with facebook_scraper and having the occasional issue here and there - no problem.
Now, however, I notice a significant number of incidents where it returns
post_text = NoneType object
with no apparent reason. I fail to see what would be causing this. I have attached two examples, both of which are lengthy and combine text with a picture, but I guess neither should be an issue.
Two examples that return NoneType:
https://www.facebook.com/AlexLiberalAlliance/posts/1645625405645276
https://facebook.com/AlexLiberalAlliance/posts/1627179514156532
In this same retrieval I collected other posts from the same page, without issues, one example is: https://facebook.com/AlexLiberalAlliance/posts/1636813539859796
My parameters are as follows:
for post in get_posts(targetusers[key], timeout=10, pages=3, extra_info=True):
Any ideas?
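While tracking down the cause, one way to keep a long-running scrape from failing on these posts is to check for a missing text value before using it. The sketch below is an illustration under stated assumptions, not the library's own handling: `split_by_text` is a hypothetical helper, the sample dictionaries stand in for what `get_posts` yields, and the `post_url` key is assumed to be present alongside `text`.

```python
def split_by_text(posts):
    """Separate posts that have text from those whose text is None or empty.

    Returns (posts_with_text, urls_of_posts_without_text).
    """
    with_text, missing = [], []
    for post in posts:
        if post.get("text"):
            with_text.append(post)
        else:
            # Keep the URL so the post can be re-fetched or inspected later
            missing.append(post.get("post_url"))
    return with_text, missing


# Stand-in data; in a real run this would be the output of get_posts(...)
sample_posts = [
    {"post_url": "https://facebook.com/AlexLiberalAlliance/posts/1636813539859796",
     "text": "some text"},
    {"post_url": "https://facebook.com/AlexLiberalAlliance/posts/1627179514156532",
     "text": None},
]

with_text, missing = split_by_text(sample_posts)
print(len(with_text), "posts with text;", len(missing), "to retry:", missing)
```

The collected URLs could then be passed back through `post_urls=`, as in the reply above, once the version or cookie setup has been changed.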