kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

If a post has more than one image -> I receive only one low-quality image #203

Open Medevin opened 3 years ago

Medevin commented 3 years ago

If a post has only one image -> I receive one full-quality image. If a post has more than one image -> I receive only one low-quality image.

pmdscully commented 3 years ago

Interesting. I have a similar issue, though only with single-image posts; it might still be relevant to yours.

Additional info: edit 26/04

Possibly relevant issue #6

Tim1702 commented 3 years ago

For me, I receive full resolution image on my local machine but receive low quality image from my Heroku server.

Medevin commented 3 years ago

> For me, I receive full resolution image on my local machine but receive low quality image from my Heroku server.

Could the mobile layout differ between regions? Maybe you could turn on logging and share what is parsed from the group page on both your local machine and Heroku?
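One way to compare the two environments is to enable debug logging before scraping and then diff the output from each machine. The following is only a sketch using the standard `logging` module. It assumes the scraper logs under a logger named `facebook_scraper` (that is, loggers named after the package's modules), which this thread does not confirm, and `enable_debug_logs` is a hypothetical helper name.

```python
import logging
import sys


def enable_debug_logs(logger_name="facebook_scraper", stream=None):
    """Attach a DEBUG-level stream handler to the named logger and return it.

    Assumption: the scraper emits its parsing logs under `logger_name`
    (or under child loggers such as `logger_name + ".something"`).
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    handler = logging.StreamHandler(stream or sys.stderr)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    enable_debug_logs()
    # Run the same scrape here on both machines, then diff the two logs.
```

Child loggers propagate to the parent by default, so a single handler on the package-level logger should be enough if the assumption about logger names holds.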

pmdscully commented 3 years ago

FYI: as of commit d072d7c2050e899b0f1c423d5ce92473056e05a9, when only low-quality images are available the code empties the image and images fields and reports the low-quality image only in the image_lowquality field.

Related to issue #217 and #213.

neon-ninja commented 3 years ago

@medevin @pmdscully which post are you having this problem with?

Medevin commented 3 years ago

@neon-ninja Any post from the group "saltandpepper.art" that has more than one image; I just checked with the latest master. Here is an example of a post with 4 images:

{'available': True, 'comments': 8, 'comments_full': None, 'factcheck': None, 'image': None, 'image_lowquality': 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p228x119/179826016_2808295396167492_7551353435105857839_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=07e735&_nc_ohc=Qn9EdHPxwXsAX-3FaYD&_nc_ht=scontent-iev1-1.xx&tp=3&oh=90e8234262a26e4b346d49c30ad9bb54&oe=60B0C4EB', 'images': [None, None, None, None], 'is_live': False, 'likes': 68, 'link': None, 'post_id': '4037518582995301',...}

neon-ninja commented 3 years ago

@Medevin are you using cookies? If so, what is your language set to?

neon-ninja commented 3 years ago

> For me, I receive full resolution image on my local machine but receive low quality image from my Heroku server.

Where is your heroku server located geographically?

Medevin commented 3 years ago

@neon-ninja I tried with cookies, but I think we need a more detailed description or an example of how to use them. I built a jar (with requests.cookies.cookiejar_from_dict) from the cookies I got from Facebook, with these fields: ['c_user', 'datr', 'dpr', 'fr', 'presence', 'sb', 'spin', 'wd', 'xs']. There's no "language" field. The result: many images -> only one low_quality img; one image -> only one low_quality img.

If I use just auth (which fails): many images -> only one low_quality img; one image -> one high_quality img.

pmdscully commented 3 years ago

> Here is an example of the post with 4 images: 'images': [None, None, None, None]

@Medevin thanks for confirming with an example.

... ahh, for a multi-image post (e.g. 4 images), four Nones are a lot less useful than four low_quality_image_urls...

@neon-ninja Perhaps others can confirm whether this issue is widespread before we consider rolling back commit d072d7c2050e899b0f1c423d5ce92473056e05a9 (the one that sets image=None and images=None).
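Until the extraction itself is sorted out, a caller could fall back to the low-quality fields whenever the full-size ones come back empty. A minimal sketch, assuming only the post keys that appear in this thread (`image`, `images`, `image_lowquality`, `images_lowquality`); `best_image_urls` is a hypothetical helper, not part of the library.

```python
def best_image_urls(post):
    """Return full-size image URLs if any exist, otherwise low-quality ones.

    `post` is a dict shaped like the examples in this thread. Missing keys
    and None entries (e.g. 'images': [None, None, None, None]) are skipped.
    """
    full = [url for url in (post.get("images") or []) if url]
    if full:
        return full
    if post.get("image"):
        return [post["image"]]

    low = [url for url in (post.get("images_lowquality") or []) if url]
    if low:
        return low
    if post.get("image_lowquality"):
        return [post["image_lowquality"]]
    return []


if __name__ == "__main__":
    # Same shape as the 4-image example above; the URL is a placeholder.
    post = {
        "image": None,
        "image_lowquality": "https://example.invalid/low.jpg",
        "images": [None, None, None, None],
    }
    print(best_image_urls(post))
```

For the 4-image example this yields the single low-quality URL rather than a list of four Nones.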

neon-ninja commented 3 years ago

@pmdscully the HTML in Medevin's example is different to the other gallery type posts I've seen so far (made by a page or profile, not a group post). That commit is likely unrelated.

neon-ninja commented 3 years ago

@pmdscully https://github.com/kevinzg/facebook-scraper/commit/bb8c9f073c630a047d93d0e58c8b563005a9a174 should make it possible to extract up to 4 low quality image URLs

neon-ninja commented 3 years ago

and this commit (https://github.com/kevinzg/facebook-scraper/commit/c6e5b3a89d1276c4a38f2967fb38bc8eb1de1119) should make it possible to extract photoset style image galleries like https://m.facebook.com/groups/saltandpepper.art/permalink/4037518582995301/ - @Medevin please give it a try

Medevin commented 3 years ago

@neon-ninja Great, it works! One small improvement is possible: I've marked with '!---->' an image that is actually a settings icon but is parsed as the first gallery image. PS: if you need the HTML of the post that the parser receives, let me know.

{'available': True, 'comments': 61, 'comments_full': None, 'factcheck': None, 'image': 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164075181_892655458198343_4051013771165068554_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=2SJ8ISVoCS0AX856NIw&_nc_ht=scontent-iev1-1.xx&tp=14&oh=734211c587343de37bb4e68f4d7427fb&oe=60B49D2B&manual_redirect=1', !----> 'image_lowquality': 'https://static.xx.fbcdn.net/rsrc.php/v3/yX/r/u_9Swo8wb5U.png', 'images': ['https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164075181_892655458198343_4051013771165068554_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=2SJ8ISVoCS0AX856NIw&_nc_ht=scontent-iev1-1.xx&tp=14&oh=734211c587343de37bb4e68f4d7427fb&oe=60B49D2B&manual_redirect=1', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164535824_892655498198339_6851508125407273346_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=9KpBvpq4IsUAX9ZdbwA&_nc_ht=scontent-iev1-1.xx&tp=14&oh=c94cbb35bccbdbf1852ea5780fb5372c&oe=60B70AC9&manual_redirect=1', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164064887_892655541531668_3283686770127264306_n.jpg?_nc_cat=108&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=iiGjGdRu9z0AX-TXLrs&_nc_ht=scontent-iev1-1.xx&tp=14&oh=a6cf9b80bae46b631eb5f521377a5520&oe=60B51B6F&manual_redirect=1', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164600725_892655588198330_7187632791871277637_n.jpg?_nc_cat=111&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=CPqPnDssTf8AX-nyPkh&tn=Gvd6Z9bJT7ybSrvK&_nc_ht=scontent-iev1-1.xx&tp=14&oh=fb5846bac3df06a4cf21798ade39fcb3&oe=60B5AB41&manual_redirect=1'], !----> 'images_lowquality': ['https://static.xx.fbcdn.net/rsrc.php/v3/yX/r/u_9Swo8wb5U.png', 
'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s320x320/164075181_892655458198343_4051013771165068554_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=2SJ8ISVoCS0AX856NIw&_nc_ht=scontent-iev1-1.xx&tp=9&oh=5212505f17cfb19e9bb38c4279bcfbd5&oe=60B43FB0', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p110x80/164535824_892655498198339_6851508125407273346_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=9KpBvpq4IsUAX9ZdbwA&_nc_ht=scontent-iev1-1.xx&tp=3&oh=4ee6ccd7ac6f4aa55eb2920d4fbdacf6&oe=60B5B16D', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p110x80/164064887_892655541531668_3283686770127264306_n.jpg?_nc_cat=108&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=iiGjGdRu9z0AX-TXLrs&_nc_ht=scontent-iev1-1.xx&tp=3&oh=e6c621818d2cfd259a2f2d2971f86612&oe=60B4284B', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p110x80/164600725_892655588198330_7187632791871277637_n.jpg?_nc_cat=111&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=CPqPnDssTf8AX-nyPkh&tn=Gvd6Z9bJT7ybSrvK&_nc_ht=scontent-iev1-1.xx&tp=3&oh=2827ac4ca13fd0e1ca40946c0cbe80ef&oe=60B431E5'], 'is_live': False, 'likes': 120, 'link': None}
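On the caller side, the stray icon in the output above can be filtered out with a URL heuristic. This is a sketch only, based on the single icon URL shown here (host starting with `static.`, path starting with `/rsrc.php`). It is an assumption that other interface assets follow the same pattern, and `is_ui_asset` and `drop_ui_assets` are hypothetical helper names, not the fix applied in the library.

```python
from urllib.parse import urlparse


def is_ui_asset(url):
    """Heuristic: treat static.*/rsrc.php URLs as interface icons, not photos."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return host.startswith("static.") and parsed.path.startswith("/rsrc.php")


def drop_ui_assets(urls):
    """Return the URLs that do not look like interface assets, skipping None."""
    return [url for url in urls if url and not is_ui_asset(url)]


if __name__ == "__main__":
    urls = [
        "https://static.xx.fbcdn.net/rsrc.php/v3/yX/r/u_9Swo8wb5U.png",
        # Placeholder photo URL in the same style as the thread's examples.
        "https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/s320x320/photo.jpg?x=1",
    ]
    print(drop_ui_assets(urls))
```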

neon-ninja commented 3 years ago

@Medevin which post are you having this problem with?

Medevin commented 3 years ago

@neon-ninja this one: https://m.facebook.com/groups/saltandpepper.art/permalink/3924981784248982/ but it seems to affect all posts with more than one image in a gallery

neon-ninja commented 3 years ago

@Medevin https://github.com/kevinzg/facebook-scraper/commit/62784a8d159f2ad22d58723c569d3fc79ae542bf should fix this problem, give it a try

Medevin commented 3 years ago

@neon-ninja works, thanks)