Open Medevin opened 3 years ago
Interesting. I have a similar issue with only single-image posts, yet it might be relevant to your issue.
Running `get_posts()` on a Google Colab server, the `image` (or `images`) url* served is full resolution on https://scontent-sea1-1.xx.fbcdn.net. Additionally, the `likes` | `comments` columns are filled out.
In the low resolution case (`/320x320/` is specified within the URL, yet modifying it will break the link), the `likes` | `comments` columns are zero.
Newer posts, judged by `post_timestamp` or position within the feed (e.g. `page_count` position), tend to return the full resolution images, whereas older posts tend to return the low res images.
Additional info (edit 26/04):
With `get_posts('page', pages=10)` I tend to receive higher quality images overall, compared to `pages=[20,30,40]`.
Possibly relevant: issue #6
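To compare runs (Colab vs. elsewhere, `pages=10` vs. later pages) it can help to count how many returned URLs carry a size marker. The sketch below is only a heuristic based on the URLs quoted in this thread (`/320x320/`, `s320x320`, `p228x119`, `p110x80`); the function name `looks_low_res` is hypothetical and not part of facebook-scraper, and Facebook may use other markers that this does not catch.

```python
import re
from urllib.parse import urlparse

# Size markers seen in this thread: "320x320", "s320x320", "p228x119", "p110x80".
# Assumption: a path segment of this shape means a downscaled rendition.
_SIZE_SEGMENT = re.compile(r"^[sp]?\d+x\d+$")


def looks_low_res(url: str) -> bool:
    """Return True if any path segment of the URL looks like a size marker."""
    segments = urlparse(url).path.split("/")
    return any(_SIZE_SEGMENT.match(segment) for segment in segments)
```

Counting `sum(looks_low_res(u) for u in urls)` per run gives a rough number to compare between environments instead of eyeballing the links.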
For me, I receive full resolution image on my local machine but receive low quality image from my Heroku server.
Could mobile layout be different for different regions? Maybe you can turn on logs and show info that is parsed from the group page from both local and Heroku?
FYI - For low quality images, as of commit d072d7c2050e899b0f1c423d5ce92473056e05a9 the code will empty the `image` and `images` fields and only show low quality images in the `image_lowquality` field.
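For callers who would rather have a low quality URL than nothing, a small fallback on the returned post dict is one option. This is a minimal sketch, assuming the post is a plain dict with the `image`, `images` and `image_lowquality` keys named above; `images_lowquality` only shows up in a later output in this thread, so it is read with `.get()` and may be absent. The helper names `best_image` and `best_images` are hypothetical and not part of the library.

```python
from typing import Any, Dict, List, Optional


def best_image(post: Dict[str, Any]) -> Optional[str]:
    """Return the full resolution URL if present, else the low quality one."""
    return post.get("image") or post.get("image_lowquality")


def best_images(post: Dict[str, Any]) -> List[str]:
    """Prefer full resolution gallery URLs; fall back to low quality ones."""
    # Drop None placeholders such as [None, None, None, None].
    full = [url for url in (post.get("images") or []) if url]
    if full:
        return full
    return [url for url in (post.get("images_lowquality") or []) if url]
```

Both helpers return an empty result (`None` or `[]`) when neither variant is available, so the caller can still tell a post without images apart from one with images.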
Related to issues #217 and #213.
@medevin @pmdscully which post are you having this problem with?
@neon-ninja
Any post from this group "saltandpepper.art" which has more than one image, just checked it with the latest master.
Here is an example of the post with 4 images:
{'available': True, 'comments': 8, 'comments_full': None, 'factcheck': None, 'image': None, 'image_lowquality': 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p228x119/179826016_2808295396167492_7551353435105857839_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=07e735&_nc_ohc=Qn9EdHPxwXsAX-3FaYD&_nc_ht=scontent-iev1-1.xx&tp=3&oh=90e8234262a26e4b346d49c30ad9bb54&oe=60B0C4EB', 'images': [None, None, None, None], 'is_live': False, 'likes': 68, 'link': None, 'post_id': '4037518582995301',...}
@Medevin are you using cookies? If so, what is your language set to?
For me, I receive full resolution image on my local machine but receive low quality image from my Heroku server.
Where is your heroku server located geographically?
@neon-ninja I tried with cookies, but as for me, we need a more detailed description or an example. I made a jar (with `requests.cookies.cookiejar_from_dict`) from the cookies I got from Facebook, with these fields: ['c_user', 'datr', 'dpr', 'fr', 'presence', 'sb', 'spin', 'wd', 'xs']. There's no "language" field. As a result: many images - only one low-quality image; one image - only one low-quality image.
If I use just auth, which fails: many images - only one low-quality image; one image - one high-quality image.
Here is an example of the post with 4 images: 'images': [None, None, None, None]
@Medevin thanks for confirming with an example.
... ahh, for a multi-image post (i.e. 4 images), four `none`'s is a lot less useful than four `low_quality_image_urls`...
@neon-ninja Perhaps others can confirm whether this issue has a wide effect, before considering rolling back that commit d072d7c2050e899b0f1c423d5ce92473056e05a9 -> `image=none` and `images=none`.
@pmdscully the HTML in Medevin's example is different to the other gallery type posts I've seen so far (made by a page or profile, not a group post). That commit is likely unrelated.
@pmdscully https://github.com/kevinzg/facebook-scraper/commit/bb8c9f073c630a047d93d0e58c8b563005a9a174 should make it possible to extract up to 4 low quality image URLs
and this commit (https://github.com/kevinzg/facebook-scraper/commit/c6e5b3a89d1276c4a38f2967fb38bc8eb1de1119) should make it possible to extract photoset style image galleries like https://m.facebook.com/groups/saltandpepper.art/permalink/4037518582995301/ - @Medevin please give it a try
@neon-ninja Great, it works! There's a small improvement that could be made: I've marked with '!---->' an image that is actually a settings icon but is parsed as the first gallery image. PS: if you need the HTML of the post that the parser receives, let me know.
{'available': True, 'comments': 61, 'comments_full': None, 'factcheck': None, 'image': 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164075181_892655458198343_4051013771165068554_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=2SJ8ISVoCS0AX856NIw&_nc_ht=scontent-iev1-1.xx&tp=14&oh=734211c587343de37bb4e68f4d7427fb&oe=60B49D2B&manual_redirect=1', !----> 'image_lowquality': 'https://static.xx.fbcdn.net/rsrc.php/v3/yX/r/u_9Swo8wb5U.png', 'images': ['https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164075181_892655458198343_4051013771165068554_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=2SJ8ISVoCS0AX856NIw&_nc_ht=scontent-iev1-1.xx&tp=14&oh=734211c587343de37bb4e68f4d7427fb&oe=60B49D2B&manual_redirect=1', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164535824_892655498198339_6851508125407273346_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=9KpBvpq4IsUAX9ZdbwA&_nc_ht=scontent-iev1-1.xx&tp=14&oh=c94cbb35bccbdbf1852ea5780fb5372c&oe=60B70AC9&manual_redirect=1', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164064887_892655541531668_3283686770127264306_n.jpg?_nc_cat=108&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=iiGjGdRu9z0AX-TXLrs&_nc_ht=scontent-iev1-1.xx&tp=14&oh=a6cf9b80bae46b631eb5f521377a5520&oe=60B51B6F&manual_redirect=1', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-9/fr/cp0/e15/q65/164600725_892655588198330_7187632791871277637_n.jpg?_nc_cat=111&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=CPqPnDssTf8AX-nyPkh&tn=Gvd6Z9bJT7ybSrvK&_nc_ht=scontent-iev1-1.xx&tp=14&oh=fb5846bac3df06a4cf21798ade39fcb3&oe=60B5AB41&manual_redirect=1'], !----> 'images_lowquality': ['https://static.xx.fbcdn.net/rsrc.php/v3/yX/r/u_9Swo8wb5U.png', 
'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/s320x320/164075181_892655458198343_4051013771165068554_n.jpg?_nc_cat=110&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=2SJ8ISVoCS0AX856NIw&_nc_ht=scontent-iev1-1.xx&tp=9&oh=5212505f17cfb19e9bb38c4279bcfbd5&oe=60B43FB0', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p110x80/164535824_892655498198339_6851508125407273346_n.jpg?_nc_cat=107&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=9KpBvpq4IsUAX9ZdbwA&_nc_ht=scontent-iev1-1.xx&tp=3&oh=4ee6ccd7ac6f4aa55eb2920d4fbdacf6&oe=60B5B16D', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p110x80/164064887_892655541531668_3283686770127264306_n.jpg?_nc_cat=108&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=iiGjGdRu9z0AX-TXLrs&_nc_ht=scontent-iev1-1.xx&tp=3&oh=e6c621818d2cfd259a2f2d2971f86612&oe=60B4284B', 'https://scontent-iev1-1.xx.fbcdn.net/v/t1.6435-0/cp0/e15/q65/p110x80/164600725_892655588198330_7187632791871277637_n.jpg?_nc_cat=111&ccb=1-3&_nc_sid=07e735&efg=eyJpIjoidCJ9&_nc_ohc=CPqPnDssTf8AX-nyPkh&tn=Gvd6Z9bJT7ybSrvK&_nc_ht=scontent-iev1-1.xx&tp=3&oh=2827ac4ca13fd0e1ca40946c0cbe80ef&oe=60B431E5'], 'is_live': False, 'likes': 120, 'link': None}
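Until the parser skips that icon, one client-side workaround is to filter it out of the returned lists. The sketch below assumes that hosts starting with `static.` (as in `static.xx.fbcdn.net/rsrc.php/...` in the output above) only serve interface assets and never post photos; that is an inference from this single example, not documented behaviour. `drop_static_assets` is a hypothetical helper, not a facebook-scraper function.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def drop_static_assets(urls: Iterable[str]) -> List[str]:
    """Remove URLs served from static.* hosts (e.g. the settings icon)."""
    kept = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if host.startswith("static."):
            # Assumption: static.* hosts serve UI assets, not post images.
            continue
        kept.append(url)
    return kept
```

Applied to `post['images_lowquality']` from the output above, this would leave the four `scontent-...` thumbnails and drop the leading `.png` icon.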
@Medevin which post are you having this problem with?
@neon-ninja this one: https://m.facebook.com/groups/saltandpepper.art/permalink/3924981784248982/ but it seems to affect all posts with more than one image in a gallery
@Medevin https://github.com/kevinzg/facebook-scraper/commit/62784a8d159f2ad22d58723c569d3fc79ae542bf should fix this problem, give it a try
@neon-ninja works, thanks)
If a post has only one image -> I receive one full-quality image. If a post has more than one image -> I receive only one low-quality image.