kevinzg / facebook-scraper

Scrape Facebook public pages without an API key
MIT License

empty time #256

Open drbourbon opened 3 years ago

drbourbon commented 3 years ago

Hi, I am getting null time when scraping this: https://www.facebook.com/mente.aliena

Also, test_parse_date.py fails on my host (Debian 10 with a custom-built Python 3.9).

Any clue what I am doing wrong? Thanks!

neon-ninja commented 3 years ago

Hi, I tried to replicate your problem in Docker, using this command:

docker run -it python:3.9-buster bash -c "git clone https://github.com/kevinzg/facebook-scraper.git;cd facebook-scraper;pip install .;pip install -r requirements-dev.txt;pytest"

And the tests passed. Can you tell me more about your custom python build? Did you compile it from source? What version of facebook-scraper do you have?

neon-ninja commented 3 years ago

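I also tested

from facebook_scraper import get_posts

# cookies.txt is a cookie file exported from a logged-in browser session
for post in get_posts("mente.aliena", cookies="cookies.txt"):
    print(post.get("post_id"), post.get("time"))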

And got

1069137256629377 2019-05-27 16:33:00
1069136873296082 2019-05-27 16:33:00
1018820048327765 2019-03-05 20:17:00
1007125329497237 2019-02-14 22:19:00
1005709032972200 2019-02-12 22:26:00
1005708119638958 2019-02-12 22:24:00
997647693778334 2019-01-31 12:29:00
992936717582765 2019-01-23 21:59:00
988481711361599 2019-01-16 20:14:00
988480194695084 2019-01-16 20:10:00
988479854695118 2019-01-16 20:09:00
985260511683719 2019-01-11 15:46:00
985141048362332 2019-01-11 11:04:00
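
To make the reported symptom easy to spot in a longer run, the loop above could collect only the posts whose time came back empty. This is a minimal sketch, not part of facebook-scraper: the helper name posts_missing_time and the sample data are hypothetical, and it only assumes that each post is a dict with "post_id" and "time" keys, as in the snippet above.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List


def posts_missing_time(posts: Iterable[Dict[str, Any]]) -> List[Any]:
    """Return the post_id of every post whose "time" is missing or None."""
    return [post.get("post_id") for post in posts if post.get("time") is None]


# Stand-in data shaped like the dicts printed above; the second entry is invented
# to imitate the "null time" case from the original report.
sample_posts = [
    {"post_id": "1069137256629377", "time": datetime(2019, 5, 27, 16, 33)},
    {"post_id": "hypothetical-post", "time": None},
]

print(posts_missing_time(sample_posts))
```

Passing the generator returned by get_posts instead of sample_posts should work the same way, since the helper only iterates once and reads two keys.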

drbourbon commented 3 years ago

Here's my pytest output

=============================================== test session starts ===============================================
platform linux -- Python 3.9.4, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /home/fabio/dev/xpaolo/facebook-scraper
plugins: vcr-1.0.2
collected 7 items

tests/test_get_posts.py sssss [ 71%]
tests/test_parse_date.py F [ 85%]
tests/test_parse_duration.py . [100%]

==================================================== FAILURES =====================================================
__ TestParseDate.test_all_dates ___

self = <test_parse_date.TestParseDate object at 0x7f11a3f27220>

def test_all_dates(self):
    for date in self.dates:
        try:
            assert parse_datetime(date) is not None
        except AssertionError as e:
            print(f'Failed to parse {date}')
            raise e

tests/test_parse_date.py:49:


self = <test_parse_date.TestParseDate object at 0x7f11a3f27220>

def test_all_dates(self):
    for date in self.dates:
        try:
            assert parse_datetime(date) is not None

E AssertionError: assert None is not None
E + where None = parse_datetime('Oct 1 at 1:00 PM')

tests/test_parse_date.py:46: AssertionError
---------------------------------------------- Captured stdout call -----------------------------------------------
Failed to parse Oct 1 at 1:00 PM
============================================= short test summary info =============================================
FAILED tests/test_parse_date.py::TestParseDate::test_all_dates - AssertionError: assert None is not None
===================================== 1 failed, 1 passed, 5 skipped in 2.03s ======================================

drbourbon commented 3 years ago

Same issue with the native Python 3.7 on Debian 10. BTW, I ended up running the library in Docker, as you did when trying to replicate the problem. Thank you @neon-ninja for your help!

neon-ninja commented 3 years ago

I tested with

docker run -it python:3.7-buster bash -c "git clone https://github.com/kevinzg/facebook-scraper.git;cd facebook-scraper;pip install .;pip install -r requirements-dev.txt;pytest"

and the tests passed. Are you saying you ran the exact same docker command and got a different result? That shouldn't happen™.

What's the output of pip freeze on your side?
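
Alongside pip freeze, a short environment report from both the host and the container might help narrow down the difference. The sketch below is only a suggestion: the function name environment_report is made up, the package list is a guess at what could influence date parsing, and including locale and timezone is a hunch rather than a diagnosis. It uses only the standard library and needs Python 3.8 or newer for importlib.metadata.

```python
import locale
import platform
import sys
import time
from importlib import metadata  # available from Python 3.8


def environment_report(packages=("facebook-scraper", "dateparser")):
    """Collect interpreter, locale, timezone and package versions in one dict."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "locale": locale.getlocale(),
        "timezone": time.tzname,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment
            report[name] = None
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running it once on the host and once inside the container, then comparing the two printouts line by line, would show whether the environments differ in anything other than the installed packages.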