newsviz / Spiders

Spiders and crawlers for news download
GNU General Public License v3.0

Exception when scraping meduza #12

Closed stroykova closed 3 years ago

stroykova commented 3 years ago

No data is being collected.

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python3.7/site-packages/scrapy/utils/defer.py", line 150, in f
    return deferred_from_coro(coro_f(*coro_args, **coro_kwargs))
  File "/code/newsbot/pipelines.py", line 36, in process_item
    dt = datetime.datetime.strptime(item["date"][0], spider.config.date_format)
  File "/usr/local/lib/python3.7/_strptime.py", line 577, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/local/lib/python3.7/_strptime.py", line 359, in _strptime
    (data_string, format))
ValueError: time data '' does not match format '%H:%M, %d %m %Y'
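
The traceback shows `item["date"][0]` arriving at `process_item` as an empty string, so `strptime` has nothing to match against `'%H:%M, %d %m %Y'`. The actual fix lives in the pull request linked in this thread and is not reproduced here; what follows is only a minimal sketch of one way to guard against an empty or malformed date. The helper name `parse_item_date` and the constant `DATE_FORMAT` are hypothetical, and only the format string itself is taken from the traceback.

```python
import datetime
from typing import Optional

# Format string copied from the traceback; in the project it comes from
# spider.config.date_format.
DATE_FORMAT = "%H:%M, %d %m %Y"


def parse_item_date(
    raw: Optional[str], date_format: str = DATE_FORMAT
) -> Optional[datetime.datetime]:
    """Hypothetical helper: parse a scraped date string.

    Returns None instead of raising when the value is missing, empty,
    or does not match the expected format.
    """
    raw = (raw or "").strip()
    if not raw:
        # The selector matched nothing useful: this is the case from the traceback.
        return None
    try:
        return datetime.datetime.strptime(raw, date_format)
    except ValueError:
        # Non-empty, but in an unexpected format.
        return None
```

A pipeline could call such a helper and skip or log the item when it returns `None` (in Scrapy, for example, by raising `scrapy.exceptions.DropItem`). That only hides the symptom, though: an empty date most likely means the spider's date selector no longer matches the page, so the selector would still need to be checked.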
Avenon commented 3 years ago

Fixed: https://github.com/newsviz/Spiders/pull/21

Avenon commented 3 years ago

A file with the corrected parsing: meduza.zip