[ted2zim::2024-03-02 08:22:20,385] DEBUG:extract_info_from_video_page: https://ted.com/talks/shawn_achor_the_happy_secret_to_better_work
[ted2zim::2024-03-02 08:22:22,109] ERROR:FAILED. An error occurred: 'videoData'
[ted2zim::2024-03-02 08:22:22,109] ERROR:'videoData'
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/ted2zim/entrypoint.py", line 193, in main
    scraper.run()
  File "/usr/local/lib/python3.11/site-packages/ted2zim/scraper.py", line 1064, in run
    if not self.extract_videos_from_topics(topic):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ted2zim/scraper.py", line 291, in extract_videos_from_topics
    total_videos_scraped = self.generate_search_results(topic)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ted2zim/scraper.py", line 258, in generate_search_results
    ) = self.extract_videos_in_search_results(result_json)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ted2zim/scraper.py", line 410, in extract_videos_in_search_results
    if self.extract_info_from_video_page(url):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ted2zim/scraper.py", line 633, in extract_info_from_video_page
    json_data = json.loads(
                ^^^^^^^^^^^
KeyError: 'videoData'
I could not reproduce the problem. I suggest adding try/except logic that at least logs the HTML content we are trying to parse, so that we have more information next time. The server probably returned unexpected content.
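A rough sketch of what that try/except could look like. The helper name `parse_video_data`, its parameters, and the assumption that `videoData` sits at the top level of the parsed JSON are all hypothetical. The real lookup in `extract_info_from_video_page` is not visible in the traceback and may be nested differently.

```python
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def parse_video_data(html_content: str, page_json: str) -> Optional[dict]:
    """Hypothetical helper: return the 'videoData' payload, or None on failure.

    html_content is the raw page we fetched; page_json is the JSON string
    extracted from it. Both names are assumptions for illustration only.
    """
    try:
        # Assumed layout: 'videoData' is a top-level key of the parsed JSON.
        return json.loads(page_json)["videoData"]
    except (KeyError, TypeError, json.JSONDecodeError) as exc:
        # Log the error and the HTML we received so the next failure
        # can be diagnosed from the task logs.
        logger.error("Unable to parse video page: %r", exc)
        logger.debug("HTML content we tried to parse:\n%s", html_content)
        return None
```

Returning `None` instead of raising would let the caller skip the talk and continue, but whether the scraper should skip or abort on such a page is a separate decision.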
Recipe: https://farm.openzim.org/recipes/ted_topic_motivation
Task: https://farm.openzim.org/pipeline/0b5f1ef7-11ff-42da-99b4-0a31d348ad26/debug