While building my own parser, I followed the tips in the README and keep getting a TypeError whenever I run the BBC parser on a story; the CNN parser fails the same way. (The other parsers seem fine, although running test_parser.py tagesschau.TagesschauParser returns no story URLs for Tagesschau.)
For BBC, which is the parser used in the README, I tried both the URL from the README and a fresh URL fetched by test_parser.py bbc.BBCParser. The error is the same either way:
ryantate@ryantate:~/dist/python/newsdiffs$ python parsers/test_parser.py bbc.BBCParser http://www.bbc.co.uk/news/uk-21649494
Traceback (most recent call last):
  File "parsers/test_parser.py", line 29, in <module>
    print unicode(parsed_article)
  File "/home/ryantate/dist/python/newsdiffs/parsers/baseparser.py", line 138, in __unicode__
    self.body,)))
TypeError: sequence item 0: expected string or Unicode, NoneType found
ryantate@ryantate:~/dist/python/newsdiffs$ python parsers/test_parser.py bbc.BBCParser http://www.bbc.co.uk/news/technology-34044506
Traceback (most recent call last):
  File "parsers/test_parser.py", line 29, in <module>
    print unicode(parsed_article)
  File "/home/ryantate/dist/python/newsdiffs/parsers/baseparser.py", line 138, in __unicode__
    self.body,)))
TypeError: sequence item 0: expected string or Unicode, NoneType found
ryantate@ryantate:~/dist/python/newsdiffs$
CNN:
ryantate@ryantate:~/dist/python/newsdiffs$ python parsers/test_parser.py cnn.CNNParser http://edition.cnn.com/2015/08/24/sport/vincenzo-nibali-tour-of-spain/index.html
Traceback (most recent call last):
  File "parsers/test_parser.py", line 29, in <module>
    print unicode(parsed_article)
  File "/home/ryantate/dist/python/newsdiffs/parsers/baseparser.py", line 138, in __unicode__
    self.body,)))
TypeError: sequence item 0: expected string or Unicode, NoneType found
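For anyone trying to reproduce this outside the repo: this kind of TypeError is what a string join raises when one of the items in the sequence is None, and "sequence item 0" points at the first item. Below is a minimal sketch, assuming (I have not verified this against baseparser.py) that __unicode__ joins several article fields ending with self.body. The field names and the helpers none_fields and join_fields are hypothetical, made up for illustration only.

```python
# Sketch only: the field names are hypothetical, not taken from baseparser.py.

def none_fields(pairs):
    """Return the names of fields whose value is None, in order."""
    return [name for name, value in pairs if value is None]


def join_fields(pairs):
    """Join field values with newlines, the way a join over article fields would."""
    return u'\n'.join(value for name, value in pairs)


# Pretend the parser failed to extract the first field (assumed scenario).
article = [
    ('date', None),
    ('title', u'Example headline'),
    ('byline', u'Example byline'),
    ('body', u'Example body text.'),
]

try:
    join_fields(article)
except TypeError as exc:
    # Same kind of failure as in the tracebacks above: item 0 is None.
    print('join failed: %s' % exc)
    print('fields that are None: %s' % none_fields(article))
```

If that assumption holds, the likely cause is that the BBC and CNN parsers no longer find whatever element they use for the first joined field, so checking which attributes are None before the join should show which selector went stale.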