Closed ArvinZJC closed 5 years ago
Keyword extractor (V2.1.0.20191105): Code that avoids the Newspaper3k 403 Client Error for some URLs has been added and has passed an initial test.
Keyword extractor (V2.5.0.20191108): After adding a try...except...else statement and "stripping" each URL, the Newspaper3k 404 Client Error and some other connection errors are now avoided.
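
The approach described above could look roughly like the following. This is only a minimal sketch, not the actual code in the repository. The names `get_news_text`, `fetch_with_newspaper`, and `USER_AGENT` are hypothetical. Setting a browser-like user agent is an assumption about how the 403 error might be avoided, and "stripping" is assumed to mean removing surrounding whitespace with `str.strip()`.

```python
from typing import Callable, Optional

# Hypothetical browser-like user agent; some sites answer 403 to default
# library user agents (an assumption, not confirmed by this issue).
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def fetch_with_newspaper(url: str) -> str:
    """Download and parse one article with Newspaper3k and return its text."""
    # Imported lazily so the wrapper below can be used without Newspaper3k.
    from newspaper import Article, Config

    config = Config()
    config.browser_user_agent = USER_AGENT
    config.request_timeout = 10

    article = Article(url, config=config)
    article.download()
    article.parse()  # raises if the download failed (for example, 403 or 404)
    return article.text


def get_news_text(
    url: str, fetch: Callable[[str], str] = fetch_with_newspaper
) -> Optional[str]:
    """Return the news text for a URL, or None if it cannot be downloaded."""
    clean_url = url.strip()  # drop stray spaces and newlines around the URL
    try:
        text = fetch(clean_url)
    except Exception as error:  # broad on purpose: HTTP and connection errors
        print(f"Skipped {clean_url!r}: {error}")
        return None
    else:
        # Runs only when no exception was raised.
        return text
```

With this shape, a caller can skip any URL for which `get_news_text` returns `None` and continue keyword extraction with the remaining articles.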
Keyword extractor (V2.0.1.20191105): The news content must be downloaded before it can be analysed to extract keywords. However, an exception may be raised for some URLs that return a 403 or 404 status code. These URLs need to be handled.