Closed: mehmetilker closed this issue 5 years ago
As you mention, this issue is not caused by news-please but by an unsuccessful installation of awscli, so I need to refer you to their installation pages or support. One note, though: the issue is not related to the Python package but to the awscli tool itself, which needs to be installed separately (e.g., on Ubuntu it would be apt install awscli). Please look up the corresponding installation routine for Windows, probably here: https://docs.aws.amazon.com/cli/latest/userguide/install-windows.html#install-msi-on-windows
PS: Thanks for the issue, though! :-) I've added a brief explanation to the readme.md that awscli needs to be installed (in addition to the one in the example script, which some users may not have seen).
Thanks for the info. The problem wasn't about awscli; I have installed and verified it. The exception was "/'awk' is not recognized as an internal or external command,". awk is a default tool on Linux, I guess. So I installed http://gnuwin32.sourceforge.net/packages/gawk.htm and added it to the PATH, but I am still getting "urllib.error.HTTPError: HTTP Error 505: HTTP Version not supported".
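A quick way to confirm that the external tools mentioned in this thread are actually visible to Python is sketched below. It only uses `shutil.which` from the standard library. The tool names `aws` (the awscli executable) and `awk` come from this thread; the helper name `missing_tools` is made up for illustration and is not part of news-please.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "aws" and "awk" are the two external commands discussed in this issue.
    missing = missing_tools(["aws", "awk"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If a tool is reported as missing right after its directory was added to the PATH, it may be worth reopening the terminal or IDE, since an already running process usually keeps the old environment.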
I think the problem is about URL construction, as some people stated here: https://stackoverflow.com/questions/23715943/python-http-error-505-http-version-not-supported
So it seems urllib constructs the URL correctly on Ubuntu but not on Windows.
Another thing about the URL is that the log says "INFO:newsplease.crawler.commoncrawl_extractor:downloading https://commoncrawl.s3.amazonaws.com/awk: '{ (local: ./cc_download_warc/https%3A%2F%2Fcommoncrawl.s3.amazonaws.com%2Fawk%3A+%27%7B)"
I think commoncrawl.s3.amazonaws.com/awk is not the right path. If awk is an application, it shouldn't appear here.
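The log line suggests that the shell's error text ("awk: '{") ended up being used as a file path, and a raw space in a URL is a likely cause of the HTTP 505 response. The sketch below shows one way such a line could be rejected before any download is attempted. The base URL is taken from the log line; the function names, the `.warc.gz` suffix check, and the sample path are assumptions made for illustration and do not reflect how news-please actually works.

```python
from urllib.parse import quote

# Base URL as it appears in the log line quoted in this thread.
BASE_URL = "https://commoncrawl.s3.amazonaws.com/"


def looks_like_warc_path(line):
    """Heuristic (an assumption): a usable path contains no whitespace
    and ends with '.warc.gz'."""
    line = line.strip()
    has_whitespace = any(ch.isspace() for ch in line)
    return bool(line) and not has_whitespace and line.endswith(".warc.gz")


def build_url(line):
    """Build a download URL, refusing lines that do not look like WARC paths."""
    if not looks_like_warc_path(line):
        raise ValueError("not a WARC path: %r" % line)
    # quote() percent-encodes unsafe characters but keeps '/' separators.
    return BASE_URL + quote(line.strip())


if __name__ == "__main__":
    # The first entry is the text seen in the log; the second is a hypothetical path.
    for candidate in ["awk: '{", "crawl-data/CC-NEWS/example.warc.gz"]:
        try:
            print(build_url(candidate))
        except ValueError as err:
            print("skipped:", err)
```

With a check like this, a failed listing command would surface as a clear "not a WARC path" message instead of an HTTP error from the server.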
Describe the bug
I cloned the repository and installed all the necessary libraries stated in requirements.txt, plus others like hurry, and then tried to run newsplease.examples.commoncrawl. The last error I got is as follows:
I assume it is the same problem described in https://github.com/fhamborg/news-please/issues/36, so I tried to install awscli, but everything was reported as "Requirement already satisfied". When I ran the example again, I got the same exception.
To Reproduce
Expected behavior
Downloading news content from the specified domains within the specified date range.
Versions (please complete the following information):