PyAr / CDPedia

CDPedia is a project to make Wikipedia accessible offline

fail to get list file: forbidden #399

Open FPSensor opened 1 year ago

FPSensor commented 1 year ago

2022-11-28 20:08:20,350 cdpetron INFO Opened succesfully image type config file 'imagtypes.yaml'
2022-11-28 20:08:20,355 cdpetron INFO Opened succesfully language config file 'languages.yaml'
2022-11-28 20:08:20,355 cdpetron INFO Dump directory: '/home/flash/CDPedia/out'
2022-11-28 20:08:20,355 cdpetron INFO Generating for language: 'es'
2022-11-28 20:08:20,355 cdpetron INFO Language config: {'include': ['Wikipedia:Acerca_de', 'Wikipedia:Limitación_general_de_responsabilidad', 'Wikipedia:Aviso_de_riesgo', 'Wikipedia:Aviso_médico', 'Wikipedia:Aviso_legal', 'Wikipedia:Aviso_de_contenido', 'Wikipedia:Derechos_de_autor', 'Wikipedia:Portal'], 'portal_index': 'Portal:Portada', 'python_docs': 'https://docs.python.org/es/3.8/archives/python-3.8.5-docs-html.tar.bz2', 'language_name': {'es': 'Castellano', 'en': 'Spanish'}}
2022-11-28 20:08:20,356 cdpetron INFO Options: nolists=False noscrap=False noclean=False test=False
2022-11-28 20:08:20,356 cdpetron INFO Date of generation saved: 20221128
2022-11-28 20:08:20,356 cdpetron INFO Getting list file: 'http://dumps.wikimedia.org/eswiki/latest/eswiki-latest-all-titles-in-ns0.gz'
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/flash/CDPedia/cdpetron.py", line 472, in <module>
    main(
  File "/home/flash/CDPedia/cdpetron.py", line 331, in main
    gendate = get_lists(language, lang_config, test)
  File "/home/flash/CDPedia/cdpetron.py", line 143, in get_lists
    u = urllib.request.urlopen(url)
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

facundobatista commented 1 year ago

Hello! If I click on this URL, I can download the file fine: 'http://dumps.wikimedia.org/eswiki/latest/eswiki-latest-all-titles-in-ns0.gz'. Is this still failing for you? Was it something transient? Thanks!
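
One way to tell a transient failure from a header-dependent one might be to probe the same URL twice from Python, once as `urllib` does by default and once with an explicit User-Agent. This is only a sketch: the idea that the 403 depends on the User-Agent is an assumption not confirmed in this thread, and `probe`, `LIST_URL`, and the `opener` parameter are hypothetical names that are not part of cdpetron.

```python
import urllib.error
import urllib.request

# URL taken from the log above.
LIST_URL = "http://dumps.wikimedia.org/eswiki/latest/eswiki-latest-all-titles-in-ns0.gz"


def probe(url, user_agent=None, opener=urllib.request.urlopen):
    """Return the HTTP status code for `url` using a HEAD request.

    If `user_agent` is given it is sent as the User-Agent header;
    otherwise urllib's default one is used. `opener` is injectable
    (hypothetical parameter) so the function can be tried offline.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    req = urllib.request.Request(url, headers=headers, method="HEAD")
    try:
        with opener(req, timeout=30) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 403 arrive as exceptions; report the code.
        return err.code
```

Calling `probe(LIST_URL)` and then `probe(LIST_URL, "some-descriptive-agent/0.1")` and comparing the two results would show whether the header makes a difference. If both return 403, or both succeed, the cause is probably elsewhere (for example something transient or network-specific).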