mediawiki-client-tools / mediawiki-dump-generator

Python 3 tools for downloading and preserving wikis
https://github.com/mediawiki-client-tools/mediawiki-scraper
GNU General Public License v3.0

Did not get a valid JSON response from the server. #107

Closed: robkam closed this issue 1 year ago

robkam commented 1 year ago

This wiki is a bit broken anyway.

$ dumpgenerator  --xml --xmlrevisions --images --api https://2020.opencircuits.com/api.php
Checking API... https://2020.opencircuits.com/api.php
API is OK: https://2020.opencircuits.com/api.php
Checking index.php... https://2020.opencircuits.com/index.php
index.php is OK
No --path argument provided. Defaulting to:
  [working_directory]/[domain_prefix]-[date]-wikidump
Which expands to:
  ./2020opencircuitscom-20230116-wikidump
--delay is the default value of 0.5
There will be a 0.5 second delay between HTTP calls in order to keep the server from timing you out.
If you know that this is unnecessary, you can manually specify '--delay 0.0'.
#########################################################################
# Welcome to DumpGenerator 0.4.0-alpha by WikiTeam (GPL v3)             #
# More info at: https://github.com/elsiehupp/wikiteam3                  #
#########################################################################

#########################################################################
# Copyright (C) 2011-2023 WikiTeam developers                           #
#                                                                       #
# This program is free software: you can redistribute it and/or modify  #
# it under the terms of the GNU General Public License as published by  #
# the Free Software Foundation, either version 3 of the License, or     #
# (at your option) any later version.                                   #
#                                                                       #
# This program is distributed in the hope that it will be useful,       #
# but WITHOUT ANY WARRANTY; without even the implied warranty of        #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the         #
# GNU General Public License for more details.                          #
#                                                                       #
# You should have received a copy of the GNU General Public License     #
# along with this program.  If not, see <http://www.gnu.org/licenses/>. #
#########################################################################

Analysing https://2020.opencircuits.com/api.php
Trying generating a new dump into a new directory...
https://2020.opencircuits.com/api.php
Getting the XML header from the API

Retrieving the XML for every page from the beginning

16 namespaces found
Trying to export all revisions from namespace 0
Trying to get wikitext from the allrevisions API and to build the XML
Traceback (most recent call last):
  File "C:\Python\Lib\site-packages\mwclient\client.py", line 441, in raw_api
    return json.loads(res, object_pairs_hook=OrderedDict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\Lib\json\__init__.py", line 359, in loads
    return cls(**kw).decode(s)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Python\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Python\Scripts\dumpgenerator.exe\__main__.py", line 7, in <module>
  File "C:\Python\Lib\site-packages\wikiteam3\dumpgenerator\__init__.py", line 26, in main
    DumpGenerator()
  File "C:\Python\Lib\site-packages\wikiteam3\dumpgenerator\dump\generator.py", line 115, in __init__
    DumpGenerator.createNewDump(config=config, other=other)
  File "C:\Python\Lib\site-packages\wikiteam3\dumpgenerator\dump\generator.py", line 128, in createNewDump
    generateXMLDump(config=config, session=other["session"])
  File "C:\Python\Lib\site-packages\wikiteam3\dumpgenerator\dump\xmldump\xml_dump.py", line 137, in generateXMLDump
    doXMLRevisionDump(config, session, xmlfile, lastPage, useAllrevisions=True)
  File "C:\Python\Lib\site-packages\wikiteam3\dumpgenerator\dump\xmldump\xml_dump.py", line 25, in doXMLRevisionDump
    for xml in getXMLRevisions(config=config, session=session, lastPage=lastPage, useAllrevision=useAllrevisions):
  File "C:\Python\Lib\site-packages\wikiteam3\dumpgenerator\dump\page\xmlrev\xml_revisions.py", line 56, in getXMLRevisionsByAllRevisions
    arvrequest = site.api(
                 ^^^^^^^^^
  File "C:\Python\Lib\site-packages\mwclient\client.py", line 285, in api
    info = self.raw_api(action, http_method, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\Lib\site-packages\mwclient\client.py", line 445, in raw_api
    raise errors.InvalidResponse(res)
mwclient.errors.InvalidResponse: Did not get a valid JSON response from the server. Check that you used the correct hostname. If you did, the server might be wrongly configured or experiencing temporary problems.
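
yzqzss commented 1 year ago

Use "http_method": "GET" in the arguments passed to the site.api(...) call shown in the traceback, so that the allrevisions request is sent as a GET.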
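
For anyone landing here with the same traceback, a minimal sketch of what that suggestion amounts to. The helper names (build_allrevisions_params, query_allrevisions) and the api_call argument are hypothetical, made up for illustration; api_call stands in for mwclient's Site.api, which the traceback shows forwarding http_method to raw_api. The arv* values are example choices, not necessarily what dumpgenerator sends.

```python
def build_allrevisions_params(namespace, limit=50):
    """Build keyword arguments for an allrevisions query sent as GET.

    Hypothetical helper: the parameter values below are illustrative only.
    """
    return {
        "http_method": "GET",  # the suggested fix: send the request as GET
        "list": "allrevisions",
        "arvnamespace": namespace,
        "arvlimit": limit,
        "arvprop": "ids|timestamp|user|comment|content",
        "arvdir": "newer",
    }


def query_allrevisions(api_call, namespace):
    """Run the query through api_call, a stand-in for mwclient's Site.api.

    api_call is assumed to accept (action, http_method=..., **kwargs).
    """
    params = build_allrevisions_params(namespace)
    return api_call("query", **params)
```

With a real mwclient Site object this would presumably be called as query_allrevisions(site.api, 0) for namespace 0. Whether GET helps on a given wiki depends on how that server is configured, so treat it as something to try rather than a guaranteed fix.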