openzim / mwoffliner

Mediawiki scraper: all your wiki articles in one highly compressed ZIM file
https://www.npmjs.com/package/mwoffliner
GNU General Public License v3.0

Bug with Paradox Wikis [EU4]: mwUrl [https://eu4.paradoxwikis.com/] is not valid. #1977

Open DataAddict42 opened 7 months ago

DataAddict42 commented 7 months ago

I was trying to download the Paradox wiki for EU4 and found that mwoffliner will not even start downloading it. It just says it's an invalid URL.

Command I ran:

mwoffliner --mwUrl="https://eu4.paradoxwikis.com/" --adminEmail="foo@bar.net" --mwWikiPath="/Europa_Universalis_4_Wiki" --mwApiPath="/api.php" --verbose --speed="4" --resume --mwModulePath="/load.php" --format="nopic,novid" --osTmpDir="/tmp" --outputDirectory="/output" --webp --customZimDescription="Europa Universalis IV - Paradox Wikis" --mwRestApiPath="/wiki/" --requestTimeout="10000"

Terminal Output:

(node:2118) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023.

Please migrate your code to use AWS SDK for JavaScript (v3).
For more information, check the migration guide at https://a.co/7PzMCcy
(Use `node --trace-warnings ...` to show where the warning was created)
[error] [2024-01-21T21:37:46.813Z] Failed to run mwoffliner after [0s]: {
    "stack": "Error: mwUrl [https://eu4.paradoxwikis.com/] is not valid.\n    at file:///usr/lib/node_modules/mwoffliner/lib/sanitize-argument.js:103:15\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async sanitize_mwUrl (file:///usr/lib/node_modules/mwoffliner/lib/sanitize-argument.js:102:5)\n    at async sanitize_all (file:///usr/lib/node_modules/mwoffliner/lib/sanitize-argument.js:54:5)",
    "message": "mwUrl [https://eu4.paradoxwikis.com/] is not valid."
}
[error] [2024-01-21T21:37:46.813Z] 

**********

mwUrl [https://eu4.paradoxwikis.com/] is not valid.

**********

As I am new to this program, I am unsure whether this is caused by my not setting the correct parameters. However, I believe it is a bug, as I have not had this issue with other sites (UESP, Fandom wikis, etc.).

adilsyed518 commented 7 months ago

I have this problem with other sites as well. It seems the CA certificate is not loaded for SSL. I am looking for a configuration option that makes mwoffliner use a provided CA certificate file so that the SSL handshake can succeed. Please let me know if you find one.
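
One way to test the certificate theory outside of mwoffliner is sketched below. It is a rough diagnostic under stated assumptions, not a confirmed fix: the thread does not establish that the "is not valid" error comes from a TLS failure, and the server may be rejecting the request for another reason. The function names `probe_url` and `run_with_extra_ca` and the CA bundle path are hypothetical. `NODE_EXTRA_CA_CERTS` is a Node.js environment variable, not an mwoffliner option, and whether it changes mwoffliner's behaviour here is untested.

```shell
#!/bin/sh
# Hypothetical helper: report what curl gets back for a URL.
# Prints "HTTP <status>" when a response arrives, or curl's exit code when
# the request fails earlier (curl exits with 60 when it cannot verify the
# server certificate against its known CAs).
probe_url() {
    status=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 15 "$1") || {
        echo "curl failed with exit code $?"
        return 1
    }
    echo "HTTP $status"
}

# Hypothetical helper: run a command with Node.js told to trust the extra
# PEM certificates in the given file, in addition to its built-in root CAs.
run_with_extra_ca() {
    ca_file=$1
    shift
    env NODE_EXTRA_CA_CERTS="$ca_file" "$@"
}

# Example usage (not executed here; the CA path is a placeholder):
#   probe_url "https://eu4.paradoxwikis.com/"
#   run_with_extra_ca /path/to/ca-bundle.pem \
#       mwoffliner --mwUrl="https://eu4.paradoxwikis.com/" --adminEmail="foo@bar.net"
```

If `probe_url` prints an HTTP status instead of a curl failure, the TLS handshake itself succeeded from that machine, and the cause is more likely to lie in how the server answers the request than in a missing CA certificate.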