WikiTeam / wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2024, WikiTeam has preserved more than 600,000 wikis.
https://github.com/WikiTeam
GNU General Public License v3.0

uploader.py needs escaping #88

Open emijrp opened 10 years ago

emijrp commented 10 years ago

From nemow...@gmail.com on January 29, 2014 21:15:30

uploader.py just takes random regex-ed HTML (both from APIs and screenscraping) and then spits it out as shell commands: definitely not a good idea.

For proteopedia it also breaks with "sh: Syntax error: redirection unexpected", because one of their fields contains a single quote:

curl --location --header 'x-amz-auto-make-bucket:1' --header 'x-archive-queue-derive:0' --header 'x-archive-size-hint:8634381217' --header 'authorization: ' --header 'x-archive-meta-mediatype:web' --header 'x-archive-meta-collection:wikiteam' --header 'x-archive-meta-title:Wiki - proteopediaorg_wiki' --header 'x-archive-meta-description:proteopediaorg_wiki dumped with WikiTeam tools.' --header 'x-archive-meta-language:English' --header 'x-archive-meta-last-updated-date:2014-01-28' --header 'x-archive-meta-subject:wiki; wikiteam; MediaWiki; proteopediaorg_wiki; proteopediaorg_wiki' --header 'x-archive-meta-licenseurl:/wiki/index.php/Proteopedia:Terms_of_Service' --header 'x-archive-meta-rights:User-added text is available under Proteopedia:Terms of Service and the CC-BY-SA 3.0 License.
Content aggregated by Proteopedia from external resources falls under the respective resources' copyrights. See the Terms of Service
' --header 'x-archive-meta-originalurl: http://www.proteopedia.org/wiki/index.php' --upload-file proteopediaorg_wiki-20140128-wikidump.7z http://s3.us.archive.org/wiki-proteopediaorg_wiki/proteopediaorg_wiki-20140128-wikidump.7z
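A minimal sketch of the kind of escaping this issue asks for, not the actual change made in uploader.py. It assumes Python 3, where `shlex.quote` is available (on Python 2 the equivalent is `pipes.quote`). The helper name `build_curl_command`, the shortened rights text, and the target URL are hypothetical and only serve the illustration.

```python
import shlex


def build_curl_command(headers, filename, url):
    """Build a curl command string in which every value is shell-quoted.

    Hypothetical helper for illustration; not taken from uploader.py.
    """
    parts = ["curl", "--location"]
    for name, value in headers.items():
        # shlex.quote wraps the value in single quotes and escapes any
        # single quote inside it, so the shell sees exactly one argument.
        parts += ["--header", shlex.quote("%s:%s" % (name, value))]
    parts += ["--upload-file", shlex.quote(filename), shlex.quote(url)]
    return " ".join(parts)


# Shortened stand-in for the field that broke the command: note the
# apostrophe in "resources'".
rights = "Content falls under the respective resources' copyrights."
command = build_curl_command(
    {"x-archive-meta-rights": rights},
    "example-wikidump.7z",
    "http://s3.us.archive.org/wiki-example/example-wikidump.7z",  # placeholder URL
)

# Parsing the string back with shell rules recovers the original value intact.
recovered = shlex.split(command)
```

Quoting each value at the point where the command string is assembled means scraped text can no longer terminate a quoted argument early, which is what produced the "redirection unexpected" error above.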

Original issue: http://code.google.com/p/wikiteam/issues/detail?id=88

emijrp commented 10 years ago

From nemow...@gmail.com on January 29, 2014 13:27:14

Should be fixed by r935.

Status: Fixed

emijrp commented 10 years ago

From nemow...@gmail.com on January 29, 2014 14:35:07

And, properly, by its follow-up r936. Now more thoroughly tested and verified working (on Linux).

emijrp commented 10 years ago

From nemow...@gmail.com on January 31, 2014 04:34:57

Reopening, I got a "sh: Syntax error: Unterminated quoted string" for http://www.farete.it/mediawiki/api.php . No time to investigate right now.

Status: Started

emijrp commented 10 years ago

From nemow...@gmail.com on February 26, 2014 15:29:58

Will probably be fixed by eliminating nasty bash commands, i.e. issue 54.

Blockedon: wikiteam:54
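One way to read "eliminating bash commands" is to stop building a shell string at all and hand curl its arguments as a list, so that no shell ever parses the scraped text. The sketch below is an assumption about what that could look like, not the change tracked in issue 54. The names `build_curl_argv` and `upload` are hypothetical, and collapsing whitespace in header values is an added assumption, made because an HTTP header value has to fit on one line.

```python
import subprocess


def build_curl_argv(headers, filename, url):
    """Return curl's arguments as a list (hypothetical helper)."""
    argv = ["curl", "--location"]
    for name, value in headers.items():
        # Assumption: collapse newlines and repeated spaces so the value
        # is a single line, as an HTTP header requires.
        single_line = " ".join(value.split())
        argv += ["--header", "%s:%s" % (name, single_line)]
    argv += ["--upload-file", filename, url]
    return argv


def upload(headers, filename, url):
    """Run curl without a shell; returns curl's exit status."""
    # Without shell=True each list element reaches curl as exactly one
    # argument, so quotes in the metadata need no escaping.
    return subprocess.call(build_curl_argv(headers, filename, url))
```

With this shape a value such as `resources' copyrights` is passed through unchanged, and errors like "Unterminated quoted string" cannot come from the shell because no shell is involved.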