WikiTeam / wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2024, WikiTeam has preserved more than 600,000 wikis.
https://github.com/WikiTeam
GNU General Public License v3.0

dumpgenerator.py crashes on some images without higher resolution #63

Closed emijrp closed 10 years ago

emijrp commented 10 years ago

From lugu...@gmail.com on August 23, 2013 18:15:58

Running r829. I have fully dumped 35 Wikia wikis so far, but the script crashed on 3 more Wikia wikis and 1 non-Wikia MediaWiki wiki.

None of the affected media files has a higher-resolution version; maybe this is the reason. List of media files:

http://pt.saintseiya.wikia.com/wiki/Ficheiro:Atena.png
http://pt-br.ben10.wikia.com/wiki/Arquivo:185px-Alien_Planta.jpg
http://wikimerda.org/wiki/Arquivo:331px-Longcat.jpg

See attachments for commands and log files.

(phineaseferb_ptbr.sh hasn't generated any log files so far. I will update this report when/if it does.)

Attachment: ben10_ptbr.log ben10_ptbr.sh phineaseferb_ptbr.sh saintseiya_pt.log saintseiya_pt.sh wikimerda_org.log wikimerda_org.sh

Original issue: http://code.google.com/p/wikiteam/issues/detail?id=63

emijrp commented 10 years ago

From lugu...@gmail.com on August 23, 2013 16:01:58

phineaseferb_ptbr.sh finished successfully without a single change on my side (maybe something changed on Wikia's side? o.O)

Will try to run the other ones listed here again in a few hours (my VPS is currently overloaded).

emijrp commented 10 years ago

From nemow...@gmail.com on September 09, 2013 05:45:05

Will try to run the other ones listed here again in a few hours

Lugusto, did it work in the end?

emijrp commented 10 years ago

From lugu...@gmail.com on September 21, 2013 15:56:40

So sorry for the long delay. I'm running those projects again on a different VPS, using r831, and everything has worked fine so far (wikimerda_org.sh finished with no errors).

emijrp commented 10 years ago

From lugu...@gmail.com on September 22, 2013 18:06:17

Not fixed...

Checking api.php... http://pt.saintseiya.wikia.com/api.php
api.php is OK
Checking index.php... http://pt.saintseiya.wikia.com/index.php
index.php is OK
Analysing http://pt.saintseiya.wikia.com/api.php
Loading config file...
Resuming previous dump process...
Title list was completed in the previous session
XML dump was completed in the previous session
Image list was completed in the previous session
3485 images were found in the directory from a previous session
Retrieving images from "Atena.png"
Traceback (most recent call last):
  File "./dumpgenerator.py", line 1161, in <module>
    main()
  File "./dumpgenerator.py", line 1152, in main
    resumePreviousDump(config=config, other=other)
  File "./dumpgenerator.py", line 1068, in resumePreviousDump
    generateImageDump(config=config, other=other, images=images, start=lastfilename2) # we resume from previous image, which may be corrupted (or missing .desc) by the previous session ctrl-c or abort
  File "./dumpgenerator.py", line 652, in generateImageDump
    urllib.urlretrieve(url=url, filename='%s/%s' % (imagepath, filename2), data=urllib.urlencode({})) #fix, image request fails on wikipedia (POST neither works?)
  File "/usr/lib/python2.7/urllib.py", line 93, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "/usr/lib/python2.7/urllib.py", line 243, in retrieve
    tfp = open(filename, 'wb')
IOError: [Errno 2] No such file or directory: './ptsaintseiyawikiacom-20130921-wikidump/images/Atena/.jpg'
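The final IOError in the traceback looks like what happens when a filename containing a slash is joined onto the images directory: the part before the slash is read as a subdirectory that does not exist. Below is a minimal sketch of that failure mode in Python 3 (the traceback itself comes from Python 2.7). The helper `try_save` and the temporary directory layout are hypothetical and are not part of dumpgenerator.py.

```python
import errno
import os
import tempfile


def try_save(imagepath, filename):
    """Try to create imagepath/filename; return the errno on failure, else None."""
    # Same '%s/%s' joining pattern as the urlretrieve call in the traceback.
    target = '%s/%s' % (imagepath, filename)
    try:
        with open(target, 'wb') as handle:
            handle.write(b'')
    except OSError as error:  # IOError is an alias of OSError in Python 3
        return error.errno
    return None


with tempfile.TemporaryDirectory() as dump_dir:
    imagepath = os.path.join(dump_dir, 'images')
    os.mkdir(imagepath)
    # A plain filename is created directly inside images/.
    plain_result = try_save(imagepath, 'Atena.png')
    # A filename with a slash points into images/Atena/, which was never created.
    slash_result = try_save(imagepath, 'Atena/.jpg')

print(plain_result)
print(slash_result == errno.ENOENT)
```

Under these assumptions, the second call fails with the same "[Errno 2] No such file or directory" error, because `open()` does not create intermediate directories.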

emijrp commented 10 years ago

From nemow...@gmail.com on October 25, 2013 05:43:50

Summary: dumpgenerator.py crashes on some images without higher resolution (was: dumpgenerator.py crashs on images without higher resolution)

emijrp commented 10 years ago

From nemow...@gmail.com on November 10, 2013 01:19:40

I tried this again... wikimerda worked for me too, but the others give "urllib2.HTTPError: HTTP Error 404: Not Found". Can you please find a URL at which index.php/api.php can be accessed?

emijrp commented 10 years ago

From lugu...@gmail.com on November 11, 2013 07:29:35

I will retry those from scratch in a few hours, with a possible delay of some additional hours to get the error again

emijrp commented 10 years ago

From nemow...@gmail.com on January 31, 2014 07:25:35

Of the attachments above, wikimerda_org.log shows a network problem on the wiki's side; saintseiya_pt.log and ben10_ptbr.log both show errors on filenames containing a slash. Merging into issue 86.

Status: Duplicate
Mergedinto: 86
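
For readers hitting the same slash-in-filename error, one possible workaround is to replace path separators in the image name before writing it to disk. This is only a sketch under that assumption: it is not the fix applied in issue 86, and `safe_image_filename` and `save_image` are hypothetical names that do not exist in dumpgenerator.py.

```python
import os
import tempfile


def safe_image_filename(filename, replacement='_'):
    """Replace path separators so the name stays a single path component."""
    for separator in ('/', '\\'):
        filename = filename.replace(separator, replacement)
    return filename


def save_image(imagepath, filename, content):
    """Write content under imagepath using the sanitized name; return the path."""
    target = os.path.join(imagepath, safe_image_filename(filename))
    with open(target, 'wb') as handle:
        handle.write(content)
    return target


with tempfile.TemporaryDirectory() as imagepath:
    # 'Atena/.jpg' is the problematic name from the traceback in this thread.
    saved = save_image(imagepath, 'Atena/.jpg', b'placeholder bytes')
    saved_name = os.path.basename(saved)
    saved_exists = os.path.exists(saved)

print(saved_name)
print(saved_exists)
```

Note that a plain replacement can map two different wiki filenames to the same name on disk (for example `Atena/.jpg` and `Atena_.jpg`), so a real fix would also need a way to keep such names distinct.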