shilad / wikibrain

The WikiBrain Java library enables researchers and developers to incorporate state-of-the-art Wikipedia-based algorithms and technologies in a few lines of code.
http://shilad.github.io/wikibrain/

com.github.axet.wget.info.ex.DownloadMultipartError: Multipart error #253

Open cheetah90 opened 8 years ago

cheetah90 commented 8 years ago

Environment: GUIloader, simple English, h2 DB, check Wikidata

The following error occurred when downloading Wikidata.

0.34 Part#189(0.86) Part#190(0.62) Part#191(0.49)
0.35 Part#194(0.60) Part#195(0.48) Part#196(0.18)
0.35 Part#197(0.56) Part#198(0.80) Part#199(0.80)
0.36 Part#202(0.88) Part#203(0.32) Part#204(0.22)
Exception in thread "main" com.github.axet.wget.info.ex.DownloadMultipartError: Multipart error
    at com.github.axet.wget.DirectMultipart.download(DirectMultipart.java:284)
    at com.github.axet.wget.WGet.download(WGet.java:162)
    at org.wikibrain.download.FileDownloader.download(FileDownloader.java:53)
    at org.wikibrain.wikidata.WikidataDumpLoader.main(WikidataDumpLoader.java:176)
Stage wikidata failed with exit code 1

monkey2000 commented 8 years ago

I checked the docs in axet/wget and found that our downloader code never handled DownloadMultipartError. I have now added that handling. Can you reproduce this bug and give more detail?
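For readers following along, here is a minimal sketch of what handling a multipart failure could look like: try the multipart download first and, if it fails, retry over a single connection. This is not the code that was added to WikiBrain. `FallbackDownload`, `Strategy`, `MultipartFailure`, and the example URL are all hypothetical stand-ins. A real version would presumably catch `com.github.axet.wget.info.ex.DownloadMultipartError` around the wget call.

```java
import java.util.ArrayList;
import java.util.List;

public class FallbackDownload {

    /** Hypothetical stand-in for com.github.axet.wget.info.ex.DownloadMultipartError. */
    static class MultipartFailure extends RuntimeException {
        MultipartFailure(String message) {
            super(message);
        }
    }

    /** Hypothetical download strategy; a real one would wrap the wget library. */
    interface Strategy {
        void download(String url, String target);
    }

    /**
     * Tries the multipart strategy first. If it throws MultipartFailure,
     * retries once with the single-connection strategy.
     * Returns the name of the strategy that completed the download.
     */
    static String downloadWithFallback(String url, String target,
                                       Strategy multipart, Strategy single) {
        try {
            multipart.download(url, target);
            return "multipart";
        } catch (MultipartFailure e) {
            System.err.println("Multipart download failed (" + e.getMessage()
                    + "); retrying with a single connection");
            single.download(url, target);
            return "single";
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Simulated strategies: the multipart one always fails, the single one records the call.
        Strategy failing = (url, target) -> {
            throw new MultipartFailure("Multipart error");
        };
        Strategy single = (url, target) -> log.add(url + " -> " + target);

        // Placeholder URL and file name, for illustration only.
        String used = downloadWithFallback(
                "http://example.org/dump.json.bz2", "dump.json.bz2", failing, single);
        System.out.println("Completed via " + used + ": " + log);
    }
}
```

Only the multipart-specific failure triggers the fallback. Any other exception still propagates, so unrelated problems such as a bad URL are not silently masked by a second attempt.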

cheetah90 commented 8 years ago

Hi @monkey2000, I ended up downloading the dumps directly from dumps.wikimedia.org. I think we should provide this as an alternative (at least in instructions).

As for reproducing and testing: yes, I will, but it will have to wait. Sorry, I am on a deadline, so I will probably get to this towards the end of this month. I will check back with you then.

monkey2000 commented 8 years ago

@cheetah90 No problem!

shilad commented 8 years ago

@cheetah90 @monkey2000 Do you know if multipart support is necessary for downloads? If it's not, I could publish a release with that change reverted.

monkey2000 commented 8 years ago

@shilad It clearly helps (downloads are at least 3× faster on my home network in Beijing, China). But given the strange behavior, I don't think it is stable enough to release, so reverting it might be the more reliable choice.