Okay, now this is a PR I'm not entirely sure about myself.
I have added two scripts (mostly hacks) to the repo that make compressing and uncompressing the wikiwoods database easier. Since @minimalparts has a 200MB wikiwoods database, I thought this could be of some help.
Usage is pretty straightforward:
To compress:
./compress_db wikiwoods.db
This will create a file wikiwoods_dump.bz2 which is our compressed SQL dump.
To uncompress:
./uncompress_db wikiwoods_dump.bz2
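For reference, here is a minimal sketch of what the two scripts might boil down to, assuming they wrap sqlite3's `.dump` command and bzip2. The file names and the tiny demo table are illustrative, not the actual script contents:

```shell
#!/bin/sh
# Hypothetical sketch -- not the actual compress_db/uncompress_db scripts.
set -e

DB="wikiwoods.db"            # source SQLite database (illustrative name)
DUMP="wikiwoods_dump.bz2"    # compressed SQL dump produced by compress_db

# (demo setup: a tiny stand-in database so the sketch runs end to end)
rm -f "$DB" "$DUMP" restored.db
sqlite3 "$DB" "CREATE TABLE lemmas(word TEXT); INSERT INTO lemmas VALUES('tree');"

# compress_db: dump the database as plain SQL text, then bzip2 it
sqlite3 "$DB" .dump | bzip2 -9 > "$DUMP"

# uncompress_db: rebuild a database from the compressed SQL dump
bunzip2 -c "$DUMP" | sqlite3 restored.db
sqlite3 restored.db "SELECT word FROM lemmas;"   # -> tree
```

Dumping to SQL text before compressing also tends to compress better than bzipping the binary `.db` file directly, since the dump is plain text.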
Now comes the bad bit:
The compression method used is bz2, which offers a good trade-off between speed and compression ratio. Still, I'm not sure how effective it will be on a 200MB file.
So the moral of the story is, @minimalparts, if you want to upload wikiwoods.db in a smaller, SQLite-loadable format, you can use compress_db.
To load the compressed data back, use uncompress_db.
Not proud of this one, but I'm pushing it in as a quick hack. I'll see if I can do a better job with it, or wait for our solution architect (@stultus) to suggest a better idea. ;)
P.S.: In some cases the zipped database dump seemed to have a mind of its own. A couple more hands trying it out would be great.