minimalparts / PeARS

Archive repository for the PeARS project. Please head over to https://github.com/PeARSearch/PeARS-orchard for the latest version.
MIT License

Adding helper(hack-y) scripts to compress and de-compress our db file #26

Closed nandajavarma closed 9 years ago

nandajavarma commented 9 years ago

Okay, this is a PR I myself am not sure about. I have added two scripts (mostly hacks) to the repo that make compressing and uncompressing wikiwoods easier. Since @minimalparts has a 200MB wikiwoods, I thought this could be of some help. Usage is pretty straightforward.

To compress: ./compress_db wikiwoods.db

This will create a file wikiwoods_dump.bz2, which is our compressed SQL dump.

To uncompress: ./uncompress_db wikiwoods.dump.bz2
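The scripts themselves are not shown in this thread, so the following is only a minimal Python sketch of the same idea, assuming the approach is "dump the SQLite database as SQL text, then bz2-compress the dump". The function names mirror the script names but are hypothetical, and the real scripts may well work differently (for example, by calling the sqlite3 and bzip2 command-line tools).

```python
import bz2
import sqlite3


def compress_db(db_path, dump_path):
    """Write a bz2-compressed SQL dump of the SQLite database at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        # iterdump() yields the SQL statements needed to recreate the database.
        with bz2.open(dump_path, "wt", encoding="utf-8") as out:
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()


def uncompress_db(dump_path, db_path):
    """Rebuild a SQLite database at db_path from a bz2-compressed SQL dump."""
    # Note: this reads the whole dump into memory, which may matter for
    # very large databases.
    with bz2.open(dump_path, "rt", encoding="utf-8") as src:
        script = src.read()
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(script)
    finally:
        conn.close()
```

In this sketch, uncompress_db expects db_path to point at a new or empty database; replaying the dump into a database that already contains the same tables would fail on the CREATE TABLE statements.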

Now comes the bad bit: the compression method used is bz2, which is relatively fast and efficient. Still, I am not sure how effective it will be on a 200MB file.

So the moral of the story is, @minimalparts, if you want to upload wikiwoods.db in a smaller, SQLite-readable format, you can make use of compress_db. To load the compressed data back, you can use uncompress_db.

Not proud of this thing, but I am pushing in a quick hack. I will see if I can do a better job with it, or wait for our solution architect (@stultus) to suggest a better idea. ;)

P.S.: In some cases it looked like the zipped database dump had a mind of its own. A couple more hands trying it would be great.

nandajavarma commented 9 years ago

Feel free to abuse the scripts and close the PR if anyone feels like it :yum:

minimalparts commented 9 years ago

Thanks, @nandajavarma! I'll give it a go :-)