stephen304 / bitcannon

A torrent index archiving, browsing, and backup tool
http://bitcannon.io/
MIT License
408 stars, 40 forks

Slow import towards the end of the gz file #81

Closed: OmgImAlexis closed 9 years ago

OmgImAlexis commented 9 years ago

I tried importing E002 from bitsnoop. It imported the first 1,347,223 torrents fine at a rate of 1,000 - 5,000 a second, but torrents after that loaded extremely slowly, at a rate of about 100 a second. I'm using a Macbook Pro 2010 with 8GB RAM and a 265GB SSD. I also had about 4GB of free RAM and over 100GB of free disk space, so I don't think I was hitting any IO limits.

➜  bitcannon  ./bitcannon_darwin_amd64 ../b3_all.txt.gz 
2015/10/11 15:54:24 [OK!] Connecting to Mongo at 127.0.0.1
2015/10/11 15:54:24 [OK!] Attempting to parse 
2015/10/11 15:54:24 ../b3_all.txt.gz
2015/10/11 15:54:24 [OK!] File opened
2015/10/11 15:54:24 [OK!] Extension is valid
2015/10/11 15:54:24 [OK!] GZip detected, unzipping enabled
2015/10/11 15:54:24 [OK!] Reading initialized
2015/10/11 16:25:16 [OK!] Reading completed
2015/10/11 16:25:16       1646925 torrents imported
2015/10/11 16:25:16       21386348 torrents skipped
2015/10/11 16:25:16 

Press enter to quit...

stephen304 commented 9 years ago

I'm pretty sure this is just due to MongoDB indexing. Indexing is necessary to have search queries run in under a couple of seconds instead of 2-3 minutes. (Yes, it actually took that long before I enabled indexing.)

I don't think there's much that can be done. At the beginning MongoDB doesn't have much indexing work to do, but the work is probably piling up as the import goes along.
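
As a rough check on the figures in the log above, the overall averages can be worked out from the two timestamps and the final counts. This is only a sketch: the timestamps and counts are copied from the log, while the helper name `average_rate` and the reading of imported plus skipped as the total number of records read are assumptions made for illustration. An average over the whole run also hides the slowdown itself, so it says nothing about when the rate dropped.

```python
from datetime import datetime

# Timestamps and counts copied from the import log earlier in this thread.
LOG_FORMAT = "%Y/%m/%d %H:%M:%S"
started = datetime.strptime("2015/10/11 15:54:24", LOG_FORMAT)   # "Reading initialized"
finished = datetime.strptime("2015/10/11 16:25:16", LOG_FORMAT)  # "Reading completed"
imported = 1646925
skipped = 21386348


def average_rate(count, start, end):
    """Return the mean number of records handled per second between start and end.

    Hypothetical helper for this example; it is not part of bitcannon.
    """
    seconds = (end - start).total_seconds()
    return count / seconds


elapsed_seconds = (finished - started).total_seconds()
imported_per_second = average_rate(imported, started, finished)
# Assumption: imported + skipped approximates the number of records read.
processed_per_second = average_rate(imported + skipped, started, finished)

print(f"elapsed: {elapsed_seconds:.0f} s")
print(f"imported: {imported_per_second:.0f} torrents/s on average")
print(f"read (imported + skipped): {processed_per_second:.0f} records/s on average")
```

The run lasted 1852 seconds, so the imported torrents average out to a little under 900 per second across the whole import.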

OmgImAlexis commented 9 years ago

That's fine, I just thought I'd report it in case there was an easy fix. Shall I close this?