Closed sacdallago closed 8 years ago
@sacdallago the cause seems to be the Buffer.from() method. It was introduced in Node.js 6.0.0 and we're running 4.0.0. So either I change it to new Buffer() or we update Node.js.
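A minimal compatibility sketch (not the actual pokedata code) that would work on both runtimes, assuming the usual shim trick: on Node >= 4, Buffer inherits Uint8Array.from, which has different semantics, hence the extra identity check before trusting Buffer.from:

```javascript
// Hypothetical helper: use Buffer.from() where it exists with Buffer
// semantics, otherwise fall back to the legacy constructor (Node < 6).
function bufferFrom(data, encoding) {
  if (typeof Buffer.from === 'function' && Buffer.from !== Uint8Array.from) {
    return Buffer.from(data, encoding);
  }
  return new Buffer(data, encoding); // deprecated, but works on old Node
}

console.log(bufferFrom('pikachu', 'utf8').toString()); // "pikachu"
```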
Yes there is new data. About 1.1 million sightings, around 350 MB ... so hitting the limit soon :smile:
570 MB as of now ... 500 MB was the limit right?
oh shit :D @gyachdav @goldbergtatyana @juanmirocks can someone with more authority than I ping Tim on the mongo issue?
@sacdallago I will ask
What size limit do we expect for the mongodb?
thanks @juanmirocks , also @goldbergtatyana will participate in the quest :)
I asked for a 500GB instance, but considering the growth of this db and the size of the old one, I would almost be tempted to go for a 1TB, if Tim has space left somewhere!
@PokemonGoers/pokedata please fix #143 and #146 ASAP
@sacdallago it seems to me the new twitter pokemon sightings data is somehow not being added to the pokemonsightings collection. Are the twitter credentials added somewhere, e.g. a config file? Compared to the other sources, twitter requires the credentials to be added as well.
MLAB_USERNAME=<MLAB_USERNAME> \
MLAB_PASSWORD=<MLAB_PASSWORD> \
MLAB_URI=<MLAB_URI> \
MLAB_COLLECTION=<MLAB_COLLECTION> \
CONSUMER_KEY=<CONSUMER_KEY> \
CONSUMER_SECRET=<CONSUMER_SECRET> \
ACCESS_TOKEN=<ACCESS_TOKEN> \
ACCESS_TOKEN_SECRET=<ACCESS_TOKEN_SECRET> \
NODE_ENV=<NODE_ENV> \
npm run listen -collection=twitter
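For what it's worth, a minimal sketch (not the actual project code) of failing fast when a credential variable is missing — silently dropped sightings like this are often just an unset env variable; the variable names below match the command above:

```javascript
// Hypothetical helper: validate the twitter credentials before starting the
// listener, so a missing variable fails loudly instead of silently
// producing no sightings.
function loadTwitterConfig(env) {
  const required = ['CONSUMER_KEY', 'CONSUMER_SECRET', 'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET'];
  const config = {};
  for (const key of required) {
    if (!env[key]) {
      throw new Error('Missing twitter credential: ' + key);
    }
    config[key.toLowerCase()] = env[key];
  }
  return config;
}

// Usage: const twitterConfig = loadTwitterConfig(process.env);
```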
@samitsv the twitter listener is indeed running, and with the env variables set. This needs some further digging. Although: we maxed out the mlab space, right? I got the mongo instance from the lab today, but there is another issue to solve first, otherwise it won't work.. might need some time still, which is bad.
Running manually on RostLab now:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ea097ae5da09 pokemongoers/pokedata "npm run listen -coll" 18 seconds ago Up 17 seconds 8080/tcp determined_payne
f093ddaf3d93 pokemongoers/pokedata "npm run listen -coll" 19 seconds ago Up 18 seconds 8080/tcp happy_brown
1a59e093e4fc pokemongoers/pokedata "npm run listen -coll" 20 seconds ago Up 18 seconds 8080/tcp zen_easley
25f585ba9e6b pokemongoers/pokedata "npm run listen -coll" 21 seconds ago Up 19 seconds 8080/tcp cocky_shannon
3f8ab8191c86 pokemongoers/pokedata "npm run listen -coll" 21 seconds ago Up 20 seconds 8080/tcp gigantic_jang
fe7a5eb4d7b0 pokemongoers/pokedata "npm run listen -coll" 22 seconds ago Up 21 seconds 8080/tcp big_chandrasekhar
663b7c4ecd54 pokemongoers/pokedata "npm run listen -coll" 23 seconds ago Up 21 seconds 8080/tcp agitated_mcclintock
5e2dd9f8c2c4 pokemongoers/pokedata "npm run listen -coll" 24 seconds ago Up 22 seconds 8080/tcp serene_curie
f64f040c5430 pokemongoers/pokedata "npm run listen -coll" 24 seconds ago Up 23 seconds 8080/tcp drunk_leakey
the database is populating fast (after 5 minutes)
> show dbs
local 0.078GB
pokemongo 0.203GB
count() calls, one second apart:
> db.pokemonsightings.count()
43635
> db.pokemonsightings.count()
43726
> db.pokemonsightings.count()
43858
> db.pokemonsightings.count()
43859
> db.pokemonsightings.count()
43909
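A quick back-of-the-envelope from the five counts above, assuming the samples really were about one second apart:

```javascript
// Rough insert-rate estimate from the count() samples in the shell
// transcript above (taken roughly one second apart).
const counts = [43635, 43726, 43858, 43859, 43909];
const ratePerSec = (counts[counts.length - 1] - counts[0]) / (counts.length - 1);
console.log(ratePerSec); // 68.5 documents per second
```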
If nothing goes wrong, we will have some nice data for the kaggle @gyachdav @goldbergtatyana Leaving for a 32h-awake trip now 😪 🌴 have fun guys!
@sacdallago hows the DB doing 😄 ?
It took me about 10 minutes to type this via ssh 😷 Bali is good for surfing but not on the web!
> show dbs
local 0.078GB
pokemongo 5.951GB
> db.pokemonsightings.count()
6858914
not bad for three days, guys!
@goldbergtatyana @gyachdav this data should make it to the kaggle. I started a data dump on the external drive connected to the old got virtual machine; from there I should somehow be able to copy it somewhere else in rostlab or expose it on the web (or you can ask Tim to unplug the hard drive from the VM and copy the data the old-fashioned way).
Good news everyone
The log with pokemap is:
Could someone look into that?
Everything is still running on my server. But at least the endpoint will not change now, and we will keep the listeners running for some time.
Please also let me know if you actually see the new data in the database; logging on production is not very verbose.
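A hypothetical spot-check for whether each source is still writing fresh documents — note the "source" and "createdAt" field names and the sample values are assumptions, not the actual schema:

```javascript
// Given sampled documents, report the newest timestamp per source; a stale
// timestamp for "twitter" would confirm the listener issue above.
// Field names ("source", "createdAt") are assumptions, not the real schema.
function newestBySource(docs) {
  const newest = {};
  for (const doc of docs) {
    const t = new Date(doc.createdAt).getTime();
    if (!(doc.source in newest) || t > newest[doc.source]) {
      newest[doc.source] = t;
    }
  }
  return newest;
}

// Illustrative sample only:
const sample = [
  { source: 'twitter', createdAt: '2016-08-30T10:00:00Z' },
  { source: 'other', createdAt: '2016-08-30T12:00:00Z' },
  { source: 'twitter', createdAt: '2016-08-30T11:30:00Z' }
];
console.log(newestBySource(sample));
```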