Rostlab / JS16_ProjectF

In this project we will build a web portal for our GoT data analysis and visualization system. The website will integrate all the apps created in projects B-D with the help of the integration team assigned to Project E.
GNU General Public License v3.0

Crawler broken? #516

Closed: yashha closed this issue 8 years ago

yashha commented 8 years ago

I've noticed for some time now that the crawler is not running. [screenshot]

sacdallago commented 8 years ago

Also noticed.

@marcusnovotny @julienschmidt ?

marcusnovotny commented 8 years ago

Does this happen for all characters?

No idea what happened. Just restart the app.

julienschmidt commented 8 years ago

Unfortunately both my backdoor and my crystal ball are broken 😉

sacdallago commented 8 years ago

So let me ask a couple more questions, maybe we can figure this out together :)

I didn't copy the CSV data (because I thought it would generate itself from the database) and just linked the DB. This is already something I don't quite understand... What do you need the CSVs for? Is it telling the app that it should look for the data in the database?

Now, the second thing is:

> db.charactersentiments.findOne()
{
    "_id" : ObjectId("570179d679429d1e795c2492"),
    "name" : "Robb Reyne",
    "slug" : "Robb_Reyne",
    "total" : 22,
    "positive" : 0,
    "negative" : 0,
    "popularity" : 0,
    "heat" : 22,
    "updated" : ISODate("2016-05-08T13:06:34.475Z")
}
> db.charactersentiments.findOne({"name":"Petyr Baelish"})
{
    "_id" : ObjectId("570179d679429d1e795c2a63"),
    "name" : "Petyr Baelish",
    "slug" : "Petyr_Baelish",
    "total" : 21599,
    "positive" : 7177,
    "negative" : 3349,
    "popularity" : 3828,
    "heat" : 21599,
    "updated" : ISODate("2016-05-12T08:51:49.124Z")
}

Any ideas?
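For context, queries along these lines (same collection as above; the cutoff date is just an example) would show how many characters are stuck in that state:

    // Characters whose counters never moved, i.e. possibly never crawled:
    > db.charactersentiments.find({positive: 0, negative: 0, popularity: 0}, {name: 1, updated: 1}).limit(10)

    // Characters not updated since a given date:
    > db.charactersentiments.count({updated: {$lt: ISODate("2016-05-10T00:00:00Z")}})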

julienschmidt commented 8 years ago

> I didn't copy the CSV data (because I thought it would generate itself from the database) and just linked the DB.

Yes, it should do so on the first start and on every update to that character afterwards.

> This is already something I don't quite understand... What do you need the CSVs for? Is it telling the app that it should look for the data in the database?

Because aggregating the tweets on-access is not an option. It involves heavy I/O, both in the DB itself and in transferring the tweets to the server for analysis. The CSVs are a kind of cached result: the server then just has to serve static files. Remember the difference between the cached and the non-cached API (your LRZ guy showed me some graphs yesterday)? The difference here should be a few orders of magnitude larger 😉
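A rough sketch of that pattern, with invented names and paths (not the project's actual code): the crawler rewrites a character's CSV whenever its data changes, and requests only ever touch the static file.

    // Sketch only -- CSV_DIR and both function names are made up for illustration.
    const fs = require('fs');
    const path = require('path');

    const CSV_DIR = '/var/data/sentiments'; // assumed cache directory

    // Called on first start and on every update to a character: rewrite the
    // cached CSV so no request ever triggers the expensive aggregation.
    function updateSentimentCsv(slug, rows) {
        const header = 'date,positive,negative\n';
        const body = rows.map(r => `${r.date},${r.positive},${r.negative}`).join('\n');
        fs.writeFileSync(path.join(CSV_DIR, slug + '.csv'), header + body);
    }

    // Serving is then a plain static-file read: no DB round trip, no aggregation.
    function serveSentimentCsv(slug, res) {
        res.setHeader('Content-Type', 'text/csv');
        fs.createReadStream(path.join(CSV_DIR, slug + '.csv')).pipe(res);
    }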

My assumption is that the crawler cannot write the CSVs for some reason. Check file permissions, logs, etc.
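A quick way to rule the permissions part in or out, assuming a Node environment (the directory path is invented):

    // Probe whether the crawler's user can actually write where the CSVs go.
    const fs = require('fs');
    const path = require('path');

    const CSV_DIR = '/var/data/sentiments'; // replace with the crawler's real output dir

    const probe = path.join(CSV_DIR, '.write-test');
    try {
        fs.writeFileSync(probe, 'ok'); // throws EACCES/ENOENT if the directory is not writable
        fs.unlinkSync(probe);
        console.log('CSV directory is writable');
    } catch (err) {
        console.error('Cannot write to CSV directory:', err.code);
    }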

sacdallago commented 8 years ago

@julienschmidt I checked :( It all seems to be working! Pffffff

Maybe it makes sense to run an instance locally against the same DB and copy the CSVs over from time to time?
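Something along these lines could do the periodic copy (hosts and paths are invented, and it assumes rsync is available on both machines):

    // Sync locally generated CSVs to the server; a cron job could run this hourly.
    const { execSync } = require('child_process');

    const LOCAL_CSV_DIR = '/home/me/crawler/csv/';                    // invented local path
    const REMOTE_CSV_DIR = 'deploy@got-portal:/var/data/sentiments/'; // invented target

    // -a preserves permissions and timestamps, -z compresses the transfer.
    execSync(`rsync -az ${LOCAL_CSV_DIR} ${REMOTE_CSV_DIR}`, { stdio: 'inherit' });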

yashha commented 8 years ago

Status? @julienschmidt @sacdallago :)

sacdallago commented 8 years ago

Fixed (was not an ez-pz problem to solve)