Open psivesely opened 8 years ago
I'm not sure when the features were last computed (I can't SSH into the VPSs right now for some reason, probably iptables?), but either way this is clearly needed:
fpsd=> select * from features.cell_timings order by total_elapsed_time desc limit 3;
 exampleid | total_elapsed_time
-----------+--------------------
      1106 |         930.656736
      7567 |         449.786331
      1387 |         441.871928
(3 rows)
Selenium's page load timeout is highly unreliable. If it hasn't closed the connection within 5 s of when it was supposed to, we should stop the crawl by whatever means necessary (closing all circuits will probably be sufficient, but we already have a method for restarting TB if we need to). This will stop the crawler from wasting time stuck on sites that load for minutes at a time. See
fpsd/tests/test_sketchy_sites.py
for some good example sites and a good test case for this timeout function.
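
A rough sketch of what I have in mind, not a final implementation. The names `load_with_hard_timeout`, `load_page` and `abort_crawl` are made up for illustration and are not existing fpsd functions. In the crawler, `load_page` would presumably wrap `driver.get(url)` with Selenium's `set_page_load_timeout()` already set, and `abort_crawl` would be whatever we settle on (closing all circuits, or the existing TB restart method). The 5 s grace period is the only number taken from above.

```python
import threading
import time

GRACE_SECONDS = 5.0  # extra time allowed past Selenium's own timeout


def load_with_hard_timeout(load_page, abort_crawl, page_load_timeout,
                           grace=GRACE_SECONDS):
    """Run load_page() in a worker thread and enforce a hard deadline.

    Returns True if load_page() finished before page_load_timeout + grace
    seconds. Otherwise calls abort_crawl() and returns False. Both callables
    are hypothetical hooks supplied by the caller.
    """
    errors = []

    def worker():
        try:
            load_page()
        except Exception as exc:  # e.g. Selenium's own TimeoutException
            errors.append(exc)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # Wait for the nominal timeout plus the grace period, no longer.
    thread.join(page_load_timeout + grace)

    if thread.is_alive():
        # Selenium failed to give up on its own, so force the issue.
        abort_crawl()
        return False
    if errors:
        # The load failed in time; let the caller handle it as usual.
        raise errors[0]
    return True


if __name__ == "__main__":
    # Stand-ins for a fast page and a page that hangs past the deadline.
    aborted = []
    print(load_with_hard_timeout(lambda: None,
                                 lambda: aborted.append("fast"), 0.2, grace=0.1))
    print(load_with_hard_timeout(lambda: time.sleep(1.0),
                                 lambda: aborted.append("slow"), 0.05, grace=0.05))
    print(aborted)
```

The worker thread is a daemon so a hung load can't keep the process alive. Note that this only detects the hang and calls the abort hook; the stuck thread itself only returns once the abort actually breaks the connection underneath it.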