VIDA-NYU / domain_discovery_tool_deprecated

Seed acquisition tool to bootstrap focused crawlers

Group by correlation does not work #86

Closed by aecio 8 years ago

aecio commented 8 years ago

A request to /getPages fails with the following exception because the output file path is hard-coded in runTSNESKLearn.

2016-08-16 20:07:48,524 : ERROR : [16/Aug/2016:20:07:48] HTTP 
Traceback (most recent call last):
  File "~/.anaconda2/envs/ddt/lib/python2.7/site-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "~/.anaconda2/envs/ddt/lib/python2.7/site-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "~/.anaconda2/envs/ddt/lib/python2.7/site-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "~/workspace/ddt/vis/server.py", line 274, in getPages
    data = self._crawler.getPages(session)
  File "~/workspace/ddt/vis/crawler_model_adapter.py", line 99, in getPages
    return self._crawlerModel.getPages(session)
  File "~/workspace/ddt/models/crawlermodel.py", line 996, in getPages
    return self.generatePagesProjection(hits, session)
  File "~/workspace/ddt/models/crawlermodel.py", line 1031, in generatePagesProjection
    projectionData = self.projectPages(docs, session['activeProjectionAlg'], es_info=es_info)
  File "~/workspace/ddt/models/crawlermodel.py", line 1635, in projectPages
    return self.projectionsAlg[projectionType](pages, es_info)
  File "~/workspace/ddt/models/crawlermodel.py", line 1684, in tsne
    tsnedata = CrawlerModel.runTSNESKLearn(1-data, labels, tsne_count)
  File "~/workspace/ddt/models/crawlermodel.py", line 1751, in runTSNESKLearn
    joblib.dump([result,y], '/media/data/yamuna/Memex/scripts/tsne/ddt_tsne_proj.pkl')
  File "~/.anaconda2/envs/ddt/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 402, in dump
    cache_size=cache_size, protocol=protocol)
  File "~/.anaconda2/envs/ddt/lib/python2.7/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 209, in __init__
    self.file = open(filename, 'wb')
IOError: [Errno 2] No such file or directory: '/media/data/yamuna/Memex/scripts/tsne/ddt_tsne_proj.pkl'
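
For reference, one way to avoid the hard-coded path might look like the sketch below. This is not the actual DDT fix: `dump_projection`, its `out_dir` parameter, and the `DDT_TSNE_DIR` environment variable are hypothetical names introduced for illustration, and the standard-library `pickle` stands in for `joblib.dump` so the example is self-contained.

```python
import os
import pickle
import tempfile


def dump_projection(result, labels, out_dir=None, filename="ddt_tsne_proj.pkl"):
    """Persist a projection and its labels, returning the path written.

    out_dir is a hypothetical parameter. When it is omitted, fall back to the
    (also hypothetical) DDT_TSNE_DIR environment variable, then to the
    system temp directory, instead of a machine-specific absolute path.
    """
    if out_dir is None:
        out_dir = os.environ.get("DDT_TSNE_DIR", tempfile.gettempdir())
    # Create the directory if it is missing, so open() does not fail
    # with "[Errno 2] No such file or directory".
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    path = os.path.join(out_dir, filename)
    with open(path, "wb") as fh:
        # pickle.dump stands in for joblib.dump here (assumption).
        pickle.dump([result, labels], fh)
    return path
```

With something like this, the caller decides where the file goes (or sets the environment variable once per deployment), and a missing directory is created rather than raising the IOError shown above.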

yamsgithub commented 8 years ago

This should be fixed with the latest update.