medialab / hyphe

Websites crawler with built-in exploration and control web interface
http://hyphe.medialab.sciences-po.fr/demo/
GNU Affero General Public License v3.0

Ubuntu install instructions #64

Closed robhammond closed 10 years ago

robhammond commented 11 years ago

Have tried to install several times on different clean Ubuntu servers, none successfully.

I needed to run sudo apt-get install maven and sudo apt-get install build-essential, neither of which is covered in the install instructions, in order to get the git clone working.

After that I get the web interface & lucene instance running ok, but get JSON errors on AJAX connections to localhost:6978 when trying to save any data, can't figure that one out.

bornakke commented 11 years ago

How have you configured hyphe_www_client/_config/config.js?

robhammond commented 11 years ago

Contents are as per default:

HYPHE_CONFIG = { "SERVER_ADDRESS":"http://localhost:6978", "JAVASCRIPT_LOG_VERBOSE":true }

The specific error I get when trying to connect to port 6978 is "exceptions.ValueError: No JSON object could be decoded".

On the web interface I see this red popup every few seconds "Oops - something failed when communicating with the server - Aborted"

bornakke commented 11 years ago

Is it running on a local host? And have you checked if port 6978 is open?
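For anyone following along, one way to check whether the port accepts connections is a plain TCP connect. This is only a minimal sketch: the helper name is_port_open is made up for illustration, and 6978 is simply the port from the SERVER_ADDRESS quoted above.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical usage: 6978 is the port taken from SERVER_ADDRESS in config.js
    print(is_port_open("localhost", 6978))
```

A successful connect only shows that something is listening on the port, not that it is the Hyphe core API answering correctly.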


robhammond commented 11 years ago

Yes & yes, can connect to port 6978 no problem

bornakke commented 11 years ago

Hmm, I'm running out of good suggestions then :(

boogheta commented 11 years ago

Hi there, and thanks to @bornakke for the help answering :)

@robhammond Indeed, we forgot to add maven to the list of packages required for installation. This will soon be fixed; we're currently working on a better install script for both CentOS and Debian.

Regarding the connection, this seems strange. Can you dump a bit of the content of log/hyphe-core.log?

Do you get anything on http://localhost:6978 in a browser?

Also, can you try the following and give me the output:

source $(which virtualenvwrapper.sh)
workon HCI
./hyphe_backend/test_client.py get_status
deactivate

Thanks

robhammond commented 11 years ago

Thanks for your reply - output is as follows:

CALL: get_status 
{u'code': u'success',
 u'result': {u'crawler': {u'jobs_pending': 0,
                          u'jobs_running': 0,
                          u'links_found': 0,
                          u'pages_crawled': 0,
                          u'pages_found': 0},
             u'memory_structure': {u'job_running': None,
                                   u'job_running_since': 1386164002232.344,
                                   u'last_index': 1386164002232.389,
                                   u'last_links_generation': 1386164002232.366,
                                   u'pages_to_index': 0,
                                   u'webentities': 0}}}
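The large numbers in memory_structure look like timestamps in milliseconds since the Unix epoch. That is an assumption on my part, not something the output itself states, but under it the values fall on 4 December 2013, which fits the date of this comment. A minimal sketch of the conversion (the helper name ms_to_utc is made up for illustration):

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000.0, tz=timezone.utc)


# Value copied from the get_status output above (assumed to be epoch milliseconds)
last_index = 1386164002232.389
print(ms_to_utc(last_index).isoformat())
```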

I don't appear to have a log/hyphe-core.log file - just log/hyphe-memorystructure.log.

This is the output of the 500 error I get trying to connect to localhost:6978:

web.Server Traceback (most recent call last):

   exceptions.ValueError: No JSON object could be decoded
   /home/vagrant/.virtualenvs/HCI/local/lib/python2.7/site-packages/twisted/web/server.py:189 in process
   188                    self._encoder = encoder
   189            self.render(resrc)
   190        except:
   /home/vagrant/.virtualenvs/HCI/local/lib/python2.7/site-packages/twisted/web/server.py:238 in render
   237        try:
   238            body = resrc.render(self)
   239        except UnsupportedMethod as e:
   hyphe_backend/core.tac:88 in render
   87            print "QUERY%s: %s" % (from_ip, request.content.read())
   88        return jsonrpc.JSONRPC.render(self, request)
   89
   /home/vagrant/.virtualenvs/HCI/local/lib/python2.7/site-packages/txjsonrpc/web/jsonrpc.py:91 in render
   90        content = request.content.read()
   91        parsed = jsonrpclib.loads(content)
   92        functionPath = parsed.get("method")
   /home/vagrant/.virtualenvs/HCI/local/lib/python2.7/site-packages/txjsonrpc/jsonrpclib.py:80 in loads
   79 def loads(string, **kws):
   80    unmarshalled = json.loads(string, **kws)
   81    # XXX there's going to need to be some version-conditional code here...
   /usr/lib/python2.7/json/__init__.py:326 in loads
   325            parse_constant is None and object_pairs_hook is None and not kw):
   326        return _default_decoder.decode(s)
   327    if cls is None:
   /usr/lib/python2.7/json/decoder.py:366 in decode
   365        """
   366        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
   367        end = _w(s, end).end()
   /usr/lib/python2.7/json/decoder.py:384 in raw_decode
   383        except StopIteration:
   384            raise ValueError("No JSON object could be decoded")
   385        return obj, end
   exceptions.ValueError: No JSON object could be decoded

boogheta commented 11 years ago

All right, so except for the missing log/hyphe-core.log, everything seems normal: this error is what the JSON-RPC API returns when a request arrives without a JSON body, which is what happens when you open the URL in a browser.
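To illustrate why a plain browser request fails this way, here is a minimal sketch using only Python's standard json module. The traceback shows the server calling json.loads on the request body and then reading the "method" key; the "params" and "id" fields below are assumed from the usual JSON-RPC envelope, and the helper names are made up for illustration.

```python
import json


def build_request(method, params=None, request_id=1):
    """Serialize a JSON-RPC style request body.

    Only "method" is confirmed by the traceback; "params" and "id"
    are assumed from the common JSON-RPC envelope.
    """
    return json.dumps({"method": method, "params": params or [], "id": request_id})


def try_decode(body):
    """Return the parsed object, or None if the body is not valid JSON."""
    try:
        return json.loads(body)
    except ValueError:  # on Python 3, json.JSONDecodeError is a ValueError subclass
        return None


# An empty body, as sent by a plain browser GET, cannot be decoded
print(try_decode(""))
# A serialized request naming the get_status method decodes fine
print(try_decode(build_request("get_status")))
```

The first call corresponds to the "No JSON object could be decoded" case in the traceback; the second is the kind of body test_client.py presumably sends.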

Did you start with the regular bin/hyphe start command ?

Are you trying to access http://localhost/hyphe from the same machine?

boogheta commented 11 years ago

PS: to install on a server for remote access, you need to set SERVER_ADDRESS in hyphe_www_client/_config/config.js to the corresponding domain name instead of localhost.
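As a rough sketch of what the changed config line would look like, the snippet below renders the HYPHE_CONFIG assignment for a given address. The domain hyphe.example.org is a placeholder and render_config is a made-up helper; only the two keys and the port come from the default config quoted earlier in this thread.

```python
import json


def render_config(server_address, verbose=True):
    """Render the single-line HYPHE_CONFIG assignment used in config.js."""
    config = {
        "SERVER_ADDRESS": server_address,
        "JAVASCRIPT_LOG_VERBOSE": verbose,
    }
    # json.dumps emits a valid JavaScript object literal (True becomes true)
    return "HYPHE_CONFIG = " + json.dumps(config)


# Placeholder domain: substitute the real hostname of the server
print(render_config("http://hyphe.example.org:6978"))
```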

robhammond commented 11 years ago

I've been starting it with two commands: bash bin/start_standalone_lucene.sh, then bash bin/start_standalone_core.sh

Running on a local (vagrant) server instance - no GUI on the virtual box, but with pipe from OSX to localhost on the Ubuntu vagrant box, so should work ok in theory as it's also localhost.

boogheta commented 11 years ago

Sorry, these are deprecated. You should first stop the running instances. List them with:

ps -ef | grep -E "(java|twistd)"

and kill the two processes (not the scrapyd one). Then restart using the regular starter:

bin/hyphe start

(you can then use stop or restart instead of start with the same command) and try again.

Please let me know then.

boogheta commented 6 years ago

Hi @robhammond, FYI, we finally released a new version with a more generic installation process which should allow you to easily install Hyphe now.