ckan / ckan

CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
https://ckan.org/

Jetty9 / Ubuntu 18.04 #5368

Open wood-chris opened 4 years ago

wood-chris commented 4 years ago

CKAN Version if known (or site URL)

Problem seen with both CKAN 2.8.4 and 2.9

Installation instructions don't seem to work as expected in the environment I'm using

I've followed the instructions at https://docs.ckan.org/en/latest/maintaining/installing/install-from-source.html for both Python 2 and Python 3, each time on a totally fresh install of Ubuntu 18.04:

Distributor ID: Ubuntu
Description:    Ubuntu 18.04 LTS
Release:    18.04
Codename:   bionic

As an aside, Section 4 says the command ckan generate config /etc/ckan/default/ckan.ini should be run. This results in:

$ ckan generate config /etc/ckan/default/ckan.ini

Command 'ckan' not found, did you mean:

  command 'ckon' from deb ckon
  command 'cpan' from deb perl

Try: sudo apt install <deb name>

Using the paster command in previous installation instructions does work.
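For an install script that has to cope with both releases, the config-generation step could be chosen by CKAN version. This is only a sketch: it assumes, as is pointed out later in this thread, that paster is the CLI up to 2.8 and that the `ckan` command replaces it from 2.9. The function name `config_command` is made up for illustration, and the paster invocation is the one from the 2.8 install docs as far as I know.

```shell
# config_command VERSION: print the command that generates the CKAN config
# file for the given CKAN version (assumption: paster up to 2.8, ckan from 2.9).
config_command() {
    case "$1" in
        2.[0-8]|2.[0-8].*)
            echo "paster make-config ckan /etc/ckan/default/development.ini" ;;
        *)
            echo "ckan generate config /etc/ckan/default/ckan.ini" ;;
    esac
}

# Example: print the command for the version being installed.
#   config_command 2.8.4
```

Either command only exists inside the activated CKAN virtualenv, so the script would need to activate it first.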

Section 5 has an instruction for running on Ubuntu 18.04, but creating the symlink fails:

$ sudo ln -s /etc/solr/solr-jetty.xml /var/lib/jetty9/webapps/solr.xml
ln: failed to create symbolic link '/var/lib/jetty9/webapps/solr.xml': File exists

I changed it to -sf
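In a script that may be re-run, the link step can be made idempotent. A minimal sketch, using the paths from the command above; the function name `link_context` is hypothetical.

```shell
# link_context SRC DEST: (re)create DEST as a symlink pointing at SRC.
#   -s  make a symbolic link
#   -f  replace DEST if it already exists
#   -n  if DEST is already a symlink to a directory, replace it, don't descend
link_context() {
    ln -sfn "$1" "$2"
}

# On the server this would be run as root, e.g.:
#   link_context /etc/solr/solr-jetty.xml /var/lib/jetty9/webapps/solr.xml
```

Running it twice leaves the same link in place instead of failing with "File exists".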

I then followed the rest of the instructions, but this results in a 404 for /solr:

$ curl http://localhost:8983/solr/
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/. Reason:
<pre>    Not Found</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.15.v20190215</a><hr/>

</body>
</html>
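A 404 page served by Jetty itself means Jetty answered on that port but has no web application deployed at /solr, which is a different failure from a refused connection. A rough sketch for telling the cases apart in a script; the function names are made up for illustration, and the messages are my reading of each status rather than anything from the CKAN docs.

```shell
# solr_status URL: print the HTTP status code curl receives.
# curl prints 000 when it gets no HTTP response at all.
solr_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# explain_status CODE: map a status code to the likely state of the stack.
explain_status() {
    case "$1" in
        200) echo "Solr is answering" ;;
        404) echo "Jetty is up, but nothing is deployed at this path" ;;
        000) echo "no HTTP response (for example, nothing listening on this port)" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Example:
#   explain_status "$(solr_status http://localhost:8983/solr/)"
```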

The VM provider I'm using is going to drop U16 at its EOL (next March), so I would prefer to use U18 now, so I don't have to migrate!

wood-chris commented 4 years ago

Edited to clarify that it happens on Ubuntu 18.04 regardless of whether the python 2 (CKAN v2.8.4) or python 3 (CKAN v2.9) instructions are followed

wood-chris commented 4 years ago

Perhaps related enough to https://github.com/ckan/ckan/issues/4762 that the issues can be linked?

I can't use the solution there tho - I'm installing on an OpenStack VM with passwordless SSH access, so I can't authenticate when I run systemctl daemon-reload

wood-chris commented 4 years ago

And, just to add (should have said this originally!), I'm aware of previous issues raised (including where I've been a contributor!) but I've got pretty confused about all the options available, the exact versions of which packages need to be installed, and whether package manager versions or source versions should be installed!

https://github.com/ckan/ckan/issues/4916 https://github.com/ckan/ckan/issues/4762 https://github.com/ckan/ckan/wiki/Install-and-use-Solr-6.5-with-CKAN https://github.com/ckan/ideas-and-roadmap/issues/232

I'm trying to build a pretty robust script to install CKAN that I can distribute to project partners, ideally on U18, and some of the workarounds feel both a bit clunky and fragile (although perhaps that's because I don't fully understand SOLR / jetty in detail)

Zharktas commented 4 years ago

You shouldn't be using the latest docs unless you specifically want to install the latest master. There are numerous changes between 2.8 and the future 2.9, like replacing paster with the new CLI, and there are bound to be errors in the latest docs since 2.9 is not done yet. But if you happen to find paster mentions in the latest docs, you should file an issue about them so they get fixed, as 2.9 will not have paster anymore.

But the current solution for fixing broken solr-jetty is somewhat documented here https://stackoverflow.com/questions/55939999/how-to-get-solr-and-ckan-to-run-on-ubuntu-18-04-after-recent-solr-jetty-updates/56007895#56007895 which I've adapted to shell script here https://github.com/Zharktas/ckanext-report/blob/py3/bin/travis-build.bash#L74..L85.

There is a work-in-progress PR for supporting Solr 8.4. Hopefully it will be finished before the release of 2.9, so that we can get rid of solr-jetty altogether, which is pretty unmaintained in the Ubuntu repositories.

kowh-ai commented 4 years ago

Hi @wood-chris - I can confirm that Jetty9 on Ubuntu 18.04 works with the Stack Overflow link @Zharktas mentions. Did it work for you? The CKAN 2.9 release (and docs) will hopefully clear up any confusion with (jetty/solr/OS) versions. Let's wait and see what the docs look like then.

wood-chris commented 4 years ago

@kowh-ai @Zharktas

Sorry, I've only just got round to trying this again. No - the instructions in the Stackoverflow answer don't work for me - although let me know if I've made a mistake! :)

Details:

  1. I installed ckan, without datastore, in a fresh installation of Ubuntu:
ckan@ckan:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:    18.04
Codename:   bionic

At this point, running ckan did work, but as expected there were solr errors in the terminal output.

However, I did notice that the format of /etc/default/jetty9 was totally different. The only (uncommented) content in it was

# change to 1 to prevent Jetty from starting
NO_START=0

# change to 'no' or uncomment to use the default setting in /etc/default/rcS 
VERBOSE=yes

and I added in

JETTY_HOST=127.0.0.1
JETTY_PORT=8983
  2. /etc/systemd/system/jetty9.service.d created
(default) ckan@ckan:~$ ls -ld /etc/systemd/system/jetty9.service.d
drwxr-xr-x 2 root root 4096 May 29 17:50 /etc/systemd/system/jetty9.service.d

/etc/systemd/system/jetty9.service.d/solr.conf created with content added:

(default) ckan@ckan:~$ more /etc/systemd/system/jetty9.service.d/solr.conf 
[Service]
ReadWritePaths=/var/lib/solr
  3. /etc/solr/solr-jetty.xml updated
(default) ckan@ckan:~$ more /etc/solr/solr-jetty.xml
<?xml version="1.0"  encoding="ISO-8859-1"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">

<!-- Context configuration file for the Solr web application in Jetty -->

<Configure class="org.eclipse.jetty.webapp.WebAppContext">
  <Set name="contextPath">/solr</Set>
  <Set name="war">/usr/share/solr/web</Set>

  <!-- Set the solr.solr.home system property -->
  <Call name="setProperty" class="java.lang.System">
    <Arg type="String">solr.solr.home</Arg>
    <Arg type="String">/usr/share/solr</Arg>
  </Call>

  <!-- Enable symlinks -->
  <!-- Disabled due to being deprecated
  <Call name="addAliasCheck">
    <Arg>
      <New class="org.eclipse.jetty.server.handler.ContextHandler$ApproveSameSuffixAliases"/>
    </Arg>
  </Call>
  -->
</Configure>
  4. restart:
(default) ckan@ckan:~$ sudo systemctl daemon-reload
(default) ckan@ckan:~$ sudo service jetty9 restart
  5. jetty status:
(default) ckan@ckan:~$ sudo service jetty9 status 
● jetty9.service - Jetty 9 Web Application Server
   Loaded: loaded (/lib/systemd/system/jetty9.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/jetty9.service.d
           └─solr.conf
   Active: active (running) since Sat 2020-05-30 11:39:51 UTC; 30s ago
     Docs: https://www.eclipse.org/jetty/documentation/current/
 Main PID: 5385 (java)
    Tasks: 25 (limit: 1108)
   CGroup: /system.slice/jetty9.service
           └─5385 /usr/bin/java -Djetty.home=/usr/share/jetty9 -Djetty.base=/usr/share/jetty9 -Djava.io.tmpdir=/tmp -jar /usr/share/jetty9/start.jar jetty.state=/var/lib/jetty9/jetty.stat

May 30 11:39:56 ckan jetty9[5385]: May 30, 2020 11:39:56 AM org.apache.solr.core.SolrCore execute
May 30 11:39:56 ckan jetty9[5385]: INFO: [] webapp=null path=null params={q=static+firstSearcher+warming+in+solrconfig.xml&event=firstSearcher} hits=0 status=0 QTime=82
May 30 11:39:56 ckan jetty9[5385]: May 30, 2020 11:39:56 AM org.apache.solr.core.QuerySenderListener newSearcher
May 30 11:39:56 ckan jetty9[5385]: INFO: QuerySenderListener done.
May 30 11:39:56 ckan jetty9[5385]: May 30, 2020 11:39:56 AM org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener newSearcher
May 30 11:39:56 ckan jetty9[5385]: INFO: Loading spell index for spellchecker: default
May 30 11:39:56 ckan jetty9[5385]: May 30, 2020 11:39:56 AM org.apache.solr.core.SolrCore registerSearcher
May 30 11:39:56 ckan jetty9[5385]: INFO: [] Registered new searcher Searcher@781a9412 main
May 30 11:39:56 ckan jetty9[5385]: 2020-05-30 11:39:56.688:INFO:oejs.AbstractConnector:main: Started ServerConnector@5177ca74{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
May 30 11:39:56 ckan jetty9[5385]: 2020-05-30 11:39:56.689:INFO:oejs.Server:main: Started @4777ms
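The drop-in and restart steps above could be scripted along these lines. This is a sketch only: the function name `write_dropin` is hypothetical, the file content is exactly what is shown in solr.conf above, and the systemctl calls are left as comments because they need root.

```shell
# write_dropin DIR: create DIR/solr.conf, the systemd drop-in that gives the
# jetty9 service write access to /var/lib/solr.
write_dropin() {
    mkdir -p "$1" &&
    printf '[Service]\nReadWritePaths=/var/lib/solr\n' > "$1/solr.conf"
}

# On the server, as root:
#   write_dropin /etc/systemd/system/jetty9.service.d
#   systemctl daemon-reload
#   systemctl restart jetty9
```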

...and

(default) ckan@ckan:~$ curl http://localhost:8983/solr/
curl: (7) Failed to connect to localhost port 8983: Connection refused
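The refused connection fits the status output above, where the last connector line reports `{0.0.0.0:8080}` rather than port 8983. A small sketch for pulling that address out of the Jetty log; the function name `connector_addr` is made up for illustration, and the pattern assumes the log format shown in this thread.

```shell
# connector_addr: read Jetty log lines on stdin and print the host:port from
# each "Started ServerConnector@...{...}{host:port}" line.
connector_addr() {
    sed -n 's/.*Started ServerConnector@[^{]*{[^}]*}{\([^}]*\)}.*/\1/p'
}

# Example, reading the service's journal:
#   journalctl -u jetty9 --no-pager | connector_addr
```

If the printed port is not the one in CKAN's solr_url, the two are simply not talking to each other.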

Full output from restarting ckan:

(default) ckan@ckan:~$ paster serve /etc/ckan/default/development.ini
2020-05-30 11:51:38,276 INFO  [ckan.config.environment] Loading static files from public
2020-05-30 11:51:38,283 ERROR [pysolr] Failed to connect to server at 'http://127.0.0.1:8983/solr/select/?q=%2A%3A%2A&rows=1&wt=json', are you sure that URL is correct? Checking it in a browser might help: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ee1dcb5d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
Traceback (most recent call last):
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 366, in _send_request
    timeout=self.timeout)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 488, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 596, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/adapters.py", line 487, in send
    raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ee1dcb5d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-05-30 11:51:38,290 ERROR [ckan.lib.search.common] Failed to connect to server at 'http://127.0.0.1:8983/solr/select/?q=%2A%3A%2A&rows=1&wt=json', are you sure that URL is correct? Checking it in a browser might help: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ee1dcb5d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
Traceback (most recent call last):
  File "/home/ckan/ckan/lib/default/src/ckan/ckan/lib/search/common.py", line 60, in is_available
    conn.search(q="*:*", rows=1)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 720, in search
    response = self._select(params, handler=search_handler)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 418, in _select
    return self._send_request('get', path)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 375, in _send_request
    raise SolrError(error_message % params)
SolrError: Failed to connect to server at 'http://127.0.0.1:8983/solr/select/?q=%2A%3A%2A&rows=1&wt=json', are you sure that URL is correct? Checking it in a browser might help: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ee1dcb5d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-05-30 11:51:38,291 WARNI [ckan.lib.search] Problems were found while connecting to the SOLR server
2020-05-30 11:51:38,296 INFO  [ckan.config.environment] Loading templates from /home/ckan/ckan/lib/default/src/ckan/ckan/templates
2020-05-30 11:51:38,606 ERROR [pysolr] Failed to connect to server at 'http://127.0.0.1:8983/solr/select/?q=%2A%3A%2A&rows=1&wt=json', are you sure that URL is correct? Checking it in a browser might help: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ede5b6590>: Failed to establish a new connection: [Errno 111] Connection refused',))
Traceback (most recent call last):
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 366, in _send_request
    timeout=self.timeout)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 488, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/sessions.py", line 596, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/requests/adapters.py", line 487, in send
    raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ede5b6590>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-05-30 11:51:38,610 ERROR [ckan.lib.search.common] Failed to connect to server at 'http://127.0.0.1:8983/solr/select/?q=%2A%3A%2A&rows=1&wt=json', are you sure that URL is correct? Checking it in a browser might help: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ede5b6590>: Failed to establish a new connection: [Errno 111] Connection refused',))
Traceback (most recent call last):
  File "/home/ckan/ckan/lib/default/src/ckan/ckan/lib/search/common.py", line 60, in is_available
    conn.search(q="*:*", rows=1)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 720, in search
    response = self._select(params, handler=search_handler)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 418, in _select
    return self._send_request('get', path)
  File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/pysolr.py", line 375, in _send_request
    raise SolrError(error_message % params)
SolrError: Failed to connect to server at 'http://127.0.0.1:8983/solr/select/?q=%2A%3A%2A&rows=1&wt=json', are you sure that URL is correct? Checking it in a browser might help: HTTPConnectionPool(host='127.0.0.1', port=8983): Max retries exceeded with url: /solr/select/?q=%2A%3A%2A&rows=1&wt=json (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7ede5b6590>: Failed to establish a new connection: [Errno 111] Connection refused',))
2020-05-30 11:51:38,610 WARNI [ckan.lib.search] Problems were found while connecting to the SOLR server
2020-05-30 11:51:38,615 INFO  [ckan.config.environment] Loading templates from /home/ckan/ckan/lib/default/src/ckan/ckan/templates
2020-05-30 11:51:38,700 CRITI [ckan.lib.uploader] Please specify a ckan.storage_path in your config
                         for your uploads
Starting server in PID 5512.
serving on 0.0.0.0:5000 view at http://127.0.0.1:5000

wood-chris commented 4 years ago

The fix

I've just found this issue: https://github.com/ckan/ckan/issues/4295

I didn't know about /etc/jetty9/start.ini, I've just looked and this is the default:

(default) ckan@ckan:~$ more /etc/jetty9/start.ini 
#------------------------------------------------------------------------------
#
# Jetty Startup Configuration
#
# This file contains the default settings for Jetty and configures a basic
# Servlet container with JSP and WebSocket enabled. Customized settings can
# be added to .ini files in the /etc/jetty9/start.d directory to avoid
# conflicts when updating the package.
#
#------------------------------------------------------------------------------

--module=deploy,http,jsp,jstl,websocket,ext,resources

##
## HTTP Connector Configuration
##

# What host to listen on (leave commented to listen on all interfaces)
#jetty.host=myhost.com

# HTTP port to listen on
# Enable authbind in /etc/default/jetty9 to use a port lower than 1024
jetty.port=8080

# HTTP idle timeout in milliseconds
http.timeout=30000

##
## Server Threading Configuration
##

# minimum number of threads
threads.min=10

# maximum number of threads
threads.max=200

# thread idle timeout in milliseconds
threads.timeout=60000

I changed jetty.port=8080 to jetty.port=8983 and restarted.
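For a scripted install, the same edit can be done non-interactively. A minimal sketch, assuming the file has a single uncommented `jetty.port=` line as in the default shown above; `set_jetty_port` is a hypothetical helper name.

```shell
# set_jetty_port FILE PORT: rewrite the uncommented jetty.port= line in FILE.
# sed -i.bak edits in place and keeps the original as FILE.bak.
set_jetty_port() {
    sed -i.bak "s/^jetty\.port=.*/jetty.port=$2/" "$1"
}

# On the server, as root:
#   set_jetty_port /etc/jetty9/start.ini 8983
#   systemctl restart jetty9
```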

Success!

(default) ckan@ckan:~$ curl http://localhost:8983/solr/ -Is
HTTP/1.1 200 OK
Content-Type: text/html
Set-Cookie: JSESSIONID=node0zruloi0vn4ruayd8i0cinfs92.node0;Path=/solr
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Length: 446
Server: Jetty(9.4.15.v20190215)

I'm not sure how @kowh-ai managed to get it working without editing this file? (and I can't see it mentioned in https://github.com/vrk-kpa/opendata/pull/506/files or https://github.com/ckan/ckan/issues/4762). Was there a change in a minor version upgrade in jetty9 about where the port should be specified?

kowh-ai commented 4 years ago

@wood-chris - oh yes I had to update this file for jetty to listen on 127.0.0.1:8983 - apologies if I didn't mention this!
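Listening on 127.0.0.1 only, as described here, would presumably also mean uncommenting the `jetty.host` line in the same file. A hedged sketch: it assumes `jetty.host` in start.ini is honoured the same way `jetty.port` turned out to be, which this thread does not confirm, and `set_jetty_host` is a made-up helper name.

```shell
# set_jetty_host FILE HOST: set jetty.host in FILE, uncommenting the line if
# it is commented out. The original is kept as FILE.bak.
set_jetty_host() {
    sed -i.bak "s/^#*jetty\.host=.*/jetty.host=$2/" "$1"
}

# On the server, as root:
#   set_jetty_host /etc/jetty9/start.ini 127.0.0.1
#   systemctl restart jetty9
```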