okfn / docker-ckan

Docker images and Docker Compose setup for CKAN [Not Maintained]
GNU Affero General Public License v3.0

[ckan 2.8][py 2.7] Install ckanext-spatial and get ckanext-harvest working #84

Closed ccancellieri closed 2 years ago

ccancellieri commented 2 years ago

Hi, I'm documenting what I had to fix in order to install ckanext-spatial in docker-ckan.

  1. proj and pyproj can't be installed correctly (Alpine 3.12), and this prevents the plugin from running correctly.
  2. The configured postgres db does not support PostGIS (this requires changes to the docker db service).

Problem 1 was solved by downgrading some dependencies and ignoring the versions pinned in: https://github.com/ckan/ckanext-spatial/blob/master/requirements-py2.txt

The main problem is that, for some reason, it is impossible to get proj correctly installed on the machine. I tried several approaches, up to building it from source (please don't suggest easy solutions like apk add proj-util; it simply doesn't solve the problem).

The command which proj always returns nothing.
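To reproduce the check above in a way that behaves the same in any POSIX shell, something like the following could be used. This is only a sketch: the helper names have_cmd and report_cmd are made up for illustration, and command -v is used here as the portable counterpart of which.

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME resolves to something runnable on PATH.
# (have_cmd and report_cmd are hypothetical helper names.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# report_cmd NAME: print where NAME was found, or that it is missing.
report_cmd() {
    if have_cmd "$1"; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

# The binary discussed in this issue.
report_cmd proj
```

If the last line reports "not found on PATH" even after installing the package, the binary really is absent rather than merely hidden by a shell alias or a stale hash table.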

After downgrading pyproj, it no longer complains about native bindings (it probably won't work at runtime, but at least we can move forward...).

Shapely also does not work properly in the suggested version, so I downgraded it as well.

I managed to install everything using the following:


cd "${APP_DIR}/src_extensions/ckanext-spatial/"

# The Shapely and pyproj versions from requirements-py2.txt are not supported
# on this image, so older versions are installed below instead of:
#     Shapely>=1.2.13
#     pyproj==2.2.2

# Version specifiers are quoted so the shell does not parse ">" as a redirect.
# python setup.py develop &&\
pip install -e "git+https://github.com/ckan/ckanext-spatial.git#egg=ckanext-spatial" &&\
apk add geos &&\
pip install \
    ckantoolkit \
    "GeoAlchemy>=0.6" \
    "GeoAlchemy2==0.5.0" \
    "shapely==1.3.0" \
    "pyproj==1.9.3" \
    "OWSLib==0.18.0" \
    "lxml>=2.3" \
    argparse \
    "pyparsing>=2.1.10" \
    "requests>=1.1.0" \
    six &&\

paster --plugin=ckan config-tool "${APP_DIR}/production.ini" \
    "ckan.spatial.validator.profiles=iso19139eden" &&\
paster --plugin=ckan config-tool "${APP_DIR}/production.ini" \
    "ckan.spatial.srid=4326" &&\
paster --plugin=ckan config-tool "${APP_DIR}/production.ini" \
    "ckanext.spatial.search_backend=solr" &&\
paster --plugin=ckanext-spatial spatial initdb 4326 --config="${APP_DIR}/production.ini"

This script can be saved as spatial.sh and placed under docker-entrypoint.d/.
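One detail worth keeping in mind when passing version specifiers such as GeoAlchemy>=0.6 to pip from a shell script: without quotes, the shell parses > as an output redirection, so the command receives only the bare package name and a file named =0.6 is created. The sketch below demonstrates this with echo and a placeholder name (pkg is not a real package), so it can be tried without installing anything.

```shell
#!/bin/sh
# Work in a scratch directory so the stray file is easy to spot.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Unquoted: ">" is a redirect, so echo receives only "pkg" and its
# output lands in a file named "=1.0" instead of on stdout.
unquoted=$(echo pkg>=1.0)

# Quoted: the whole specifier reaches the command as a single argument.
quoted=$(echo "pkg>=1.0")

echo "unquoted output: '$unquoted'"
echo "quoted output:   '$quoted'"

# Lists the accidental "=1.0" file created by the unquoted form.
ls
```

With pip, the unquoted form installs whatever the latest available version of the package is, which may not be what was intended.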

For problem 2, the db Dockerfile can use a PostGIS image as its base:

# FROM postgres:11-alpine
FROM mdillon/postgis:11

Create the following file docker-entrypoint-initdb.d/30_postgis_permissions.sql:

CREATE EXTENSION POSTGIS;
ALTER VIEW geometry_columns OWNER TO ckan;
ALTER TABLE spatial_ref_sys OWNER TO ckan;

If for some reason the database initialization is not performed properly, log in to the db machine, run the commands above manually, and then rebuild ckan-dev so that spatial initdb can do its job.
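Before rebuilding, it may help to confirm that the extension is actually registered in the database. The following is a minimal sketch, assuming psql is available wherever it is run (for example inside the db container) and that a connection URL is at hand; the function name postgis_enabled is hypothetical, and the use of CKAN_SQLALCHEMY_URL in the example is an assumption about the environment.

```shell
#!/bin/sh
# postgis_enabled URL: succeed if the postgis extension is registered in the
# database reachable at URL. Returns 2 if psql itself fails.
# (postgis_enabled is a hypothetical helper name.)
postgis_enabled() {
    # -t: rows only, -A: unaligned output, -c: run a single command.
    result=$(psql "$1" -tAc "SELECT 1 FROM pg_extension WHERE extname = 'postgis'") || return 2
    [ "$result" = "1" ]
}

# Example use (assumes CKAN_SQLALCHEMY_URL holds the connection URL):
#   postgis_enabled "$CKAN_SQLALCHEMY_URL" && echo "PostGIS is enabled"
```

The query reads the pg_extension catalog, so it only reports whether the extension exists; it does not check the ownership changes made by the ALTER statements.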

ccancellieri commented 2 years ago

Hi, I know this is a bit off-topic here, but it's important to know that the harvester can't work on Alpine 3.12 (the default for ckan-base-2.8). The problem is related to lxml and libxml2. In short, etree.find(...) returns corrupted XML in which the closing tag of the enclosing element (here </csw:GetRecordByIdResponse>) is not removed, resulting in XML like the following:

<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gml="http://www.opengis.net/gml" xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:geonet="http://www.fao.org/geonetwork" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xsi:schemaLocation="http://www.isotc211.org/2005/gmd https://www.isotc211.org/2005/gmd/gmd.xsd http://www.isotc211.org/2005/gmx https://www.isotc211.org/2005/gmx/gmx.xsd http://www.isotc211.org/2005/srv http://schemas.opengis.net/iso/19139/20060504/srv/srv.xsd">\n    
....
</gmd:MD_Metadata>\n
</csw:GetRecordByIdResponse>\n'

After struggling with libraries and bindings for a whole day, I decided to upgrade Alpine to 3.14, and that solved it!
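A quick way to tell whether an image is affected is to compare its Alpine release against 3.14. The sketch below assumes the usual /etc/alpine-release file is present; version_ge is a hypothetical helper that compares only the major and minor components and expects both arguments to contain at least one dot.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A >= B, comparing major.minor only.
# (version_ge is a hypothetical helper name.)
version_ge() {
    a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
    b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
    if [ "$a_major" -gt "$b_major" ]; then return 0; fi
    if [ "$a_major" -lt "$b_major" ]; then return 1; fi
    [ "$a_minor" -ge "$b_minor" ]
}

if [ -r /etc/alpine-release ]; then
    release=$(cat /etc/alpine-release)
    if version_ge "$release" 3.14; then
        echo "Alpine $release: at or above 3.14"
    else
        echo "Alpine $release: older than 3.14, the harvester issue may apply"
    fi
else
    echo "No /etc/alpine-release found; this does not look like an Alpine image"
fi
```

This only checks the base image version; it does not inspect which libxml2 build lxml is actually linked against.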

Now I'm able to harvest ISO 19115 records again and debug.