geopython / GeoHealthCheck

Service Status and QoS Checker for OGC Web Services
https://geohealthcheck.org
MIT License
84 stars 71 forks source link

Python 3 updates #280

Closed borrob closed 4 years ago

borrob commented 5 years ago

Please do NOT approve this pull request. This is Work In Progress for the python3 conversion. I have an issue with one specific test and we want to see if Travis gets the same error or if it passes without problems.

justb4 commented 5 years ago

The Travis failure is only a flake8 error.

borrob commented 5 years ago

:) yes, so the normal testing passes, so I guess it is something with my setup that the one test fails on my machine.

I haven't yet had much time to work on the docker issue, but otherwise I would say we are almost there.

justb4 commented 5 years ago

I can helpout with any Docker issues (what was the problem?).

borrob commented 5 years ago

I have been able to track the error, but haven't yet able to solve it. I have updated the Dockerfile to use python3.7.4-alpine3.10 as a starting image. Somewhere in the building process install.sh installs the python dependencies from the requirements.txt and in there: pyproj causes an error:

Proj executable not found. Please set PROJ_DIR variable.

I tried adding proj to the apk installation in the Dockerfile but to no avail. An alternative solution that I found on the web is to include the python-dev library, but that gets the development files for python 3.7.3, which make me think the virtual env with python 3.7.4 still cannot use it (and I did some other testing suggesting that solely adding python-dev is not enough).

Any ideas or suggestions are welcome. Unfortunately I won't be able to work on this over the weekend.

justb4 commented 5 years ago

Docker: Think Alpine images are problematic with binaries like GDAL and Proj. May end up doing compiles, or have things improved recently? Debian Slim is a better option. As with Stetl py3 migration we end up using python:3.6-slim-stretch. And py 3.6 i.s.o. 3.7, can't recall why. (also busy w other stuff over weekend further...)

justb4 commented 5 years ago

Sorry, forget my comment on Alpine: was mistaken with other projects (Stetl, pygeoapi, workshop) where GDAL was used. Alpine was already in use with GHC for Python 2.7. I can look into the Docker build further.

justb4 commented 5 years ago

Globally testing @borrob py3 branch on 24.09.2019:

2019-09-24 12:39:33,105 - healthcheck - ERROR - Getting metadata failed: 'str' object has no attribute 'decode'
Traceback (most recent call last):
  File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/healthcheck.py", line 247, in sniff_test_resource
    title = title.decode('utf-8')
AttributeError: 'str' object has no attribute 'decode'

Removing this line title = title.decode('utf-8') fixes that. (all str is unicode in py3 IMHO).

justb4 commented 5 years ago

Looked into the pyproj Docker build failure:

borrob commented 5 years ago

Thanks. I think we are actually almost there. From the top of my head:

borrob commented 5 years ago

Docker seems to be working now (by downgrading pyproj), no flake8 issues anymore, all tests are passing, and I merged to latest master branch.

I'll start with reviewing the documentation to see if that is in line with py3.

justb4 commented 5 years ago

Started testing locally with py3 branch, will add findings in comments:

1. all tests run OK

2. filter() returns map not list in Py3

4. Basic Auth Fails

justb4 commented 5 years ago

Tested after pulling above commits.

5 - HttpStatusNoError Never detects errors

Was testing Basic Auth. Curious why even without credentials HTTPGet Probe was successful. We trapped into the div issue from Python2 to Python3! Problem is this line in checks.py:

    def perform(self):
        """Default check: Resource should at least give no error"""
        status = self.probe.response.status_code
==>     overall_status = status / 100
        if overall_status in [4, 5]:
            self.set_result(False, 'HTTP Error status=%d' % status)

In Python2 gives floor value, but Python3 a float. Fix needs to be: overall_status = status // 100 AFAICS. (Now Basic Auth always fails, even with right credentials, but is other (encoding?) issue...)

ad 4 - Basic Auth encoding gives wrong HTTP header value

For a correctly encoded header string like 'Basic aWF...XS1\n' this line gives something like 'Basic b\'aWF...XS1\\n\'' (the \n is later always stripped).

justb4 commented 5 years ago

Ok, tested again: 3 (OWSLib TMS) and 5 (HTTP status check) above found solved. 4 (auth encoding) only with SQLite as DB.

6 - Auth info Decoding/Encoding Failure with Postgres

Only occurs when using Postgres, not with SQLite. Background: GHC encodes/encrypts a generic auth dict info structure via JSON string to be stored as textfield in DB. It decodes/decrypts when reading. This way we can support multiple auth types with a single auth column in resource table. So this is another encoding than in 4 for HTTP auth headers, but think similar problem. We also need to deal with existing PG DBs that have auth columns already present in resource table.

Analysis: problem appears in decode() (and probably encode() as well) in util.py :

def decode(key, string):
    string = base64.urlsafe_b64decode(string + b'===')
    string = string.decode('latin') if six.PY3 else string
    encoded_chars = []
    for i in range(len(string)):
        key_c = key[i % len(key)]
        encoded_c = chr((ord(string[i]) - ord(key_c) + 256) % 256)
        encoded_chars.append(encoded_c)
    encoded_string = ''.join(encoded_chars)
    return encoded_string

With SQLite string is of type bytes but with PG of type string. Tried something like:

def decode(key, string):
    if type(string) is not bytes:
        string = string.encode()
    string = base64.urlsafe_b64decode(string + b'===')

But then get other error on Edit and Test:

File "/Users/just/project/geohealthcheck/borrob.git/GeoHealthCheck/util.py", line 247, in decode
string = base64.urlsafe_b64decode(string + b'===')
File "/Users/just/.pyenv/versions/3.7.1/lib/python3.7/base64.py", line 133, in urlsafe_b64decode
return b64decode(s)
File "/Users/just/.pyenv/versions/3.7.1/lib/python3.7/base64.py", line 87, in b64decode
return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (213) cannot be 1 more than a multiple of 4
borrob commented 5 years ago

Weird... I would expect SQLAlchemy to deal with the abstraction and get the string/byte conversion right regardless of which database is used. I will look into it.

Agree: we should be able to drop six.

borrob commented 5 years ago

We're not there yet: I noticed some paver commands still need an update because of issues with the importing of modules. Also: the docker image is building, but I haven't checked yet if it is actually working. I had to change the gunicorn configuration on a non-docker deploy.

justb4 commented 5 years ago

@borrob good to see you synced with master! Will try not to do too many big changes.

The Postgres-char-issue: my bad! For security reasons the SECRET_KEY is used for encoding/decoding stored auth creds, and the Py3 instance had a different key! Using the same key: no problem. Pff, was thinking that we had a very serious encoding issue with difficult DB migrations. So we're getting closer!

borrob commented 5 years ago

I fixed one paver issue and did some testing (also with docker). I think we're good to go and I'm curious what the results of the demo environment will be.

borrob commented 5 years ago

I removed the draft tag from this pull request (that was there for the automated testing with travis). Please review and let's hope we can move to py3 soon!

justb4 commented 5 years ago

Yes, good, I only want to release 0.7.0 from current master first, and create a 07.0 maintenance branch. Then further test in particular with Docker and then merge this PR and have a quite some testing time on demo site. OK, @tomkralidis ? Let's aim for all of this before nov 1 ok?

NB solved a nasty concurrency bug (two lines) with #301 #302 today.

justb4 commented 4 years ago

A great thank-you @borrob! This was no easy PR to do.

tomkralidis commented 4 years ago

Thanks for the awesome contribution @borrob!