azavea / pfb-network-connectivity

PFB Bicycle Network Connectivity

Upgrade Django services to Python 3 #757

Closed KlaasH closed 5 years ago

KlaasH commented 5 years ago

Overview

Upgrades the django and django-q services to Python 3. Specifically 3.6, since that's the default on the new vagrant base image I picked and the highest available version for the base container we're using.

I used futurize to make most of the changes, with a few tweaks that make the code cleaner at the cost of backward compatibility. Mainly, I removed every place where a class inherited from (object), rather than adding new ones as futurize does by default. Since this isn't a library, backward compatibility isn't a priority. Some bits aren't fully migrated, though, such as from past.builtins import basestring in models.py: they work fine, they don't make the code any messier than a py3-only implementation would, and it's easier to leave them than to refactor them.
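For anyone unfamiliar with the two points above, here is a minimal sketch of what they mean under Python 3. The class and function names are hypothetical and not taken from this repo. As far as I know, past.builtins.basestring accepts both str and bytes on Python 3, so the py3-only equivalent shown here is an assumption about what a full refactor would look like, not what the PR does.

```python
# Hypothetical classes, for illustration only.
class Job:
    """Declared without an explicit base; on Python 3 it still derives from object."""


class LegacyJob(object):
    """The Python 2 spelling; on Python 3 the (object) base is redundant."""


def is_string_like(value):
    """Assumed py3-only stand-in for isinstance(value, basestring)."""
    return isinstance(value, (str, bytes))


# Both spellings produce the same base classes on Python 3.
same_bases = Job.__bases__ == LegacyJob.__bases__ == (object,)
```

Leaving the past.builtins import in place avoids having to decide, call site by call site, whether bytes should still be accepted.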

There was one incompatibility (so far) that futurize didn't catch, a check of isinstance(filename, file), which I corrected by hand. There was also an issue that made most of the tests crash: we were failing to mark a file as binary when opening it, which resulted in encoding errors way down the line. It was no less a bug under Python 2, but apparently something in the py2 code was able to handle the inconsistency where py3 can't.
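Both problems can be reproduced outside the project. The sketch below is not the code from this PR: is_file_object and the payload are hypothetical, and using io.IOBase is one common replacement for the removed file builtin, assumed here rather than confirmed as the fix that was applied.

```python
import io
import os
import tempfile


def is_file_object(candidate):
    """One possible py3 replacement for the py2 check isinstance(candidate, file)."""
    return isinstance(candidate, io.IOBase)


# Hypothetical payload: bytes that are not valid UTF-8.
payload = b"\x89PNG\r\n\x1a\n\xff\xfe"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Binary mode returns the bytes untouched.
    with open(path, "rb") as handle:
        handle_is_file = is_file_object(handle)
        binary_contents = handle.read()

    # Text mode tries to decode the same bytes and fails on Python 3.
    text_mode_failed = False
    try:
        with open(path, "r", encoding="utf-8") as handle:
            handle.read()
    except UnicodeDecodeError:
        text_mode_failed = True
finally:
    os.remove(path)
```

Python 2 read the file as a plain byte string in either mode, which is likely why the missing "b" went unnoticed there.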

Demo

Here's some output from scripts/setup that shows some of the new bits:

Building django
Step 1 : FROM quay.io/azavea/django:1.11-python3.6-slim
1.11-python3.6-slim: Pulling from azavea/django
0a4690c5d889: Pull complete
9ef510f4d0f7: Pull complete
aeb0da8c8017: Pull complete
15b578954dd5: Pull complete
5ef54b845164: Pull complete
5bd3aea7e68c: Pull complete
5af23bcbddcd: Pull complete
aa18289755e8: Pull complete
bd3ee06bfc38: Pull complete
b9dea49ff9dc: Pull complete
Digest: sha256:6caf2edbb09795715e009e16f95614769e1f0e0fdbcf465814e6c376db722a5b
Status: Downloaded newer image for quay.io/azavea/django:1.11-python3.6-slim
 ---> 135bdf86da47
Step 2 : MAINTAINER Azavea
 ---> Running in a32df207c2a0
 ---> d74149370a15
Removing intermediate container a32df207c2a0
Step 3 : RUN pip3 install --upgrade pip
 ---> Running in 67a7f4f2f0fd
Collecting pip
  Downloading https://files.pythonhosted.org/packages/8d/07/f7d7ced2f97ca3098c16565efbe6b15fafcba53e8d9bdb431e09140514b0/pip-19.2.2-py2.py3-none-any.whl (1.4MB)
Installing collected packages: pip
  Found existing installation: pip 19.2.1
    Uninstalling pip-19.2.1:
      Successfully uninstalled pip-19.2.1
Successfully installed pip-19.2.2
 ---> 633d39f74dde
Removing intermediate container 67a7f4f2f0fd
Step 4 : COPY requirements.txt /tmp/
 ---> 4ab81b6cb9d1
Removing intermediate container 7b3872b9dea8
Step 5 : RUN pip3 install --no-cache-dir -r /tmp/requirements.txt
 ---> Running in 8d66f9d1bce8

There's no way to actually tell, but this was created with a Python 3 analysis and served with a Python 3 API: [image]

Notes

Testing Instructions

Checklist

Resolves #751

KlaasH commented 5 years ago

Per discussion yesterday, upgrading Python made the OSM extract caching setup start failing.

When there's no file present in the cache bucket, the analysis job writes a lockfile to the cache bucket, confirms it got the lock, then downloads the OSM extract from Geofabrik, uploads it to the bucket, and finally removes the lockfile.

The step where it confirms it got the lock was failing, because it turns out the read() method on the object returned by boto3's S3 client returns a bytestring. So b'unique identifier' was being compared to 'unique identifier' and not matching, and the script concluded that some other job got the lock and it should wait for that one to finish.

Here's a snippet that shows the behavior (fill in your bucket name), which you can run anywhere you have a Python 3 environment with boto3 installed (e.g. ./scripts/console django):

import boto3
bucket = 'YOUR_STORAGE_BUCKET'

s3_client = boto3.client('s3')
key = 'test_file'
content = "This is a file with words in it."
# Upload a str body, then read the same object back.
s3_client.put_object(Bucket=bucket, Key=key, Body=content)
downloaded = s3_client.get_object(Bucket=bucket, Key=key)['Body'].read()
# read() returns bytes, so comparing it directly to a str never matches.
if downloaded != content:
    print("downloaded text doesn't match uploaded")
# Decoding the bytes first makes the comparison succeed.
if downloaded.decode('utf-8') == content:
    print("but it does if you decode it")

I just pushed a fix (575f975), decoding the downloaded string.
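For readers following along, the lock flow described above can be sketched without S3. This is not the code in 575f975: InMemoryBucket, fetch_with_lock, the ".lock" suffix, and the return values are all hypothetical, and the bucket stand-in only mimics the one relevant behavior, that reads come back as bytes.

```python
import uuid


class InMemoryBucket:
    """Hypothetical stand-in for the cache bucket; reads return bytes, as with boto3."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        # Store everything as bytes, whatever type was uploaded.
        self._objects[key] = body.encode("utf-8") if isinstance(body, str) else body

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        self._objects.pop(key, None)

    def exists(self, key):
        return key in self._objects


def fetch_with_lock(bucket, extract_key, download):
    """Write a lock, confirm it, download and upload the extract, remove the lock."""
    if bucket.exists(extract_key):
        return "cached"

    lock_key = extract_key + ".lock"  # assumed naming, for illustration
    lock_id = str(uuid.uuid4())
    bucket.put(lock_key, lock_id)

    # Decode before comparing: the bucket returns bytes, lock_id is a str.
    if bucket.get(lock_key).decode("utf-8") != lock_id:
        return "wait"  # some other job holds the lock

    try:
        bucket.put(extract_key, download())
    finally:
        bucket.delete(lock_key)
    return "downloaded"


bucket = InMemoryBucket()
first = fetch_with_lock(bucket, "extract.osm.pbf", lambda: b"osm-bytes")
second = fetch_with_lock(bucket, "extract.osm.pbf", lambda: b"other-bytes")
```

Without the decode call, the comparison would always be unequal on Python 3, so every job would take the "wait" branch even when it had just written the lock itself.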

ddohler commented 5 years ago

FWIW everything completed successfully and I was able to view the analysis results on the map.