WRI-Cities / static-GTFS-manager

GUI interface for creating, editing, exporting of static GTFS data for a public transit authority
GNU General Public License v3.0

Dockerize #69

Open laidig opened 6 years ago

laidig commented 6 years ago

Looks like interesting work you have done. Would you accept a version that runs in Docker?

I could contribute it in the next couple weeks.

answerquest commented 6 years ago

@laidig gladly, please do! Thanks in advance! If possible, also include a from-scratch sequence of commands for someone to install Docker and deploy the dockerized app, and explain how I can update it when there are changes (or I'll leave it to you to do at major version changes). Please let me know what you need from my end. I can download and re-upload the file into the Releases section.

laidig commented 6 years ago

There are two components to Dockerizing:

  1. Making the Dockerfile that can build docker images and adding that to the repo
  2. Using that file to build an image and then pushing that to Docker Hub.

I'd take on 1, and then we can decide on 2. Of course, docs would go with both ;)

answerquest commented 6 years ago

@laidig I'll let you take lead on this as I'm currently uninitiated regarding docker. You mentioned docs.. do share clearly on what's needed.

laidig commented 6 years ago

I made a first pass at this: https://github.com/laidig/static-GTFS-manager/tree/docker

To run from Docker:

docker pull laidig/static-gtfs-manager
docker run -it -p 5000:5000 laidig/static-gtfs-manager

Your feedback, if you have any, is appreciated.

It can likely be refined, but I haven't played around with it enough to be sure that it is working well yet.

laidig commented 6 years ago

I made a second pass (same instructions), and the docker image is now half the size of the previous one. I might be able to get it smaller still.

answerquest commented 6 years ago

@laidig that's great.. thank you so much for giving your time for this. Sorry, I'm not able to give this a spin for now; I'm working on bringing in another city's data format.

Question: if the program uses HDF5 file formats (.h5 files) to store and retrieve tables, then can a dockerized version of it run smoothly on windows? I'm exploring using HDF5, it's working fine in ubuntu but on windows I'm running into many issues with my old (and not updated) win7 boot.

Another query: If we make a dockerized version from an ubuntu OS, can it work in windows OS?

And then another query, dumbing it down even further: Can the docker version for windows run from double-clicking a .exe or shortcut? If not immediately, then is it possible to engineer such a solution?

laidig commented 6 years ago

This Docker image is running "Slim" Debian (not full Ubuntu) under another Host OS-- I'm using Mac OS. It works easily under Win10 or Windows Server because they have Hyper-V built in. It also works, though less smoothly, under Windows 7/8 via Docker Toolbox (https://docs.docker.com/toolbox/toolbox_install_windows/#step-2-install-docker-toolbox)

So yes, the file format should work when running on a Windows host because the code sees a Linux OS.

But that brings up another thing I have to resolve-- the current configuration I made doesn't necessarily save data across reboots-- I'll add a persistent volume to take care of that.

laidig commented 6 years ago

Follow up question to my last point: I'm assuming the persistent data is kept in the GTFS directory, correct?

answerquest commented 6 years ago

@laidig thanks, good to know it can work across OS's.

Yes, the program's persistent data is stored in GTFS/db.json and GTFS/sequence.json. The other files there are actually artefacts from earlier development (I had started out working with the csv's before moving to json) and I'd kept them around just to cross-check the data during development. The folders (yyyy-mm-dd-name) contain feed exports and aren't used again by the program (but are accessible to users at the program's home page in Browse.. section).

See Technical Overview wiki page for other details on how the program works.

answerquest commented 6 years ago

Hi @laidig , just a heads up, I'm working on a major overhaul that started with the way the DB is handled but has ended up including several improvements all over the place. I should be able to put something up by end of June 2018. There will be changes in the DB structure: now instead of two .json files there will be a variable number of .h5 files, one for each .txt file, and thus different operators may have differing files.
Please explore if the docker thing will be able to support HDF5 format. It is not a pure-Python thing like TinyDB, which was handling the .json files.

laidig commented 6 years ago

My guess is that it should not be a problem.

Do you have a commit I could test against?

Also, we can set up automatic builds from Github, so that with every commit to master, the Docker image is rebuilt.

answerquest commented 6 years ago

I should have a commit up by next week.

laidig commented 6 years ago

I saw your latest version and made an update of the docker image.

Do you have any set of tests (even at the level of: do this, expect this) for functionality to make sure that it's working?

answerquest commented 6 years ago

Hi @laidig , thanks for this! I was making changes myself for the next release.

From the pull request #102 I understand there's only a line to delete from .dockerignore and some edits to do in Dockerfile. Is that correct?

Asking because in the PR there's other files also getting involved so I'd rather make changes and push from my end.

answerquest commented 6 years ago

In the files and folders structure, I've renamed the GTFS folder to 'db' now. But all the other folders on the repo are also needed for the program. You had included GTFS under "volumes" heading in docker-compose.yml. Should the other folders be included there too?

laidig commented 6 years ago

Yes. GTFS to db is an important change to make.

I can test the persistence later.

answerquest commented 5 years ago

@laidig do I have to edit docker-compose.yml to specify which folders will have to be persistent?

laidig commented 5 years ago

Yes.

The Docker term for persistent storage is a volume.

answerquest commented 5 years ago

Thanks @laidig for the clarification.

I want to be able to build and deploy this project in docker from source, instead of pulling an image from docker website/repo. I followed some leads given in this guide and from what is already shared here. Sharing a full report here. I'm able to make this run, but not able to have persistent storage yet.

Build

docker build -t wri-cities/static-gtfs-manager .

It installs and creates the docker images.

Run

docker run -it -p 5000:5000 "wri-cities/static-gtfs-manager"

That works, the program launches (but doesn't launch a browser tab, that's ok), I can now operate it on http://localhost:5000 .

But the storage isn't persistent! I make data changes (create a new frequency), exit the program by Ctrl+C in terminal, then if I run it again, all the data has been reset to original.

Contents of the docker files:

docker-compose.yml :

version: '3'
services:
    static-gtfs-manager:
        ports:
            - '5000:5000'
        image: wri-cities/static-gtfs-manager
        volumes:
          - db:/app/db/

volumes:
  db:

Additional query: I want to include my config folder also as a persistent volume. But in VSCode editor when I type in config it is highlighted as a keyword. Should I put "config": instead?

.dockerignore :

.git/*
export/**/*.txt
logs/*

Dockerfile :

FROM python:3.6-slim-stretch

RUN apt-get update && apt-get -y upgrade && \
    apt-get install -y python3-pip \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /app
WORKDIR /app
COPY . /app/
RUN pip3 install -r requirements.txt --user --no-cache-dir

EXPOSE 5000

CMD cd /app/ && python3 GTFSManager.py

Abridged terminal log of build process

There is something regarding tzdata that might be relevant:

$ docker build -t wri-cities/static-gtfs-manager .
Sending build context to Docker daemon  50.13MB
Step 1/8 : FROM python:3.6-slim-stretch
3.6-slim-stretch: Pulling from library/python
f17d81b4b692: Pull complete 
(...)
Digest: sha256:537edf25490a9e0685b512dcae76382d37c38c86a1c6221896f96ee6f8f02f19
Status: Downloaded newer image for python:3.6-slim-stretch
 ---> ffafb5882b66
Step 2/8 : RUN apt-get update && apt-get -y upgrade &&     apt-get install -y python3-pip     && rm -rf /var/lib/apt/lists/*
 ---> Running in ff8234c8fd52
Ign:1 http://deb.debian.org/debian stretch InRelease
Get:2 http://security.debian.org/debian-security stretch/updates InRelease [94.3 kB]
(...)
Reading package lists...
Building dependency tree...
Reading state information...
Calculating upgrade...
The following packages will be upgraded:
  tzdata
1 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
Need to get 270 kB of archives.
(...)
Setting up tzdata (2018g-0+deb9u1) ...
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (Can't locate Term/ReadLine.pm in @INC (you may need to install the Term::ReadLine module) (@INC contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.24.1 /usr/local/share/perl/5.24.1 /usr/lib/x86_64-linux-gnu/perl5/5.24 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.24 /usr/share/perl/5.24 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .) at /usr/share/perl5/Debconf/FrontEnd/Readline.pm line 7.)
debconf: falling back to frontend: Teletype

Current default time zone: 'Etc/UTC'
Local time is now:      Fri Nov  9 03:45:20 UTC 2018.
Universal Time is now:  Fri Nov  9 03:45:20 UTC 2018.
Run 'dpkg-reconfigure tzdata' if you wish to change it.

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:

(... similar to other installations on ubuntu ...)

Removing intermediate container ff8234c8fd52
 ---> 60fdfb8ccadc
Step 3/8 : RUN mkdir -p /app
 ---> Running in 6a41e56213ce
Removing intermediate container 6a41e56213ce
 ---> d77d937ac407
Step 4/8 : WORKDIR /app
Removing intermediate container 5802f8b1aaeb
 ---> 64a105c4a8af
Step 5/8 : COPY . /app/
 ---> 09926eab527a
Step 6/8 : RUN pip3 install -r requirements.txt --user --no-cache-dir
 ---> Running in 17f0578fc86b
(...)
Removing intermediate container 17f0578fc86b
 ---> 531b4059f3ca
Step 7/8 : EXPOSE 5000
 ---> Running in 47aeb4bd1a35
Removing intermediate container 47aeb4bd1a35
 ---> 1af2e47fedb9
Step 8/8 : CMD cd /app/ && python3 GTFSManager.py
 ---> Running in 0d1c699beaf7
Removing intermediate container 0d1c699beaf7
 ---> 87b63bf07598
Successfully built 87b63bf07598
Successfully tagged wri-cities/static-gtfs-manager:latest

So, the main question : I have specified db/ folder as a persistent volume in docker-compose.yml . But that's not doing the job apparently. What will it take?

answerquest commented 5 years ago

Update: I wasn't able to figure out anything solid from the docs or the similar questions posted on stackoverflow, but through trial and error I have managed to achieve persistence of data by modifying the run command, adding a -v flag. For completeness, including the build command preceding it too:

docker build -t wri-cities/static-gtfs-manager .
docker run -it -p 5000:5000 -v persistent:/app/db "wri-cities/static-gtfs-manager"

The word 'persistent' up there can be anything; it's the name Docker gives the volume. Using a different name starts a separate persistent data store, and with that there's an opportunity:

Opportunity for multiple feeds management through docker

If running this tool in docker, users can keep multiple GTFS feeds or versions loaded by changing the label after -v in the run command. Several instances can run simultaneously if the left-side (host) port number after -p is also changed for each successive run.

Example:

Ignore the in-program URL shared; through docker you can now have two different instances of static-GTFS-Manager running with separate databases on http://localhost:5000 and http://localhost:5001 .
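
A sketch of what two such runs could look like, assuming the image tag from the build command above. The helper name instance_cmd and the volume names feed_a and feed_b are made up for illustration, and the helper only prints each command rather than running it:

```shell
#!/bin/sh
# Print (do not run) a docker run command for one isolated instance.
# instance_cmd, feed_a and feed_b are hypothetical names for illustration.
instance_cmd() {
  volume="$1"      # named volume that will hold this instance's db folder
  host_port="$2"   # left-side port: where the instance is reachable on the host
  # The right-side port stays 5000, the port the app uses inside the container.
  echo "docker run -it -p ${host_port}:5000 -v ${volume}:/app/db wri-cities/static-gtfs-manager"
}

instance_cmd feed_a 5000
instance_cmd feed_b 5001
```

Because of -it, each printed command would need its own terminal to keep both instances up at the same time.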

You can list the volumes created with this command: docker volume ls
To see the list of volume subcommands, run docker volume
To manually browse these on a Linux host, open this path in a file browser with root permissions: /var/lib/docker/volumes

Of course, if running this directly from python or windows exe you can simply clone the folder and do different business in different folders ;)


More Questions arise

Ok, after play time, this still raises some questions that I don't have an answer to right now:

answerquest commented 5 years ago

Update : I got rid of docker-compose.yml and tried the build and run commands again:

docker build -t static-gtfs-manager .
docker run -it -p 5000:5000 -v noyml:/app/db static-gtfs-manager

... and it worked! So it seems docker-compose.yml is only read by the docker-compose up command; plain docker build and docker run ignore it, which would also explain why the volume declared there had no effect on my earlier docker run.
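
To make the difference between the two workflows concrete, here is a minimal sketch. It assumes the docker-compose.yml posted earlier in this thread is still in the current directory and that the image was built under the tag it names. The helper name launch_cmd is hypothetical, and it only prints the command for each workflow instead of running it:

```shell
#!/bin/sh
# Print (do not run) the launch command for either workflow.
# launch_cmd is a hypothetical helper for illustration.
launch_cmd() {
  case "$1" in
    compose)
      # docker-compose reads docker-compose.yml from the current directory,
      # including the ports: and volumes: entries declared there.
      echo "docker-compose up"
      ;;
    run)
      # Plain docker run never reads docker-compose.yml, so the port mapping
      # and the volume have to be passed on the command line.
      echo "docker run -it -p 5000:5000 -v noyml:/app/db static-gtfs-manager"
      ;;
    *)
      echo "usage: launch_cmd compose|run" >&2
      return 1
      ;;
  esac
}

launch_cmd compose
launch_cmd run
```

Note that Compose by default prefixes volume names with the project name, so the db volume it creates is a different store from any volume created with docker run -v.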

answerquest commented 5 years ago

Edit: Update for continuity for readers: Docker business sorted out! See Running with Docker on any OS . Fresh work was done at #129 , #130

@laidig pulling your comment under Packaging for Linux to here to continue the conversation here.

Nice! I’m away from my computer for the next week, do you want to make an image in your own repository?
Now that you’re comfortable with Docker, you can also have it build automatically with every commit to master on Github.