Closed cypressf closed 3 years ago
Looks like if I use sqlx, and want to run the migration in main.rs or something, I should use this workaround for an existing bug: https://github.com/launchbadge/sqlx/issues/863
I'm trying out sqlx-cli
.
CREATE USER 'mit'@'localhost' IDENTIFIED BY '<password>';
GRANT SELECT, INSERT, UPDATE, DELETE ON `climate_risk_map`.* TO 'mit'@'localhost'
sqlx database create
but I get
error: error returned from database: 1044 (42000): Access denied for user 'mit'@'localhost' to database 'climate_risk_map'
Even after adding the proper database url to my environment vars:
DATABASE_URL=mysql://mit:<password>@localhost/climate_risk_map
Derp. I forgot to grant CREATE permissions to that user.
CREATE USER 'mit'@'localhost' IDENTIFIED BY '<password>';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP ON `climate_risk_map`.* TO 'mit'@'localhost'
set DATABASE_URL=mysql://mit:<password>@localhost/climate_risk_map
in .env
then
sqlx database create
works just fine!
I'm using https://github.com/jamesjmeyer210/actix_sqlx_mysql_user_crud as an example repo to work off of, for using sqlx to read from the database. Currently, I'm adapting it to work with sqlx 0.5 (it was written for sqlx 0.3).
The MySqlRow
object in sqlx 0.5 no longer accepts a lifetime annotation so I need to rework the Table
class to work with that.
I reworked the example of the actix CRUD app to work with the latest version of sqlx, and pushed the changes to https://github.com/cypressf/climate-risk-map-service. I also created an initial db schema. I'm working on prepopulating it with initial data. Next step will be to create a rest service to get that data, and run it on svante in a container.
Working on inserting all the county-level data now. I am creating a .csv file with all the data we currently display in the website, in a format that matches the schema of the county_data table. Work for that is in this python notebook. I am now in the process of trying to ingest the resulting csv file into mysql. It's quite large so I ran into issues with sqlx refusing to run it as a sql command.
I split the sql file into 3 parts and sqlx was happy to run those. The commands I used to split the file and reformat for sql insert are in the python notebook.
select t1.*, t2.*, t3.*
from
map_visualization_collection as t1,
map_visualization as t2,
dataset as t3
where t1.category = 1
and t2.id = t1.map_visualization
and t3.id = t2.dataset
order by t1.order
I'm trying to install this on heroku for the time being, so we can get it up-and-running on the internet before the svante container is ready. Following the rust buildpack instructions
release: ./target/release/diesel migration run
to the Procfile
)I'll need to make the data endpoint more efficient. Probably best to get all the data in csv format, or something even more efficient, instead of json. I think csv strikes a good balance between human-readable (can be used by someone who's just poking at the API) and compact. With gzip it will definitely be faster to compile csv-formatted data from the database for a given website tab, or a particular data-selection.
The question is what format should I use for years and sources. I could put the data in a separate column for each year and source, or I could add a year column and a source column. A year and source column is cleaner (it is tidy data), but it could cause the file size to be quite large if the years and sources aren't shared amongst different variables. For example, "water quality" comes from the EPA, whereas "flood risk" comes from first street, therefore, a measurement for a specific county's water quality would have to be specified on a different line from the flood risk, because the "dataset" column would be different.
I moved from mysql to postgres so we can use PostGIS (for storing GIS data) in the future.
https://github.com/ncrmro/planet-express is an example app running on almost the same tech stack. They use docker instead of podman, but I think I should be able to adapt that to work with podman. They claim the code is easily deployable to heroku, so I'm gonna see exactly how they do that.
I installed the thin podman client on my local macOS machine, and podman on centos 8 linux vagrant box:
brew install podman # install podman and vagrant
brew install vagrant
vagrant init generic/centos8 # setup vagrant box
ssh-copy-id -i ~/.ssh/id_rsa.pub -p 2200 vagrant@127.0.0.1 # the default password is vagrant. if not, run vagrant ssh, then manually copy the id_rsa.pub to ~/.ssh/authorized_keys
ssh -p 2200 vagrant@127.0.0.1
now on the vagrant ssh session, install podman and run the remote server
# install the latest version of podman, to be compatible with the client installed by home-brew
sudo dnf -y module disable container-tools
sudo dnf -y install 'dnf-command(copr)'
sudo dnf -y copr enable rhcontainerbot/container-selinux
sudo curl -L -o /etc/yum.repos.d/devel:kubic:libcontainers:stable.repo https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/CentOS_8/devel:kubic:libcontainers:stable.repo
sudo dnf -y --refresh install podman
# run the podman remote
systemctl --user enable --now podman.socket
sudo loginctl enable-linger $USER # keep it running on logout
exit # back to macOS host
and finally setup the podman thin client on macOS
podman system connection add vagrant ssh://vagrant@127.0.0.1:2200
podman info # this should work! now you can use any podman commands
sources:
how to use podman remote client how to install podman on centos how to get started with vagrant
Here's a test to run an http server inside a podman container inside a vagrant box, and curl
it from the host.
forward port 8080 on the host to port 8080 in the vagrant box by adding the following to Vagrantfile
config.vm.network "forwarded_port", guest: 8080, host: 8080, host_ip: "127.0.0.1"
open port 8080 on the vagrant box
ssh -p 2200 vagrant@127.0.0.1
sudo firewall-cmd --zone=public --permanent --add-port 8080/tcp
firewall-cmd --reload
exit
run a simple http server in a podman container in the vagrant box via the podman thin client on the host, and GET
it
podman run -dt -p 8080:80/tcp docker.io/library/httpd
curl localhost:8080
<html><body><h1>It works!</h1></body></html>
using https://hub.docker.com/r/htli/actix for now
podman pull docker.io/htli/actix
I met with @mjbludwig and he's going to make a podman pod with a couple containers (for the web service, postgres, and the frontend static files).
sqlx migrate run
every time it relaunches so it always applies the latest changes to the database/api/*
to the web service and all other requests /*
it will serve static files from the frontendcurl https://mst.mit.edu/api/state
for example, to get a list of statesfrom @mjbludwig:
I got a postgres container working with a container with your app in it and making a database that is mounted on the host! I could curl the state bits from inside a container in the pod, next is to be able to curl from the host.
from @mjbludwig:
podman pod create --name crm_pod -p 8000:8000 # This makes a pod called "crm_pod" and maps port 8000 inside the pod to 8000 outside
podman run --pod=crm_pod -v /home/mludwig/Work/JP/climate_risk_map/crm_db/:/var/lib/postresql/data:Z -e POSTGRES_PASSWORD=password -e POSTGRES_USER=username --name crm_db -d postgres
You will need to change the-v
bits to a directory where the database will live on your local machine so change/home/mludwig/Work/JP/climate_risk_map/crm_db/
to somewhere localsave this as a script:
#!/bin/bash ctr=climate_risk_map_service buildah from --name $ctr rust buildah run $ctr cargo install sqlx-cli buildah run $ctr /bin/bash -c 'cd /opt; git clone https://github.com/cypressf/climate-risk-map-service.git' buildah run $ctr /bin/bash -c 'echo APP_URL=127.0.0.1 >> /opt/climate-risk-map-service/.env' buildah run $ctr /bin/bash -c 'echo APP_PORT=8000 >> /opt/climate-risk-map-service/.env' buildah run $ctr /bin/bash -c 'echo DATABASE_URL=postgres://username:password@localhost/crm_db >> /opt/climate-risk-map-service/.env' buildah config --entrypoint "cd /opt/climate-risk-map-service; sqlx database create && sqlx migrate run && cargo run" $ctr buildah commit $ctr $ctr
- run script.
podman run -d --pod=crm_pod --name crm_ctr climate_risk_map_service
Now wait about a minute and finally:podman exec -it crm_ctr bash
then you cancurl localhost:8000/state
In order for me to run the script in step 3 of my last comment, I must install buildah
. I'm unsure if I can run a thin client on mac like I can with podman
to modify the container on my vagrant box. Checking now.
I'm installing buildah
on my centOS box using the instructions from their website.
# CentOS 8
sudo dnf -y module disable container-tools
sudo dnf -y install 'dnf-command(copr)'
sudo dnf -y copr enable rhcontainerbot/container-selinux
sudo curl -L -o /etc/yum.repos.d/devel:kubic:libcontainers:stable.repo https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable/CentOS_8/devel:kubic:libcontainers:stable.repo
# OPTIONAL FOR RUNC USERS: crun will be installed by default. Install runc first if you prefer runc
sudo dnf -y --refresh install runc
# Install Buildah
sudo dnf -y --refresh install buildah
In the script above, I would modify to use a fully-qualified container name docker.io/library/rust
instead of just rust
in the line
buildah from --name $ctr rust
After I run podman run -d --pod=crm_pod --name crm_ctr climate_risk_map_service
and wait about a minute, when I run the final step I get an error
podman exec -it crm_ctr bash
Error: can only create exec sessions on running containers: container state improper
I get the following with podman ps
d5e0f1c0a470 docker.io/htli/actix:latest bash 7 days ago Up 7 days ago 0.0.0.0:8080->80/tcp priceless_gould
c760645b5fa8 k8s.gcr.io/pause:3.5 6 days ago Up 6 days ago 0.0.0.0:8000->8000/tcp d1d6ccd75110-infra
(I think those are some old containers I was messing around with)
I deleted all my pods and containers and tried again. I got it to work this time. I also had to change curl localhost:8000/state
to curl 0.0.0.0:8000/state
, probably because the name wasn't mapped on my centos install in hosts or something
I could successfully curl from a bash
running in the container, but like I can't curl from the CentOS host:
curl 0.0.0.0:8000/state
curl: (56) Recv failure: Connection reset by peer
Running containers look good to me.
podman ps -p
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES POD ID PODNAME
0f161fcbc250 k8s.gcr.io/pause:3.5 33 minutes ago Up 30 minutes ago 0.0.0.0:8000->8000/tcp 42239f089eae-infra 42239f089eae crm_pod
ca7c7d713e6f docker.io/library/postgres:latest postgres 30 minutes ago Up 30 minutes ago 0.0.0.0:8000->8000/tcp crm_db 42239f089eae crm_pod
0e86d40e022b localhost/climate_risk_map_service:latest bash 11 minutes ago Up 11 minutes ago 0.0.0.0:8000->8000/tcp crm_ctr 42239f089eae crm_pod
running with APP_URL=0.0.0.0
instead of 127.0.0.1
fixed it!! I got a successful curl from the CentOS host
podman run -d --pod=crm_pod --name crm_ctr -e APP_URL=0.0.0.0 climate_risk_map_service
# wait about a minute... and then
curl 0.0.0.0:8000/state
[{"id":1,"name":"Alabama"},{"id":2,"name":"Alaska"},{"id":4,"name":"Arizona"},{"id":5,"name":"Arkansas"},{"id":6,"name":"California"},{"id":8,"name":"Colorado"},{"id":9,"name":"Connecticut"},{"id":10,"name":"Delaware"},{"id":11,"name":"D.C."},{"id":12,"name":"Florida"},{"id":13,"name":"Georgia"},{"id":15,"name":"Hawaii"},{"id":16,"name":"Idaho"},{"id":17,"name":"Illinois"},{"id":18,"name":"Indiana"},{"id":19,"name":"Iowa"},{"id":20,"name":"Kansas"},{"id":21,"name":"Kentucky"},{"id":22,"name":"Louisiana"},{"id":23,"name":"Maine"},{"id":24,"name":"Maryland"},{"id":25,"name":"Massachusetts"},{"id":26,"name":"Michigan"},{"id":27,"name":"Minnesota"},{"id":28,"name":"Mississippi"},{"id":29,"name":"Missouri"},{"id":30,"name":"Montana"},{"id":31,"name":"Nebraska"},{"id":32,"name":"Nevada"},{"id":33,"name":"New Hampshire"},{"id":34,"name":"New Jersey"},{"id":35,"name":"New Mexico"},{"id":36,"name":"New York"},{"id":37,"name":"North Carolina"},{"id":38,"name":"North Dakota"},{"id":39,"name":"Ohio"},{"id":40,"name":"Oklahoma"},{"id":41,"name":"Oregon"},{"id":42,"name":"Pennsylvania"},{"id":44,"name":"Rhode Island"},{"id":45,"name":"South Carolina"},{"id":46,"name":"South Dakota"},{"id":47,"name":"Tennessee"},{"id":48,"name":"Texas"},{"id":49,"name":"Utah"},{"id":50,"name":"Vermont"},{"id":51,"name":"Virginia"},{"id":53,"name":"Washington"},{"id":54,"name":"West Virginia"},{"id":55,"name":"Wisconsin"},{"id":56,"name":"Wyoming"},{"id":60,"name":"American Samoa"},{"id":66,"name":"Guam"},{"id":69,"name":"Northern Mariana Islands"},{"id":72,"name":"Puerto Rico"},{"id":74,"name":"U.S. Minor Outlying Islands"},{"id":78,"name":"U.S. Virgin Islands"}]
also when testing curling within the container, I simplified
podman exec -it crm_ctr bash
curl 0.0.0.0:8000/state
exit
to
podman exec -it crm_ctr curl 0.0.0.0:8000/state
@mjbludwig I moved the backend and the frontend to the same repo: https://github.com/cypressf/climate-risk-map
all frontend code is in frontend
directory and all backend code is in backend
The container should run yarn build
in the /frontend
directory, which will create an optimized production build in the /frontend/build
directory
@mjbludwig, here's are the full steps for installing the needed dependencies in CentOS and building the frontend. I tested in a fresh CentOS 8 box. This uses the node binary sources from the node source repo.
curl -fsSL https://rpm.nodesource.com/setup_16.x | sudo bash -
curl -sL https://dl.yarnpkg.com/rpm/yarn.repo | sudo tee /etc/yum.repos.d/yarn.repo
sudo yum install -y yarn
sudo yum install -y git
git clone git@github.com:cypressf/climate-risk-map.git
cd climate-risk-map/frontend
yarn install
yarn build
After yarn build
is complete, you'll have a climate-risk-map/frontend/build
directory, which contains the index.html
and all the other static resources.
Revisiting https://github.com/cypressf/mit-climate-data-viz/issues/179#issuecomment-864287181, I'm going to start out by using tidy data, and returning a start_date, end_date, source, and value column. If the file size is too large, I might reconsider.
HttpResponse::Ok().body()
to set the body of the response without specifying json content type: https://actix.rs/docs/response/county_id, state_id, source, start_date, end_date, value
GET 127.0.0.1:8080/data/1
size | time | |
---|---|---|
json | 1780 KB | 1151 ms |
csv | 608 KB | 1009 ms |
@mjbludwig mentioned via email that the website is now deployed in a container at https://svante3.mit.edu. @mjbludwig, do you know if the actix service is accessible on https://svante3.mit.edu as well? https://svante3.mit.edu/api/[endpoint] would be a good endpoint for it.
Once the service is accessible on the same subdomain, it should be a straightforward path to switch over to use the database from the csv files.
Using https://github.com/cypressf/climate-risk-map-service as a repo for this
Rust web service with the actix framework, using sqlx, and the sqlx migration tool, and loading database config from environment variables.
/api
endpoint