CityOfLosAngeles / dot-planner

Planning software and database for Active Transportation Projects
Apache License 2.0
9 stars 7 forks source link

Loading Data #144

Closed hunterowens closed 7 years ago

hunterowens commented 7 years ago

@spencercharest @datala/ucla-dot-partners

How did you load in the data into postgres for the demo application?

emrzero commented 7 years ago

Spencer configured the demo application server, but I imagine he ran a script similar to this one:

https://github.com/datala/dot-planner/blob/etl/etl/load.js

It requires the Sequelize npm package

hunterowens commented 7 years ago

ah ya looks like the etl branch was never merged. Digging into it now.

hunterowens commented 7 years ago

@emrzero @spencercharest Do y'all wanna merge in ETL and add a load procedure to the readme (b/c I have the data with sensitive info) or should I be doing this?

emrzero commented 7 years ago

if there's no rush, I can do it this weekend.

hunterowens commented 7 years ago

@emrzero I've got to the point where I have a script that unzips the files, converts them to geojson and saves to a dir.

You can checkout etl/load.js. Would love your help on it if I don't finish it before the weekend. Thanks!

bryce-richards commented 7 years ago

@hunterowens can I get a copy of the old ATB database zip file?

hunterowens commented 7 years ago

@bryce-richards you should have the file now.

@emrzero would love your help, take a look at etl on the master branch.

bryce-richards commented 7 years ago

@hunterowens got file unzipped on the etl branch. do you have the ogr2ogr command for bringing in all that data? i've got gdal installed but would appreciate some direction on the next steps. thanks.

bryce-richards commented 7 years ago

#144 - Loading Data

hunterowens commented 7 years ago

Checkout out the bash script I was working on in the etl directory.

On Feb 18, 2017 19:04, "Bryce Richards" notifications@github.com wrote:

https://camo.githubusercontent.com/193766a3b9959c5f4ed5cd8cf3251d015e839d51/68747470733a2f2f6769746875622e7472656c6c6f2e73657276696365732f696d616765732f6d696e692d7472656c6c6f2d69636f6e2e706e67 #144

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/datala/dot-planner/issues/144#issuecomment-280892223, or mute the thread https://github.com/notifications/unsubscribe-auth/ABp74li-rPnjyzinojO-W_jv-yX7KpUXks5rd7FBgaJpZM4L_p5y .

hunterowens commented 7 years ago

@bryce-richards name, here's how you take each featureLayer in the gdb and turn it into a geojson file.


declare -a arr=("f_multipoints"
                "f_points"
                "f_lines"
                "f_polygon"
                "uf_points"
                "uf_lines"
                "uf_polygon")

## now loop through the above array to make geojson files
for i in "${arr[@]}"
do
  echo "$i"
  ogr2ogr -sql "SELECT * FROM $i" -f "GeoJSON" atd_$i.geojson ATD_projectdatabase.gdb
done

The meat is ogr2ogr -sql "SELECT * FROM $i" -f "GeoJSON" atd_$i.geojson ATD_projectdatabase.gdb

Which essentially says - SELECT everything from a layer, convert to geojson, save with name atd_layername.geojson from the geodatabase file.

hunterowens commented 7 years ago

@emrzero were you able to look at this at all?

bryce-richards commented 7 years ago

@spencercharest @emrzero any advice on how you guys loaded the seeds file from projects.js into the database models after migrating?

emrzero commented 7 years ago

@hunterowens I did not. However, I'm wondering if the database you have is the same as the one we got. I remember gathering the geometry data from one table and the project details from another.

@bryce-richards I can describe the process if that's what you're looking for:

  1. Using ArcGIS we copied the tables to a PostgreSQL db.
  2. Query the db and create a JSON object for each record, which is saved to a flat file. Files are separated by shape.
  3. Using the Sequelize ORM, load each record in to the db -- the method is Project.create({data})

The reasons for doing the migration this way:

I spent many hours trying to massage the data, mostly because ArcGIS and Leaflet.js are so distinct.

bryce-richards commented 7 years ago

@emrzero I'll look into ArcGIS. I've already got the tables set up in a PostgreSQL database from the sequelize db:migrate command. Looks like the projects.js file already has the JSON objects created as well. So it seems like I just need to load each record into the database. Is there a way you guys did it en masses, as there are about 220 project objects in the file.

bryce-richards commented 7 years ago

@emrzero just trying to get some working data on my local repo

hunterowens commented 7 years ago

@emrzero Shoot me your email and I can send you the version of the files I have. Would be interested to see if your method works.

emrzero commented 7 years ago

@bryce-richards to load data I have a js script:

var models = require('../models/');
var dataSet = require('./projects.js');

for (var i = 0; i < dataSet.length; i++) {
    models.Project.create(dataSet[i]);
}

The script expects the projects.js file to be an array of JSON objects. I prepped this file manually. Here's what it should look like

emrzero commented 7 years ago

@hunterowens eliotmij@gmail.com

hunterowens commented 7 years ago

Data package sent. Let me know if it matches yours

On Tue, Feb 21, 2017 at 8:41 PM, emrzero notifications@github.com wrote:

@hunterowens https://github.com/hunterowens eliotmij@gmail.com

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/datala/dot-planner/issues/144#issuecomment-281567429, or mute the thread https://github.com/notifications/unsubscribe-auth/ABp74rJzH3caTr9cHkRHn1ZWL8vaJ_wgks5re7xwgaJpZM4L_p5y .

bryce-richards commented 7 years ago

@emrzero thank you

emrzero commented 7 years ago

@bryce-richards no problem

@hunterowens the db is different. This one has project details and geometry data in one table.

hunterowens commented 7 years ago

@emrzero Thanks for your help. I'll work on the new db independently then.

-Hunter

On Tue, Feb 21, 2017 at 9:08 PM, emrzero notifications@github.com wrote:

@bryce-richards https://github.com/bryce-richards no problem

@hunterowens https://github.com/hunterowens the db is different. This one has project details and geometry data in one table.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/datala/dot-planner/issues/144#issuecomment-281570915, or mute the thread https://github.com/notifications/unsubscribe-auth/ABp74uzkBdcgc1NPuPzl2Rmu6EfjYLG9ks5re8KzgaJpZM4L_p5y .

hunterowens commented 7 years ago

Done!