hackforla / lucky-parking

Visualization of parking data to assist in understanding of the effects of parking policies on a neighborhood by neighborhood basis in the City of Los Angeles
https://www.hackforla.org/projects/lucky-parking.html

Transform `geometry.coordinates` field values for all documents #533

Closed: glenflorendo closed this issue 1 year ago

glenflorendo commented 1 year ago

User Story

As a developer, I want the citation dataset to have a valid geometry field, so that I can treat the data as GeoJSON and efficiently query with MongoDB.

Description

The database currently has a small subset of data (600,000 documents) as a result of #494.

In addition to loading sample data, @glenflorendo also updated each MongoDB document to have a geometry field with a type and coordinates like the following:

    "geometry" : {
        "type" : "Point",
        "coordinates" : [
            "1834342.762039",
            "6501348.482622"
        ]
    }

The coordinates were taken from the existing longitude and latitude fields. However, @glenflorendo didn't realize until later that these values were using a different coordinate system, the California State Plane Coordinate System (Zone 5).
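For illustration, here is a minimal sketch of how State Plane values like the ones above could be converted to longitude/latitude in degrees. It rests on several assumptions that are not confirmed in this thread: that the source values are NAD83 / California zone 5 in US survey feet, that the larger value (about 6.5 million) is the easting and the smaller one the northing, and that NAD83 can be treated as equivalent to WGS 84 at street-level accuracy. The function name `state_plane_to_lon_lat` is hypothetical. In practice a maintained projection library (for example pyproj, transforming EPSG:2229 to EPSG:4326) would likely be a safer choice than hand-written math.

```python
import math

# Assumed parameters: NAD83 / California State Plane zone 5
# (Lambert conformal conic, two standard parallels) on the GRS 80 ellipsoid.
A = 6378137.0                                   # semi-major axis, metres
F_INV = 298.257222101                           # inverse flattening
E = math.sqrt(2 / F_INV - 1 / F_INV ** 2)       # first eccentricity
LAT_1 = math.radians(34 + 2 / 60)               # standard parallel 34 deg 02 min N
LAT_2 = math.radians(35 + 28 / 60)              # standard parallel 35 deg 28 min N
LAT_0 = math.radians(33.5)                      # latitude of origin
LON_0 = math.radians(-118.0)                    # central meridian
FALSE_EASTING_M = 2_000_000.0
FALSE_NORTHING_M = 500_000.0
US_SURVEY_FOOT_M = 1200 / 3937                  # assumption: inputs are US survey feet


def _m(lat):
    return math.cos(lat) / math.sqrt(1 - (E * math.sin(lat)) ** 2)


def _t(lat):
    es = E * math.sin(lat)
    return math.tan(math.pi / 4 - lat / 2) / ((1 - es) / (1 + es)) ** (E / 2)


# Projection constants derived from the parameters above.
N = (math.log(_m(LAT_1)) - math.log(_m(LAT_2))) / (math.log(_t(LAT_1)) - math.log(_t(LAT_2)))
BIG_F = _m(LAT_1) / (N * _t(LAT_1) ** N)
RHO_0 = A * BIG_F * _t(LAT_0) ** N


def state_plane_to_lon_lat(easting_ft, northing_ft):
    """Inverse Lambert conformal conic: feet in, (longitude, latitude) degrees out."""
    x = easting_ft * US_SURVEY_FOOT_M - FALSE_EASTING_M
    y = northing_ft * US_SURVEY_FOOT_M - FALSE_NORTHING_M
    rho = math.hypot(x, RHO_0 - y)
    t = (rho / (A * BIG_F)) ** (1 / N)
    lon = math.atan2(x, RHO_0 - y) / N + LON_0
    # Latitude has no closed form on the ellipsoid; iterate from the spherical guess.
    lat = math.pi / 2 - 2 * math.atan(t)
    for _ in range(10):
        es = E * math.sin(lat)
        lat = math.pi / 2 - 2 * math.atan(t * ((1 - es) / (1 + es)) ** (E / 2))
    return math.degrees(lon), math.degrees(lat)


# The sample values from this issue, read as (easting, northing) under the assumption above.
lon, lat = state_plane_to_lon_lat(6501348.482622, 1834342.762039)
```

If the assumptions hold, the result should land inside the Los Angeles area. A result far outside it would suggest the axis order or the units were guessed wrong.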

This ticket involves updating all MongoDB documents so that the `geometry.coordinates` field values are recalculated to adhere to the World Geodetic System (WGS 84). In this system:

Acceptance Criteria

  1. All 600,000 documents in our database are properly transformed.
  2. The transformation/calculation logic should be saved somewhere as we will need this for our data pipeline.
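One way to check the first criterion is to test each document's `geometry` against the GeoJSON rules for a point: two numeric values, longitude first, within the valid degree ranges. The sketch below is only an illustration. The helper names `is_valid_wgs84_point` and `count_invalid` are hypothetical, and the documents are assumed to be plain dictionaries already fetched from the database.

```python
from numbers import Real


def is_valid_wgs84_point(geometry):
    """Return True if geometry looks like a GeoJSON Point in degrees."""
    if not isinstance(geometry, dict) or geometry.get("type") != "Point":
        return False
    coords = geometry.get("coordinates")
    if not isinstance(coords, (list, tuple)) or len(coords) != 2:
        return False
    lon, lat = coords
    # Strings (and booleans) are rejected: GeoJSON positions are numbers.
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in (lon, lat)):
        return False
    return -180 <= lon <= 180 and -90 <= lat <= 90


def count_invalid(documents):
    """Count documents whose geometry field fails the check above."""
    return sum(1 for doc in documents if not is_valid_wgs84_point(doc.get("geometry")))
```

Under this check, the untransformed State Plane values fail on two counts: they are stored as strings and they fall far outside the degree ranges. A count of zero across the whole collection would be one signal that the criterion is met.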

Design References

No response

Technical References

Additional Information

No response

glenflorendo commented 1 year ago

@gregpawin Is this ticket something you can own or help with?

gregpawin commented 1 year ago

Sure, I can own this one.

gregpawin commented 1 year ago

I tried uploading GeoJSON to MongoDB, but it doesn't seem compatible. GeoJSON also carries a lot of bloat/redundancy, which means fewer records per byte.

New format:

[{"index":0,"ticket_number":"4273092401","state_plate":"NJ","make":"Toyota","body_style":"PA","color":"RD","location":"5032 MAPLEWOOD AVE W","violation_code":"80.69BS","violation_description":"NO PARK\/STREET CLEAN","fine_amount":73,"datetime":1451463360000,"make_ind":0,"latitude":34.0802958759,"longitude":-118.3125089233,"weekday":"Wednesday","geometry":{"type":"Point","coordinates":["-118.31250892329713","34.08029587586103"]}}, ... ]
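Note that the `coordinates` values in the sample above are strings, while GeoJSON positions are defined as arrays of numbers, and MongoDB's geospatial indexes most likely expect numbers as well. A minimal sketch of coercing them while loading a record follows. The function name `with_numeric_coordinates` is hypothetical, and the record is a trimmed copy of the sample.

```python
import json


def with_numeric_coordinates(record):
    """Return a copy of record whose geometry.coordinates are floats."""
    geometry = record["geometry"]
    lon, lat = (float(value) for value in geometry["coordinates"])
    return {**record, "geometry": {**geometry, "coordinates": [lon, lat]}}


# Trimmed version of the sample record, keeping only two fields.
raw = (
    '{"ticket_number":"4273092401","geometry":{"type":"Point",'
    '"coordinates":["-118.31250892329713","34.08029587586103"]}}'
)
record = with_numeric_coordinates(json.loads(raw))
```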

We should change the categorical info into indexes.
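As a rough sketch of that idea, each distinct value of a categorical field can be replaced by a small integer, with the lookup table kept separately so the original value can be restored. Which fields count as categorical is an assumption here (`make` and `color` are used only as examples), and the helper names are hypothetical.

```python
def build_index(values):
    """Map each distinct value to an integer, in order of first appearance."""
    lookup = {}
    for value in values:
        lookup.setdefault(value, len(lookup))
    return lookup


def encode_records(records, fields):
    """Replace the given fields with integer codes; return records and lookups."""
    lookups = {field: build_index(r[field] for r in records) for field in fields}
    encoded = [
        {**r, **{field: lookups[field][r[field]] for field in fields}}
        for r in records
    ]
    return encoded, lookups


# Made-up rows for illustration only.
sample = [
    {"make": "Toyota", "color": "RD"},
    {"make": "Honda", "color": "RD"},
    {"make": "Toyota", "color": "BK"},
]
encoded, lookups = encode_records(sample, ["make", "color"])
```

The trade-off is that every consumer of the data then needs the lookup tables, so they would have to be stored and versioned alongside the records.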

gregpawin commented 1 year ago

Confirmed with @glenflorendo that records with transformed coordinates exist in the MongoDB database.