GreenInfo-Network / nyc-crash-mapper-chart-view

Chart view for NYC Crash Mapper that allows for viewing Trends, Comparing, and Ranking of various NYC geographies
http://vis.crashmapper.org
MIT License
2 stars 1 forks source link

Add Vehicle Type filter #108

Closed danrademacher closed 4 years ago

danrademacher commented 4 years ago

Over in https://github.com/GreenInfo-Network/nyc-crash-mapper-etl-script/issues/22, we started pulling in and allocating Vehicle Type data. Now we need to add a filter for it.

Basically this: image

In context of the whole sidebar on the chart app: image

It fits between current "Filter by Type" and "Filter by Boundary", and then rename the label on "Filter by Type" to "Filter by Crash Type"

We might also want to change at least some of the panels to default closed, if there's an existing way to do that, but that's minor need.

gregallensworth commented 4 years ago

On the new gda-vehtypefilter branch, I have the vehicle-type filter basically working. The panel exists, and its state is tracked, and its state goes into URL params. The fetchEntityData() function and all areas where it is called, now pass in a copy of the filterVehicle so the SQL queries may be customized based on this panel's state.

The only part that seems dodgy at the moment, is App.jsx componentWillReceiveProps() in which I just reload the page in order to apply the new stats.

This is a bit of a Band-Aid fix, obviously not as great as having the components update from the updated data. But thus far, even when I have additional conditions causing fetchEntityData() to reload and to apply the new filtering SQL clauses, the data just don't find their way into the charts. Per verbal chat, let's check with the client before sinking more time into this.

danrademacher commented 4 years ago

In local testing, I find the reload behavior acceptable if annoying, since it is the same as we get when switching boundary types.

But during local review, we uncovered a more problematic error where interacting with the Select boundary Type menu produces a console error and no change on page: image

If I proceed to select a specific boundary, like Council District Whatever, and reload the page, that pushes through the SQL to try to grab data, and causes this bad request: https://chekpeds.carto.com/api/v2/sql?q=SELECT+COUNT(c.cartodb_id)+as+total_crashes,+neighborhood+AS+neighborhood,+SUM(c.number_of_cyclist_injured)+as+cyclist_injured,+SUM(c.number_of_cyclist_killed)+as+cyclist_killed,+SUM(c.number_of_motorist_injured)+as+motorist_injured,+SUM(c.number_of_motorist_killed)+as+motorist_killed,+SUM(c.number_of_pedestrian_injured)+as+pedestrian_injured,+SUM(c.number_of_pedestrian_killed)+as+pedestrian_killed,+SUM(c.number_of_pedestrian_injured+%2B+c.number_of_cyclist_injured+%2B+c.number_of_motorist_injured)+as+persons_injured,+SUM(c.number_of_pedestrian_killed+%2B+c.number_of_cyclist_killed+%2B+c.number_of_motorist_killed)+as+persons_killed,+year+%7C%7C+%27-%27+%7C%7C+LPAD(month::text,+2,+%270%27)+as+year_month+FROM+crashes_all_prod+c+WHERE+the_geom+IS+NOT+NULL+AND+neighborhood+IS+NOT+NULL+AND+neighborhood::text+!%3D+%27%27+AND+AND+borough+IS+NOT+NULL+AND+borough+!%3D+%27%27+AND+(hasvehicle_car+OR+hasvehicle_truck+OR+hasvehicle_motorcycle+OR+hasvehicle_bicycle+OR+hasvehicle_busvan+OR+hasvehicle_scooter+OR+hasvehicle_other+OR+(+NOT+hasvehicle_car+AND+NOT+hasvehicle_truck+AND+NOT+hasvehicle_motorcycle+AND+NOT+hasvehicle_bicycle+AND+NOT+hasvehicle_suv+AND+NOT+hasvehicle_busvan+AND+NOT+hasvehicle_scooter+AND+NOT+hasvehicle_other+))+GROUP+BY+year_month,+neighborhood+ORDER+BY+year_month+asc,+neighborhood+asc

The return on that is

{
"error": [
"syntax error at or near \"AND\""
]
}

Ah, I see the error. Somehow the query includes +AND+AND+ between neighborhood::text+!%3D+%27%27+ and borough+IS+NOT+NULL

Removing one of those duplicate ANDs results in a successful query to CARTO's API.

Not sure how that's getting in there, however. Seems so distant from adding the vehicle type filters later in the query!

danrademacher commented 4 years ago

Just reviewed this with client. Overall, very pleased with our progress. But setting aside the actual SQL error for a moment, in actually use of the vehicle type filter here, my "acceptable if annoying" assessment of the data reloading seems optimistic. The important difference between this and boundary type is that for boundary type, I make one selection, then things change. For Vehicle Type, as I try to toggle off multiple types, it's not easy to track what is actually changing on the chart. Feels almost like we need either to figure out a solution to realtime filter or come up with a "make selections, hit submit" which is a totally new UX to this app, so not to be undertaken lightly.

Mulling over how to best proceed here.

clhenrick commented 4 years ago

From what I remember, this app relies more on doing data transformation and filtering in the browser, where as the original crash mapper app relies on SQL queries passed to CARTO for transforming and filtering data.

I remember there being some reselect hackery in this app that caches computations in order to save data processing steps. I can take a look sometime to figure out where the bottle neck might be.

clhenrick commented 4 years ago

The problem with the filter by vehicle type UX is that because all the standardized vehicle types are being stored as boolean columns and aren't being returned with the aggregate query used by the app.

For example, the query for "city wide" is as follows:

  SELECT
    COUNT(c.cartodb_id) as total_crashes,
    SUM(c.number_of_cyclist_injured) as cyclist_injured,
    SUM(c.number_of_cyclist_killed) as cyclist_killed,
    SUM(c.number_of_motorist_injured) as motorist_injured,
    SUM(c.number_of_motorist_killed) as motorist_killed,
    SUM(c.number_of_pedestrian_injured) as pedestrian_injured,
    SUM(c.number_of_pedestrian_killed) as pedestrian_killed,
    SUM(c.number_of_pedestrian_injured + c.number_of_cyclist_injured + c.number_of_motorist_injured) as persons_injured,
    SUM(c.number_of_pedestrian_killed + c.number_of_cyclist_killed + c.number_of_motorist_killed) as persons_killed,
    year || '-' || LPAD(month::text, 2, '0') as year_month
  FROM
    crashes_all_prod c
  GROUP BY year, month
  ORDER BY year asc, month asc

In order to include the vehicle types so that the filtering can happen client side with out a data refresh, these columns would need to be included as part of the aggregate query instead of as a WHERE clause.

If the vehicle type boolean columns were converted to an array of strings, or perhaps a concatenated string, then they could be filtered by the app client side to achieve a UX similar to the filter by crash.

For example if the response included a value for "all vehicle types" involved for a given year and month :

{
  involved_vehicle_types: ["car", "truck", "suv", "bicycle"],
  total_crashes: 9505
  cyclist_injured: 367
  cyclist_killed: 3
  motorist_injured: 2801
  motorist_killed: 8
  pedestrian_injured: 845
  pedestrian_killed: 3
  persons_injured: 4013
  persons_killed: 14
  year_month: "Mon Aug 01 2011 00:00:00 GMT-0700 (Pacific Daylight Time)"
}

The SQL for the SELECT statement might look like the following:

SELECT
CASE WHEN "hasvehicle_car" THEN 'car ' ELSE '' END ||
CASE WHEN "hasvehicle_truck" THEN 'truck ' ELSE '' END ||
CASE WHEN "hasvehicle_motorcycle" THEN 'motorcyle ' ELSE '' END ||
CASE WHEN "hasvehicle_bicycle" THEN 'bicycle ' ELSE '' END ||
CASE WHEN "hasvehicle_suv" THEN 'suv ' ELSE '' END ||
CASE WHEN "hasvehicle_busvan" THEN 'busvan ' ELSE '' END ||
CASE WHEN "hasvehicle_scooter" THEN 'scooter ' ELSE '' END ||
CASE WHEN "hasvehicle_other" THEN 'other ' ELSE '' END
AS all_vehicle_types
FROM crashes_all_prod

which would return a string for each row like "truck suv other" that could then be used to filter on in the app, similar to crash type.

clhenrick commented 4 years ago

Update: thinking about this more, the above SQL tactic would not work as the data aggregation needs to happen after the data filter. In other words, to filter by vehicle type would require fetching the data unaggregated and then filtering, aggregating it client side.

clhenrick commented 4 years ago

Pushed some changes that add a new action and update the data reducer for clearing the data cache. This action is fired by a Redux middleware when the vehicle filters change. It seems to be working for "Trend" and "Rank" views, but it doesn't seem to correctly update the reference entity in the "Compare" view. Will require some more investigation. cc @gregallensworth

danrademacher commented 4 years ago

These changes are working as expected for except, as Chris described, for the Citywide reference element on the Compare view.

If I have a filter in place from Trend and navigate to Compare, it seems to work as expected: image

But if I change the filter in that view, I get 0's for Citywide: image

I suspect that is because we are using a second SQL API call there to get those topline numbers: image

Here's an example of a query for that that works+GROUP+BY+year,+month+ORDER+BY+year+asc,+month+asc).

I suspect we need to be appending the Where clause there as well.

danrademacher commented 4 years ago

On closer inspection, seems more like we just need to get that citywide SQL to run again.

When I refresh the whole page, I have two SQL requests, like this: image

If I clear the console and change the Vehicle Type filter, I get only one: image

So not surprisingly, no request = no citywide data. Narrows the problem a bit, but doesn't get us to a fix

clhenrick commented 4 years ago

Documenting Dan and I's Slack convo here: for some reason the logic for fetching the "reference" entity data (citywide, borough, or custom) lives in the ReferenceEntitySelect component which houses the dropdown UI for selecting the reference entity. Because this component is only rendered for the line chart ("trend" chart view), it will only fetch data when it is rendered. I think it would make sense to put this logic in the App component where the other data fetching logic lives. That way the data fetching logic for reference entities would persist throughout the various chart views. Basically it should be, "if reference (citywide or borough) doesn't exist, go fetch it" so that when the cache is cleared it will be repopulated.

clhenrick commented 4 years ago

Made a few more commits, the "compare" view should now be working now and all data fetching logic now resides in the main App component. I also checked the custom geography as well which seems to be working but could probably use some more QA.

gregallensworth commented 4 years ago

8a64dd8 -- The Map hyperlink at the top, now includes the vehicle-type URL params. These match the format used in https://github.com/GreenInfo-Network/nyc-crash-mapper/issues/104 and are confirmed to correctly load the Map with the same vehicle filter selected.

This is for the gda-vehtypefilter branch. Let's give it a test, and see whether #108 and https://github.com/GreenInfo-Network/nyc-crash-mapper/issues/104 are working, so they may be deployed and closed.

gregallensworth commented 4 years ago

Deployed.