NYCPlanning / ceqr-app

Web Application supporting the City Environmental Quality Review (CEQR) process
https://www.ceqr.app
12 stars 6 forks source link

Refactor backend school buildings data model capacity and enroll fields #625

Open trbmcginnis opened 4 years ago

trbmcginnis commented 4 years ago

Currently in backend/app/models/public_schools_analysis.rb, if a school has an org_level property of "PSIS", we separate this one school into TWO schools, one for the PS version and one for the IS version. Both school objects will keep the same org_id. Each school object has these properties by default: ps_capacity, is_capacity, hs_capacity AND ps_enroll, is_enroll, hs_enroll.

The rails app currently assigns three new calculated properties to each school object:

ISSUE: Having both ps_capacity and capacity properties is confusing. We were thinking of removing the ps_capacity, ps_enroll, ms_capacity, ms_enroll etc. properties since in the frontend these properties aren't being used and are mainly just noise. We will still separate out mixed-level schools into two schools but each new object will only have level, capacity, and enroll, making the object much shorter and more direct.

How to implement: 1) Remove ps_capacity etc. properties from the ceqr_school_buildings private methods on the public_schools_analysis.rb backend model 2) Remove these values from the frontend public-schools-analysis.js factory

NOTE: After EDM combines Bluebook and LCGMS school datasets we will have one CEQR school buildings dataset. ps_capacity and ps_enroll will have different names with EDM's new table. ps_enroll and is_enroll will become pe and ie respectively and ps_capacity and is_capacity will become pc and ic respectively.

// example of derived "is" data object from an initial "PSIS" school object
    {
      org_id: 'K002',
      org_level: 'PSIS',
      level: 'is', // computed 
      enroll: 493, // computed
      capacity: 669, // computed 
      hs_enroll: 0, // REMOVE
      ms_enroll: 493, // REMOVE
      ps_enroll: 198, // REMOVE
      hs_capacity: 0, // REMOVE
      ms_capacity: 669, // REMOVE
      ps_capacity: 233, // REMOVE
    },
// example of derived "ps" data object from the same "PSIS" original school object
    {
      org_id: 'K002',
      org_level: 'PSIS',
      level: 'ps', // computed 
      enroll: 198, // computed
      capacity: 233, // computed 
      hs_enroll: 0, // REMOVE
      ms_enroll: 493, // REMOVE
      ps_enroll: 198, // REMOVE
      hs_capacity: 0, // REMOVE
      ms_capacity: 669, // REMOVE
      ps_capacity: 233, // REMOVE
    },

Related to #484

bfreeds commented 4 years ago

One challenge with this data model is that an object does not clearly represent one single thing.

The object is intended to represent one instance of a school (breaking down school building objects that have a combined ps and ms school within them.

As it stands, the data model includes original data about the school building itself (ps, ms, hs data), however what the rails backend does is try to represent a single school unit within that combined building.

If we wanted to clearly make this data object represent the school unit, it would only include these properties:

    {
      org_id: 'K002',
      level: 'is', // computed 
      enroll: 493, // computed
      capacity: 669, // computed 
    }

Here, the data object is clearly defined to represent a single real-world entity.

Alternatively, if we maintained the data structure we get from the original datasets, but wanted to clearly represent it, it could look like this:

    {
      org_id: 'K002',
      org_level: 'PSIS',
      hs_enroll: 0,
      ms_enroll: 493,
      ps_enroll: 198,
      hs_capacity: 0,
      ms_capacity: 669,
      ps_capacity: 233,
    }

This would remove our computed / duplicative information in the data model, but would require us to refactor to have the application perform the multiplier calculations downstream (maybe near the frontend closer to where the multiplier calculations are performed.

bfreeds commented 4 years ago

The compromise @trbmcginnis and I discussed was to maintain the existing data model, which wouldn't require a large refactor, but rename the calculated fields to make it more clear what they mean (in relation to the surrounding, redundant properties).

We could just remove the unused ps_capacity, etc. properties, but thought that it might be helpful for debugging/being able to see where the calculated capacity or enroll values are coming from.

trbmcginnis commented 4 years ago

schools_capacity

trbmcginnis commented 4 years ago

The only places in the frontend app where we use ps_capacity etc. properties is when referring to values for SCA projects.

^^ this means that all ps_capacity, ms_capacity, ps_enroll, and ms_enroll properties are just noise.