focusconsulting / housing-insights

Bringing open data to affordable housing decision makers in Washington DC. A D3/Javascript based website to visualize data related to affordable housing in Washington DC. Data processing with Python.
http://housinginsights.org
MIT License

Enhanced filtering for one-to-many sources #500

Open NealHumphrey opened 7 years ago

NealHumphrey commented 7 years ago

Currently, in-browser filtering for projects that match the data criteria runs on a table with only one one-to-many relationship: project->subsidy. The /filter API endpoint creates a duplicate project row for each subsidy, so the filters apply to one flat table of records.

There are a couple additional one-to-many record types that we will want to filter on:

Address is the most important of these. However, scaling up the duplicated-row approach is not feasible: each added one-to-many data source multiplies the number of rows per project, so the amount of data to filter grows exponentially in the number of sources. We need to implement a new solution for filtering with one-to-many relationships.
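To make the row growth concrete, here is a rough sketch of what happens if the duplicated-row approach is extended as a cross join over every 'many' table. The sample record, the child field names (`program`, `ward`, `year`), and the `flatten` helper are all hypothetical and only illustrate the arithmetic; they are not the actual /filter implementation.

```javascript
// Hypothetical project record; every field except nlihc_id is made up for illustration.
const project = {
  nlihc_id: "NL000001",
  subsidy: [{ program: "LIHTC" }, { program: "Section 8" }],
  address: [{ ward: "Ward 1" }, { ward: "Ward 2" }, { ward: "Ward 3" }],
  topa: [{ year: 2015 }, { year: 2017 }],
};

// Build the flat, duplicated-row table: one row per combination of child records.
// Assumes child field names do not collide across tables.
function flatten(project, manyKeys) {
  let rows = [{ nlihc_id: project.nlihc_id }];
  for (const key of manyKeys) {
    // Keep the project even when it has no children in this table.
    const children = project[key].length > 0 ? project[key] : [{}];
    rows = rows.flatMap((row) => children.map((child) => ({ ...row, ...child })));
  }
  return rows;
}

const subsidyOnly = flatten(project, ["subsidy"]);
const allThree = flatten(project, ["subsidy", "address", "topa"]);

console.log(subsidyOnly.length); // 2 rows (one per subsidy)
console.log(allThree.length);    // 12 rows (2 subsidies x 3 addresses x 2 topa records)
```

With only subsidy, this one project costs 2 rows; with all three tables it costs 12, and every filter has to scan all of them.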

Options:

1) Keep the current structure but add nested JSON for address, topa, and subsidy. When filters apply to a field that includes nested JSON, traverse the nested data.

2) Use separate arrays for each 'many' table and filter each one separately to a list of matching nlihc_ids for that data source. Then find the intersection of the resulting lists of nlihc_ids.
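Both options above can be sketched side by side. This is a minimal illustration, not a proposal for the final code: the sample records, the field names (`proj_units`, `program`, `ward`), and the helpers `filterNested`, `matchingIds`, and `intersect` are hypothetical. It also assumes that a project matches a filter on a 'many' table when at least one of its child records matches, which is a design decision we would still need to confirm.

```javascript
// Hypothetical nested project records (the shape option 1 would need).
const projects = [
  { nlihc_id: "NL000001", proj_units: 120,
    subsidy: [{ program: "LIHTC" }],
    address: [{ ward: "Ward 1" }, { ward: "Ward 2" }] },
  { nlihc_id: "NL000002", proj_units: 40,
    subsidy: [{ program: "Section 8" }],
    address: [{ ward: "Ward 2" }] },
  { nlihc_id: "NL000003", proj_units: 75,
    subsidy: [{ program: "LIHTC" }, { program: "Section 8" }],
    address: [{ ward: "Ward 5" }] },
];

// Option 1: traverse nested arrays in place.
// A filter with a `table` tests child records; without one it tests the project itself.
function filterNested(projects, filters) {
  return projects.filter((p) =>
    filters.every(({ table, test }) => (table ? p[table].some(test) : test(p)))
  );
}

const filters = [
  { test: (p) => p.proj_units >= 50 },
  { table: "subsidy", test: (s) => s.program === "LIHTC" },
  { table: "address", test: (a) => a.ward === "Ward 2" },
];
const option1Ids = filterNested(projects, filters).map((p) => p.nlihc_id);

// Option 2: one flat array per 'many' table, each row tagged with its nlihc_id.
// (Derived from the nested data here only to keep the example short.)
const subsidyRows = projects.flatMap((p) =>
  p.subsidy.map((s) => ({ nlihc_id: p.nlihc_id, ...s })));
const addressRows = projects.flatMap((p) =>
  p.address.map((a) => ({ nlihc_id: p.nlihc_id, ...a })));

// Filter one table down to the set of nlihc_ids that have a matching row.
function matchingIds(rows, test) {
  return new Set(rows.filter(test).map((r) => r.nlihc_id));
}

// Intersect a non-empty list of sets.
function intersect(sets) {
  return sets.reduce((acc, s) => new Set([...acc].filter((id) => s.has(id))));
}

const option2Ids = [...intersect([
  matchingIds(projects, (p) => p.proj_units >= 50),
  matchingIds(subsidyRows, (s) => s.program === "LIHTC"),
  matchingIds(addressRows, (a) => a.ward === "Ward 2"),
])];

console.log(option1Ids); // [ 'NL000001' ]
console.log(option2Ids); // [ 'NL000001' ]
```

Either way, the work is proportional to the sum of the child record counts rather than their product. Option 1 keeps everything on one record per project; option 2 keeps each table flat, at the cost of an extra intersection step.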

Other ideas?

NealHumphrey commented 7 years ago

I added #510 to supply us with the nested structure needed for option 1.

NealHumphrey commented 7 years ago

See the comment in #510: if we have nested JSON and the filter code to handle it, we can eliminate the filterData in the datacollection and just use raw_project directly.