bloom-housing / bloom

Bloom is Exygy’s affordable housing platform. Bloom's goal is to be a single entry point for affordable housing seekers and a hub for application and listing management for developers.
https://bloomhousing.com
Apache License 2.0
33 stars 26 forks source link

Listings - Model Design #2375

Closed pbn4 closed 2 years ago

pbn4 commented 2 years ago

What is this feature or what feature is this part of? Backend Describe the solution you'd like I would like us to reconsider what we already have in Listings module, because in my opinion in becomes more and more complex to edit / extend. And the root cause I believe to be:

And my proposition is to:

In my opinion this simplifies thinking about code a lot because we can again rely on Typescript and on what's actually in the Listing model is what we have received from the DB.

Since there will be less models to join (because some of them will be moved out of Listing's context) performance issues are not necessarily again a problem even with TypeORM.

Those changes will affect /listings API because e.g. AmiChart is a completely separate model fetched through a deeply nested relation Listing.Property.Unit.AmiChart and if we move forward with those changes AmiChart should be fetched and modified through /ami-charts. Same thing for Jurisdiction, sometimes we fetch a name, for the frontend to make it easy to retrieve, but we shouldn't be doing that. Frontend should fetch it separately from the backend using /jurisdictions endpoint. Frontend right now is SSR so the burden of making many requests is not e.g. on users mobile device and cellular network. Which is a good thing for us, but if we ever needed a more well suited endpoint for mobile we should do this through API gateway or backend-for-frontend pattern where there is a dedicated e.g. graphql server doing this requests joining close to the backend and only then sending the response.

seanmalbert commented 2 years ago

Hey @pbn4 ,

Thank you for putting this together, can you clarify what you mean by move querybuilders joining logic out of service layer and make sure particular model is always fully joined (move them into repositories)?

It seems that we could do this in smaller steps, like first make updates to fetch ami-charts, jurisdictions, etc. separately and then start trimming down the model.

seanmalbert commented 2 years ago

From our conversation, we're good to move forward with a POC, where we want to have repositories that handle fully joined listings and listings with no joins. From there we can move towards stripping out joins from the fully joined listing and move them to separate calls to improve performance.

pbn4 commented 2 years ago

Performance tests results:

  1. dev branch with query containing view = base
➜  ~ ab -k -c 10 -n 100 localhost:3100/listings?view=base                         [02/15/22|1:41:50]
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done

Server Software:
Server Hostname:        localhost
Server Port:            3100

Document Path:          /listings?view=base
Document Length:        43754 bytes

Concurrency Level:      10
Time taken for tests:   5.100 seconds
Complete requests:      100
Failed requests:        0
Keep-Alive requests:    100
Total transferred:      4402600 bytes
HTML transferred:       4375400 bytes
Requests per second:    19.61 [#/sec] (mean)
Time per request:       509.987 [ms] (mean)
Time per request:       50.999 [ms] (mean, across all concurrent requests)
Transfer rate:          843.04 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:   113  484  84.1    489     767
Waiting:      113  484  84.1    489     767
Total:        113  484  84.1    489     767

Percentage of the requests served within a certain time (ms)
  50%    489
  66%    503
  75%    519
  80%    530
  90%    560
  95%    587
  98%    745
  99%    767
 100%    767 (longest request)
  1. refactor/listings-model branch with always fully joined model
➜  core git:(refactor/listings-model) ✗ ab -k -c 10 -n 100 localhost:3100/listings
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient).....done

Server Software:
Server Hostname:        localhost
Server Port:            3100

Document Path:          /listings
Document Length:        212673 bytes

Concurrency Level:      10
Time taken for tests:   19.393 seconds
Complete requests:      100
Failed requests:        0
Keep-Alive requests:    100
Total transferred:      21294700 bytes
HTML transferred:       21267300 bytes
Requests per second:    5.16 [#/sec] (mean)
Time per request:       1939.297 [ms] (mean)
Time per request:       193.930 [ms] (mean, across all concurrent requests)
Transfer rate:          1072.33 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       0
Processing:   433 1827 299.4   1858    2489
Waiting:      433 1827 299.5   1858    2489
Total:        433 1827 299.4   1858    2489

Percentage of the requests served within a certain time (ms)
  50%   1858
  66%   1975
  75%   2018
  80%   2036
  90%   2154
  95%   2272
  98%   2441
  99%   2489
 100%   2489 (longest request)

Pros:

Cons:

Notes:

seanmalbert commented 2 years ago

@pbn4 , Thank you for running these benchmarks. I think this highlights that we can't fully move away from the concept that "views" introduced, a way for us to select different data for different contexts. We already know that flattening property doesn't seem to give us a performance improvement. We can discuss the address fields, but that alone won't get us down to acceptable limits with the new design. I am thinking about a form of compromise, where we implement your design and have a separate repository, query builder, dto etc to handle the slimmed down set of data needed for the list query. This way we can have a a full data list, which you already have and partial data list that are each fully typed. The new list won't represent the entire schema of what's in the database, but at least can be a partial from the full model and we can easily see what's been omitted.

This way we can continue to improve on performance and ease of development, taking the best from both worlds. What do you think?

pbn4 commented 2 years ago

@seanmalbert I agree with the compromise, it would be great if we could bump Typescript to at least 4.1 to utilize literal template string when doing this, so that we could give those selects and joins some typing.

seanmalbert commented 2 years ago

@pbn4 , TypeScript is finally up-to-date.

kathyccheng commented 2 years ago

Closing