PhilanthropyDataCommons / service

A project for collecting and serving public information associated with grant applications
GNU Affero General Public License v3.0
8 stars 2 forks source link

Include field values in `/organizations/{organizationId}` #1129

Open bickelj opened 1 month ago

bickelj commented 1 month ago

The idea is to enhance the existing /organizations endpoint with by finding all the values associated with that organization, and return the best values.

In order to think best about the ways to do a thing, it helps to try to do it.

Issue #1087

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 88.26%. Comparing base (0d4fb85) to head (8fbdf58). Report is 27 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1129 +/- ## ========================================== + Coverage 88.24% 88.26% +0.01% ========================================== Files 135 135 Lines 1829 1832 +3 Branches 237 237 ========================================== + Hits 1614 1617 +3 Misses 198 198 Partials 17 17 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

slifty commented 1 month ago

What do you think of not adding a new endpoint but rather having that detail become part of the organization object on the existing /organizations endpoint. I don't think we need (or want) a new type of organization entity.

Similar to how Proposal has its detailed values provided directly and we don't have both a Proposal and a ProposalDetail object type.

bickelj commented 1 month ago

What do you think of not adding a new endpoint but rather having that detail become part of the organization object on the existing /organizations endpoint. I don't think we need (or want) a new type of organization entity.

Similar to how Proposal has its detailed values provided directly and we don't have both a Proposal and a ProposalDetail object type.

I think it makes sense to bring the existing endpoint and the details endpoint closer together and perhaps even merge them. There are still reasons to keep the endpoints separate and even if the endpoints merge to keep some types of these objects separate, namely to make the dynamic part of the response optional via parameter.

  1. Authorization. We want /organizations to be open to the public but the gold data not open. Also, different users will get different data based on their permissions. If we want to merge the expectations/requirements about openness then I think the endpoints could be merged. But I'd still probably prefer some type separation such that the one response not depending on authorization is the same for all users.
  2. Utility. It is useful to distinguish the shallow organization from the deep organization. If, for example, I am posting data to the PDC, I should check if an organization exists before associating proposal data with the organization. In order to check if it exists, I need to get the existing organization and/or create it. When doing that, I am not interested in searching through all proposal field values based on my authorization, the time of day, and fine-grained permissions. In fact I am interested in not doing all that because it is going to slow my upload down significantly. This is not only a theoretical concern (i.e. not needing experimentation to know that it will be slower, which is true) but an experiment already happened in the past. Running scripts to post new proposals ran way slower when a shallow body was replaced with a deep body without the option to get the shallow one.
  3. Time. Gold data may change from request to request (due to time being involved in the filtering, sorting, extraction of data) even when the narrow organizations entity does not change at all. One (gold data) is a relatively dynamic resource and the other (shallow org) a relatively static resource.

Reason 1 (authorization) is new with fine-grained permission and I suppose it applies to proposals as well. Reason 2 (utility) applied and applies to proposals as well. Reason 3 (time) I suppose only applies to gold data for organizations.

bickelj commented 1 month ago

@slifty See also related discussion in the issue at https://github.com/PhilanthropyDataCommons/service/issues/1087#issuecomment-2263883962 and following.

bickelj commented 3 weeks ago

Feedback just now from @slifty: do the deep integration/merge, down to the query level. When selecting data, include the parameter of authorization to the One True Organizations query.

bickelj commented 4 days ago

Odd, I got this locally too when running npm run test:ci but not when running that test independently, e.g. node 'node_modules/jest/bin/jest.js' 'src/__tests__/organizations.int.test.ts' -c 'jest.config.int.js' -t '/organizations POST / requires authentication':

FAIL src/__tests__/organizations.int.test.ts
  ● /organizations › POST / › requires authentication

    expected 401 "Unauthorized", got 400 "Bad Request"

      389 |     describe('POST /', () => {
      390 |         it('requires authentication', async () => {
    > 391 |             await agent.post('/organizations').expect(401);
          |                                                ^
      392 |         });

It seems to happen when running the whole suite but not when running a narrower amount of tests in the suite. Test interaction?

bickelj commented 7 hours ago

The above test failure happens when my new test is introduced but not when it is not introduced. Commenting out all tests except the existing POST where 401 is expected: test passes. Commenting out all tests except the same existing and my new: POST/401 test fails, line 22 of requireAuthentication.ts is reached (should be line 11). Order seems to matter as well. When my new test precedes the existing POST/401 test, the existing test fails. When my new test follows the existing POST/401 test, the existing test passes. So there may be something from my new prod code or test code causing the issue.

I see the bearer token and an auth property on req object when it shouldn't be there. Perhaps the same request is being re-used by supertest. I see we re-use the agent whereas the supertest examples usually make a new call to request or agent where a new test is needed. Ah, there is a specific example of intentional test interaction whereby a request gets re-used that shows a const agent = like we have. Search the examples page for "an example with mocha that shows how to persist a request and its cookies" to see it.

So yeah, if we don't want to re-use the same request, I think we need to make a new call to supertest (agent or request).