security-force-monitor / sfm-proxy

Security Force Monitor CSV proxy
MIT License

A few questions about information sources #36

Open evz opened 7 years ago

evz commented 7 years ago

I'm starting to implement the endpoints that @jpmckinney documented over here and I am running into a couple things which I'm not really sure how to infer based upon the information I know about that's currently in the spreadsheets.

  1. In the /countries/:id/map/ endpoint, there's a reference to a root_id and root_name which I'm guessing is the organization at the root of whatever organizational tree the organization referenced is part of. As far as I know, all we have is a parent organization. Is the idea here that we recurse up that chain until we reach the top? Would it be sufficient for the moment to just show a parent organization?
  2. On the /event/:id/ endpoint, we are referencing organizations_nearby. It seems like this requires knowing where a given organization was on a given date. Does this refer to where a given organization is more or less stationed (given their Site or Area?) Also, what is "nearby"?
  3. The various Organization endpoints have a similar reference to events_nearby. What is "nearby" in this context?
  4. The /organizations/:id/ endpoint refers to commander_present and commanders_former. What's the best way of inferring this?
  5. Similar questions for the People endpoints: we have events_nearby, area_present and site_present. What does "nearby" mean and how do we know where a given person is at the moment (or even where they were most recently)?

I'm not sure if @tonysecurityforcemonitor or @tlongers would know any answers here but it'd be good to get some clarity here.

tonysecurityforcemonitor commented 7 years ago

I'm going to jump in, but @jpmckinney please correct anything that I'm getting wrong!

In the /countries/:id/map/ endpoint, there's a reference to a root_id and root_name which I'm guessing is the organization at the root of whatever organizational tree the organization referenced is part of. As far as I know, all we have is a parent organization. Is the idea here that we recurse up that chain until we reach the top? Would it be sufficient for the moment to just show a parent organization?

As I understand it, the root_name is capturing the "highest level" of an organization:classification. So an Organization with the classifications of Mobile Police Force (Riot Police) ; Police would have a root_name of Police, since all Mobile Police Force organizations are also part of the Police. I'm not sure about the root_id though, as I don't see it in the Classes Google Sheet.

On the /event/:id/ endpoint, we are referencing organizations_nearby. It seems like this requires knowing where a given organization was on a given date. Does this refer to where a given organization is more or less stationed (given their Site or Area?) Also, what is "nearby"?

This was intended to be tied to the Site of an organization - and "nearby" was supposed to be within 35km. It would absolutely require that the organization's Site was known on a given day. It was intended to be an easy way to "tag" units that might need additional investigation. But maybe this is really thorny to implement? (It also isn't making use of the Area of an organization... which in some cases could be more relevant). But I don't know if scrapping this makes things harder - especially on the UI side @sebastien

The various Organization endpoints have a similar reference to events_nearby. What is "nearby" in this context?

"Nearby" was supposed to be 35km.
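As a rough illustration of the 35km rule, here is a minimal Python sketch that treats a Site and an event as (latitude, longitude) pairs and compares their great-circle distance against the threshold. The function names and the tuple representation are assumptions made for the example, not anything defined in the proxy.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius
NEARBY_KM = 35.0          # threshold quoted in this thread


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_nearby(site, event, limit_km=NEARBY_KM):
    """site and event are hypothetical (lat, lon) tuples."""
    return haversine_km(site[0], site[1], event[0], event[1]) <= limit_km
```

This only answers the distance half of the question; it still assumes the organization's Site on the event date is known.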

The /organizations/:id/ endpoint refers to commander_present and commanders_former. What's the best way of inferring this?

Among the people with a PersonMembership:role of Commander, the commander_present (which is really commander_most contemporary) would be the Person with the most contemporary PersonMembership:date_first_cited or PersonMembership:date_last_cited. The commanders_former would be everyone else with the matching PersonMembership:organization_id and PersonMembership:role of Commander.

Similar questions for the People endpoints: we have events_nearby, area_present and site_present. What does "nearby" mean and how do we know where a given person is at the moment (or even where they were most recently)?

I see events_nearby as tied to the person's PersonMembership:organization_id and PersonMembership:date_first_cited or PersonMembership:date_last_cited. So if the organization has an "event nearby" at some point in time when the Person was a member of that organization - then they have the same "event nearby". @jpmckinney are the area_present and site_present left over from the old data model where we had fields for a Person to show if they were based in a different place/area than the organization of which they are a member?

jpmckinney commented 7 years ago

Let me know if this answers your questions. To add to Tony's answer:

  1. All organizations have either the top-level army or top-level police organization as the root, based on which sheet they appear on. Organizations may not have a complete chain of parents to that root, however.
  2. See Tony's answer.
  3. See Tony's answer.
  4. Find all the memberships of that organization for which the role is 'Commander' and sort in reverse by date-first-cited (or, if null, date-last-cited). The first membership is the present commander, and all others are former.
  5. An organization's present site is the site with the most recent date-first-cited (or, if null, date-last-cited). Same for present area. For people, these fields are typically embedded in a membership, which gives you the organization.
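The commander rule in item 4 can be sketched in Python roughly as follows. This assumes memberships are plain dicts keyed by the PersonMembership field names used in this thread; person_id and the function names are hypothetical, and undated memberships are assumed to sort last.

```python
from datetime import date


def sort_key(membership):
    # date_first_cited, falling back to date_last_cited when it is null.
    # Memberships with neither date sort last (an assumption).
    return (membership["date_first_cited"]
            or membership["date_last_cited"]
            or date.min)


def split_commanders(memberships, organization_id):
    """Return (present_commander, former_commanders) for one organization."""
    commanders = [
        m for m in memberships
        if m["organization_id"] == organization_id and m["role"] == "Commander"
    ]
    commanders.sort(key=sort_key, reverse=True)  # most recent first
    present = commanders[0] if commanders else None
    return present, commanders[1:]
```

The same sort key would presumably work for picking an organization's present site or area in item 5.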

evz commented 7 years ago

Thanks, yeah this all helps. I'll stew on how to get this all together. I might create some database views or something to make these queries a bit saner given the hyper-normalized structure that we're working with.

tonysecurityforcemonitor commented 7 years ago

@jpmckinney @evz

All organizations have either the top-level army or top-level police organization as the root, based on which sheet they appear on. Organizations may not have a complete chain of parents to that root, however.

The roots will need to be a bit more numerous - there are definitely organizations that fall outside of police or army.... I'm not sure how to best inform the root structure - or add to it as we get going.

tlongers commented 7 years ago

Re: @evz Q1 @tonysecurityforcemonitor jump in on this.

In practice, the great majority of data-points included in the Monitor will concern army or police units, so the idea of a "root" has some convenience for researcher and developer as:

  1. A simple way to impose some loose groupings so the researchers don't get totally lost; and,
  2. A facet to ensure command chains are less messy when visualised.

Generally, the official "root" of the security establishment is the ultimate political actor in that political system (e.g. elected Prime Minister, President, despot etc), followed by layers of oversight structures and progressive diffusion of authority to various ministerial offices. We do include these in the research. These persons and organisations are common to all organisations, whether army, police or others: they include bodies like national security councils, advisory groups, committees and so on. These are as much part of the command chain as other data points, but it would be incorrect to apply any of the current "root" terms to them quite yet.

Beneath those, as we travel down the chain of command, the major branches of the security establishment begin to materialise: "navy" or "air force" appear. Other structures, such as the domestic law enforcement agencies, intelligence services and other parts of the security establishment, also appear. It makes sense to give everything beneath this point in the hierarchy a label like "army" or "police".

Is there a case for creating a class of organisations that are actually 'root' (like President, National Security Council, etc) and then including army, police and so on as 'branches'?

evz commented 7 years ago

@tlongers @tonysecurityforcemonitor I've gone ahead and just implemented a recursive query that recurses up the organizational tree that we have given a starting organization. This works mainly because we are storing the relationships as what amounts to an edge list in the composition table. This also works in the opposite direction by just reversing the way that we join in the composition table (and changing the starting point). For the moment, we have quite a lot of trees that are disjoint after the import process but as things get merged and cleaned up, this should give us a pretty effective way of building trees in either direction (top down or bottom up). Even with the state that things are in, I'm able to recurse up to the Nigerian Army Headquarters for a lot of the military units:

            name            |    id    | child_id |        child_name         
----------------------------+----------+----------+---------------------------
 1 Brigade (1)              | e1837972 |          | 
 1 Mechanised Division (1)  | 8533e736 | e1837972 | 1 Brigade (1)
 Nigerian Army Headquarters | b38db6c6 | 8533e736 | 1 Mechanised Division (1)

So, as long as the relationships are there, we should be able to create these mappings pretty easily.
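For anyone following along, a recursive query of this kind can be sketched with a recursive CTE. The example below uses Python's sqlite3 module with an in-memory database so it is self-contained; the table layout and the column names (parent_id, child_id) are assumptions, since the thread only says that the composition table behaves like an edge list. The three rows are the ones shown in the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organization (id TEXT PRIMARY KEY, name TEXT);
    -- Hypothetical edge list: one row per parent/child relationship.
    CREATE TABLE composition (parent_id TEXT, child_id TEXT);
    INSERT INTO organization VALUES
        ('e1837972', '1 Brigade (1)'),
        ('8533e736', '1 Mechanised Division (1)'),
        ('b38db6c6', 'Nigerian Army Headquarters');
    INSERT INTO composition VALUES
        ('8533e736', 'e1837972'),
        ('b38db6c6', '8533e736');
""")

ANCESTORS_SQL = """
WITH RECURSIVE chain(id, depth) AS (
    SELECT ?, 0                       -- starting organization
    UNION ALL
    SELECT c.parent_id, chain.depth + 1
    FROM composition AS c
    JOIN chain ON c.child_id = chain.id
    WHERE chain.depth < 50            -- guard against cycles in dirty data
)
SELECT o.name, o.id
FROM chain
JOIN organization AS o ON o.id = chain.id
ORDER BY chain.depth
"""


def ancestors(conn, start_id):
    """Return (name, id) rows from start_id up to the top of its tree."""
    return conn.execute(ANCESTORS_SQL, (start_id,)).fetchall()
```

Calling ancestors(conn, 'e1837972') walks from 1 Brigade (1) up to Nigerian Army Headquarters; the last row is the root of whatever tree the organization currently sits in. Swapping parent_id and child_id in the join walks down the tree instead.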

tlongers commented 7 years ago

That's useful, thank you. Good QA step for the data, at least that which comes in from the spreadsheets.

evz commented 7 years ago

A few more questions about how to build out the API responses:

tonysecurityforcemonitor commented 7 years ago

In the /countries/:id/map response, one of the fields that can be filtered on is classification. Since the response includes both organizations and events (and it seems like we are referring to Event Types as "classifications" in the event responses) does that classification filter refer to the organization's classification or to the event classification (which, under the hood is the event type)?

That should be the Organization's classification.

One of the common query parameters in the search API is p which refers to a page number. Which I guess indicates that the response should be paginated. Any preference as to how many results per page? Should the facets include counts for all results or just those on that page?

Yes the results should be paginated - I think 20 per page is a good number. I think as it is currently the counts should be for all results - screenshot example of current UI functioning on this:

image
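A minimal Python sketch of that behaviour, assuming search results are a list of dicts: the p parameter selects a page of 20, while facet counts are computed over the full result set. The response keys (count, facets, results) and the facet field are hypothetical, not the documented API shape.

```python
from collections import Counter

PAGE_SIZE = 20  # per-page figure suggested in this thread


def paginate(results, p=1, page_size=PAGE_SIZE, facet_field="classification"):
    """Slice one page out of results; facets are counted over all results."""
    start = (max(p, 1) - 1) * page_size
    return {
        "count": len(results),
        "facets": dict(Counter(r[facet_field] for r in results)),
        "results": results[start:start + page_size],
    }
```

In a database-backed implementation the slice would more likely be a LIMIT/OFFSET on the query, with the facet counts coming from a separate aggregate over the unpaginated results.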

One of the query parameters for the people search is classification. I'm not really sure what that's supposed to refer to since as far as I know there is no classification assigned to a person.

Not sure how to answer that. Currently the only two filters for a people search are Role and Rank. If we were to have it, I would say we would use the organization classification here - it would be a good way to sort them by the classification of the organizations of which they are/were a member.

Also for the people search, you can filter based upon geonames_id? Should I just use the one associated with the organizations that they are members of?

Yes absolutely use the geonames_id associated with the organizations they are a member of.