civictechindex / brigade-project-index

Brigade Project Index
https://brigade.cloud/

Review "tags" field in CfA's organizations.json #33

Open themightychris opened 4 years ago

themightychris commented 4 years ago

I just noticed that the organizations.json database CfA maintains includes a tags field at the brigade level: https://github.com/codeforamerica/brigade-information. This database drives both the index and CfA's internal tooling, and CfA staff currently moderate brigade captains' access to change it.

It is documented as such:


They appear to be applied pretty consistently in the data: https://github.com/codeforamerica/brigade-information/blob/master/organizations.json

Hack for LA, for example, has this entry:

        "tags": [
            "Brigade",
            "Code for America",
            "Code for America Fiscally Sponsored Brigade",
            "Official"
        ]

There are, however, a few odd values like midwest that were added without enough qualification to work as a global tag.
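
One way to spot under-qualified values like midwest is to count how many entries carry each tag. This is a minimal sketch, assuming organizations.json is a list of objects that each have a "tags" array. The sample data is illustrative: the Hack for LA tags are the ones quoted above, while "Example Brigade" and its tags are made up.

```python
from collections import Counter

# Illustrative sample shaped like entries in organizations.json.
# "Example Brigade" is hypothetical and only exists to show a rare tag.
organizations = [
    {
        "name": "Hack for LA",
        "tags": [
            "Brigade",
            "Code for America",
            "Code for America Fiscally Sponsored Brigade",
            "Official",
        ],
    },
    {
        "name": "Example Brigade",
        "tags": ["Brigade", "Code for America", "Official", "midwest"],
    },
]


def tag_counts(orgs):
    """Count how many organizations carry each tag."""
    counts = Counter()
    for org in orgs:
        # set() guards against a tag being listed twice on one entry.
        counts.update(set(org.get("tags", [])))
    return counts


def rare_tags(orgs, max_count=1):
    """Return tags used by at most max_count organizations, sorted."""
    return sorted(tag for tag, n in tag_counts(orgs).items() if n <= max_count)


print(rare_tags(organizations))
```

Against the real file, a low count is only a hint that a tag is ad hoc; it still needs a human look before anything gets dropped.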

Some things this makes me wonder:

ckingbailey commented 4 years ago
* Do these represent tags that are already being used substantially "in the wild" on GitHub as project topics? Or is their use strictly internal to categorizing things within this list?

Did some quick research on this. I didn't find anything for:

It doesn't appear these tags are being used widely in the wild. I think @tdooner could speak to this more accurately, but I think the purpose of these tags is for querying CfAPI, as in /api/organizations?tags[]=Code%20for%20America&tags[]=Brigade
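
For reference, a query string of that shape can be built with Python's standard library. This is only a sketch of the client side: the helper name `organizations_query` is made up here, and nothing below is taken from the CfAPI code itself.

```python
from urllib.parse import quote, urlencode


def organizations_query(tags):
    """Build the path and query string for filtering organizations by tag."""
    # Repeating the "tags[]" key once per tag gives the array-style query.
    pairs = [("tags[]", tag) for tag in tags]
    # quote_via=quote encodes spaces as %20 (the default would use "+");
    # safe="[]" keeps the square brackets literal instead of %5B%5D.
    return "/api/organizations?" + urlencode(pairs, quote_via=quote, safe="[]")


print(organizations_query(["Code for America", "Brigade"]))
```

The printed path matches the example above; the host it would be appended to is left out because it is not given in this thread.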

* If use of these as GitHub topics in the wild is sparse, would we just snake-case all the tags in here and use that as a mapping in the index? It seems like many of them wouldn't be things we'd want tagged on GitHub projects.

snake_case, or kebab-case? I'd vote for kebab-case to be consistent with our GH topic tagging recommendations, unless there's a reason the index prefers underscores to hyphens. I'd rather not have to remember that it's snake_case here, kebab-case there, and camelCase somewhere else.
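
To make the kebab-case option concrete, here is a small sketch of the conversion. The function name `to_kebab` and the exact normalization rules (lowercase, collapse anything that is not a letter or digit into a single hyphen) are assumptions for illustration, not an agreed convention.

```python
import re


def to_kebab(tag):
    """Convert a display tag like "Code for America" to kebab-case."""
    # Lowercase first, then replace each run of non-alphanumeric
    # characters with one hyphen and trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", tag.lower()).strip("-")


for tag in ["Code for America", "Code for All", "Official"]:
    print(tag, "->", to_kebab(tag))
```

Swapping the hyphen for an underscore in the replacement would give the snake_case variant, so the choice stays a one-character change.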

* Should we add another field to organizations.json for "synonymous tags", so that at the same time a brigade declares that "hack-for-la" is their official projects tag, they might also expressly document that "code-for-america" and "civic-hacking" are some separate `recommended_project_tags`? Or should we just whitelist some of the existing tags in organizations.json like "Code for All" and "Code for America" and infer that `code-for-all` and `code-for-america` should be applied too to all projects tagged to that brigade?

I like the idea of whitelisting an official version for the ones that will be used across many brigades, and leaving them out of organizations.json. I'd still like to see a field on organizations.json to store a brigade's official topic tag, like openoakland.

tdooner commented 4 years ago

@ckingbailey Yes, the tags are meant for filtering of the list of Brigades. As far as I know, the only usage in the wild is the Brigade website:

https://github.com/codeforamerica/brigade/blob/master/cfapi/__init__.py#L84

I don't see a benefit to trying to represent these tags on GitHub repos, since they describe different things. The tags we're talking about for the project index are for tagging projects, whereas the tags in the CfAPI are for the brigades themselves.

themightychris commented 4 years ago

thanks for clearing that up @tdooner, these tags would be accurate at least for determining which organizations can be rolled up under code-for-all and code-for-america, right?

I guess what I'm really seeking to figure out is, as we start collecting the brigade tag for projects (e.g. projects_tag: "openoakland"), what other information from organizations.json would be accurate/helpful for projects to assume by relation? If this tags field isn't rigorously maintained I could see the answer being "nothing", but if it IS rigorously maintained... I think it would make sense at least for example for a search in the index for all projects matching code-for-america to also include all projects tagged openoakland because the Open Oakland entry in organizations.json is tagged "Code for America".

I could see it being useful for the crawler to apply these "inferred" tags while building the index. That is, tools using the index wouldn't need to know to include openoakland whenever someone wants to filter by code-for-america, because the index would already have the code-for-america or code-for-all tags inserted automatically.
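
A rough sketch of what that crawler step could look like, under several assumptions: each organization entry has a `projects_tag` field (proposed in this thread, not an existing field), the `INFERRED_TOPICS` whitelist is hypothetical, and the Open Oakland tags shown are sample values rather than a copy of the real entry.

```python
# Hypothetical whitelist: organization-level tag -> project-level topic.
INFERRED_TOPICS = {
    "Code for America": "code-for-america",
    "Code for All": "code-for-all",
}

# Sample entry; "projects_tag" is the field proposed in this thread.
organizations = [
    {
        "name": "Open Oakland",
        "projects_tag": "openoakland",
        "tags": ["Brigade", "Code for America", "Official"],
    },
]


def build_inference_table(orgs):
    """Map each brigade's projects_tag to the topics implied by its org tags."""
    table = {}
    for org in orgs:
        projects_tag = org.get("projects_tag")
        if not projects_tag:
            continue  # brigade has not declared a project tag
        table[projects_tag] = sorted(
            INFERRED_TOPICS[t] for t in org.get("tags", []) if t in INFERRED_TOPICS
        )
    return table


def apply_inferred_topics(project_topics, table):
    """Return the project's topics plus any inferred from its brigade tag."""
    topics = list(project_topics)
    for topic in project_topics:
        for inferred in table.get(topic, []):
            if inferred not in topics:
                topics.append(inferred)
    return topics


table = build_inference_table(organizations)
print(apply_inferred_topics(["openoakland", "transit"], table))
```

Doing the expansion at index-build time means a consumer filtering on code-for-america needs no knowledge of organizations.json at all.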

@ckingbailey thanks for surveying their use, sounds like we don't need to consider projects already using them so the question is just what can we infer from them?

  1. it seems like code-for-all / code-for-america could be inferred to projects based on these tags + projects_tag
  2. it seems like region tags (I saw midwest in there a couple times) weren't rigorously maintained and aren't usable
  3. it seems like "Brigade", "Government", and "Official" are rigorously maintained. Are there tags these should surface as at the project level or are they just interesting bits/filters for the orgs list?
    • @ExperimentsInHonesty what do you think? do these align with any standard tags you're looking at?

tdooner commented 4 years ago

> these tags would be accurate at least for determining which organizations can be rolled up under code-for-all and code-for-america, right?

I only maintain the tags for brigades tagged with "Code for America". The tags I maintain are: "Code for America Fiscally Sponsored Brigade", "Code for America Partner Brigade", and "Official". I don't maintain "Government".

> what other information from organizations.json would be accurate/helpful for projects to assume by relation

The "Official" tag is relevant because I will remove it when Brigades are no longer part of the network. We probably want to de-list their projects from the index at that time. Other than that, idk, just the other metadata fields about the brigade could be useful in how we display the project (e.g. "Click here to go to the brigade's website!" or something like that). But that can probably be handled at a layer other than the index data layer.
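
The de-listing rule could be sketched like this. Everything about the data shapes is an assumption: `projects_tag` on organizations is the field proposed earlier in the thread, `brigade_tag` on projects is a made-up name for whatever links a project to its brigade, and both sample entries are illustrative.

```python
# Sample data; the second organization has lost its "Official" tag.
organizations = [
    {"name": "Open Oakland", "projects_tag": "openoakland",
     "tags": ["Brigade", "Code for America", "Official"]},
    {"name": "Former Brigade", "projects_tag": "former-brigade",
     "tags": ["Brigade", "Code for America"]},
]

projects = [
    {"name": "project-a", "brigade_tag": "openoakland"},
    {"name": "project-b", "brigade_tag": "former-brigade"},
]


def official_project_tags(orgs):
    """Collect the projects_tag of every organization still tagged "Official"."""
    return {
        org["projects_tag"]
        for org in orgs
        if org.get("projects_tag") and "Official" in org.get("tags", [])
    }


def listed_projects(all_projects, orgs):
    """Keep only projects whose brigade is still tagged "Official"."""
    official = official_project_tags(orgs)
    return [p for p in all_projects if p.get("brigade_tag") in official]


print([p["name"] for p in listed_projects(projects, organizations)])
```

Because the filter reads the tag fresh on each build, removing "Official" from an entry would drop that brigade's projects on the next crawl without any extra bookkeeping.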

> it seems like region tags (I saw midwest in there a couple times) weren't rigorously maintained and aren't usable

Yeah, some people put those in there, and I didn't have the heart to remove them.

tdooner commented 4 years ago

I don't know how much you care about these tags @themightychris, but, you can also see how I maintain these tags. This is the script I have which reconciles the organizations.json file with Salesforce:

https://github.com/codeforamerica/brigade-information/blob/master/bin/merge-from-salesforce#L201-L212

I run this script once a week whenever there are changes to the brigade list.