GSA / code-gov-api

API powering the code.gov source code harvester
http://code.gov

Add endpoint for exposing the code.json file #202

Open IanLee1521 opened 6 years ago

IanLee1521 commented 6 years ago

It would be great to add an endpoint to the REST API to enable an end user to get the full code.json file for all the federal agencies. That would allow comparisons against the data from agencies.

For instance, I am interested in comparing Federal-wide information to the data at DOECode. Additionally, I am investigating other integrations with, for instance, GitHub.com/llnl/scraper

froi commented 6 years ago

@IanLee1521 this already exists. The endpoint /status/{agencyName}/fetched returns the code.json that was fetched in our last harvest.

You can try it out here. You do not need an API key to view this endpoint.

I will admit that the endpoint and the description in the swagger docs aren't descriptive enough to let you know their functionality. I'll keep this issue open and work on making them better.

froi commented 6 years ago

I'll add another thing to this. The endpoint signature should say {agencyAcronym}, not {agencyName}.

E.g. curl -X GET https://api.code.gov/status/GSA/fetched

I've taken note of it and will be fixing this as well.
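For anyone scripting against this endpoint, here is a minimal Python sketch of the same request as the curl example. The base URL and path come from the comment above; the function names are made up for illustration, and the sketch assumes the response body is a JSON object, which this thread does not confirm.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example in this thread.
API_BASE = "https://api.code.gov"


def fetched_url(agency_acronym: str) -> str:
    """Build the URL of the last-harvested code.json for one agency."""
    # Percent-encode the acronym so unusual characters cannot break the path.
    quoted = urllib.parse.quote(agency_acronym, safe="")
    return f"{API_BASE}/status/{quoted}/fetched"


def fetch_agency_code_json(agency_acronym: str, timeout: float = 30.0) -> dict:
    """Download and parse the fetched code.json.

    Assumption: the endpoint answers with a JSON object and needs no API key.
    """
    with urllib.request.urlopen(fetched_url(agency_acronym), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_agency_code_json("GSA")` would issue the same GET request as the curl command; error handling and retries are left out on purpose.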

IanLee1521 commented 6 years ago

Thanks @froi -- Is there (could there be) a similar endpoint for pulling the / a combined code.json file?

froi commented 6 years ago

There could be but it isn't on our roadmap. We used something similar to that for our front-end when it wasn't using the API. You can take a look at an example here. Keep in mind that this file is 4 months out of date.

The file was created by the code-gov-harvester. You could run it yourself and reproduce the file if you think it might be useful to you.
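If running the harvester is more than you need, a combined file could also be assembled from per-agency code.json documents that have already been downloaded. The sketch below is not the harvester's logic or its output format. It assumes each document lists its projects under a top-level `releases` key (as in code.json schema 2.0) or `projects` (older files), and the per-entry `agency` tag is a hypothetical addition so that entries stay traceable.

```python
from typing import Dict, List


def combine_inventories(inventories: Dict[str, dict]) -> dict:
    """Merge per-agency code.json documents into one combined document.

    `inventories` maps an agency acronym to its parsed code.json.
    """
    combined: List[dict] = []
    # Sort by acronym so the output order is deterministic.
    for acronym in sorted(inventories):
        doc = inventories[acronym]
        # Assumption: projects live under "releases", or "projects" in older files.
        for release in doc.get("releases", doc.get("projects", [])):
            entry = dict(release)  # shallow copy so the input is not mutated
            entry.setdefault("agency", acronym)  # hypothetical traceability tag
            combined.append(entry)
    return {"releases": combined}
```

The result is a plain dictionary, so `json.dump` can write it out as a single file.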

IanLee1521 commented 6 years ago

Got it. Perhaps that would be a good feature request to add into https://github.com/llnl/scraper to handle creating that...?

As an aside, do you have a static list of the agencyAcronym's that you're using?

froi commented 6 years ago

> As an aside, do you have a static list of the agencyAcronym's that you're using?

Yes, agency_metadata.json. We follow whatever acronym the agencies are using in their code.json.

E.g.

{
    "id": 1,
    "name": "Department of Agriculture",
    "acronym": "USDA",
    "website": "https://usda.gov/",
    "codeUrl": "https://www.usda.gov/code.json",
    "fallback_file": "USDA.json",
    "requirements": {
        "agencyWidePolicy": 0.75,
        "openSourceRequirement": 0,
        "inventoryRequirement": 0,
        "schemaFormat": 0.5,
        "overallCompliance": 0
    },
    "complianceDashboard": true
}
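A short Python sketch of pulling the acronyms out of that file follows. It assumes agency_metadata.json is a JSON array of objects shaped like the entry above; the thread shows only a single entry, so the overall layout is a guess, and the function name is made up for illustration.

```python
import json
from typing import List


def agency_acronyms(metadata_text: str) -> List[str]:
    """Return the acronym of every agency entry, in file order.

    Assumption: the file is a JSON array of objects with an "acronym" key.
    """
    entries = json.loads(metadata_text)
    # Skip entries without an acronym rather than failing on them.
    return [entry["acronym"] for entry in entries if "acronym" in entry]
```

Each returned acronym could then be substituted into /status/{agencyAcronym}/fetched.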

> Perhaps that would be a good feature request to add into https://github.com/llnl/scraper to handle creating that...?

I don't see why not.

Something to keep in mind is that our inventory includes non-open source projects as well. If this feature is added to your project, that would be awesome, but it would only include part of the agencies' inventory unless you also use their code.json files. Does that make sense?
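To make the distinction concrete, here is a rough Python sketch that separates the open source entries from the rest of an inventory. It assumes each entry carries a `permissions.usageType` field with the value `"openSource"` for open source projects, as in code.json schema 2.0; files in other schema versions may not have it, and the function name is made up for illustration.

```python
from typing import Iterable, List


def open_source_only(releases: Iterable[dict]) -> List[dict]:
    """Keep only the entries marked as open source.

    Assumption: schema 2.0 style entries with permissions.usageType.
    """
    kept = []
    for release in releases:
        # Tolerate a missing or null "permissions" object.
        permissions = release.get("permissions") or {}
        if permissions.get("usageType") == "openSource":
            kept.append(release)
    return kept
```

Comparing the length of the filtered list with the full list gives a rough sense of how much of an agency's inventory a scraper limited to public repositories would miss.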