openstate / open-cultuur-data

The back- and front-end code that powers the Open Cultuur Data API
http://opencultuurdata.nl/

Add a way to get a list of available sources from the API. See #74 #75

Closed breyten closed 9 years ago

breyten commented 10 years ago

Ugh, this is kind of a dup :P

Yay for Elasticsearch aggregations!

breyten commented 10 years ago

See #74

bartdegoede commented 10 years ago

Why don't you just use the combined index? The collection name is present there as well, so running the following aggregation only on ocd_combined_index should yield accurate counts:

{
    "aggs":  {
        "collections": {
            "terms": {
                "field": "meta.collection"
            }
        }
    },
    "size": 0
}
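
For reference, a minimal Python sketch of how that aggregation could be built and its result turned into per-collection counts. The helper names and the sample response are made up for illustration and are not taken from the OCD codebase. Actually sending the query (for example with `es.search(index="ocd_combined_index", body=...)` from the `elasticsearch` client) is left out, so the snippet runs without a cluster.

```python
def build_collections_query():
    """Return the search body from the comment above."""
    return {
        "aggs": {
            "collections": {
                # One bucket per distinct value of meta.collection
                "terms": {"field": "meta.collection"}
            }
        },
        # Top-level size 0: return no hits, only the aggregation
        "size": 0,
    }


def collection_counts(response):
    """Map each collection name to its document count."""
    buckets = response["aggregations"]["collections"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hypothetical response, shaped like an Elasticsearch terms aggregation result
sample_response = {
    "hits": {"total": 150, "hits": []},
    "aggregations": {
        "collections": {
            "buckets": [
                {"key": "example-collection-a", "doc_count": 100},
                {"key": "example-collection-b", "doc_count": 50},
            ]
        }
    },
}

counts = collection_counts(sample_response)
```

With the sample data above, `counts` maps each hypothetical collection name to its `doc_count`.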

Also, documentation and tests are missing :-)

breyten commented 10 years ago
justinvw commented 10 years ago

Yeah, the source_id and collection are really essential pieces of information for this endpoint.

To prevent having to hit all of the indexes, we could use the same sub-aggregations trick you are using, but instead of the _index field use the meta.source_id field.

Shouldn't the size of the first aggregation be set to 0 in order to get all of the values (in case there are more than 10 collections)?
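
A rough sketch of what that could look like, assuming the outer aggregation is on `meta.source_id` with `meta.collection` nested below it (the nesting order, the helper names and the sample response are assumptions, not the code from this PR). The `terms_size` parameter is passed to both terms aggregations. As far as I know, a terms aggregation returns only the top 10 buckets by default, and `"size": 0` meant "return all terms" in the Elasticsearch 1.x releases. Later versions dropped that behaviour, so there an explicit upper bound would be needed instead.

```python
def build_sources_query(terms_size=0):
    """Terms aggregation on meta.source_id with a sub-aggregation on meta.collection."""
    return {
        "aggs": {
            "sources": {
                "terms": {"field": "meta.source_id", "size": terms_size},
                "aggs": {
                    "collections": {
                        "terms": {"field": "meta.collection", "size": terms_size}
                    }
                },
            }
        },
        # Top-level size 0: no hits, only aggregation buckets
        "size": 0,
    }


def sources_from_response(response):
    """Flatten the nested buckets into a list of source descriptions."""
    sources = []
    for bucket in response["aggregations"]["sources"]["buckets"]:
        sources.append({
            "id": bucket["key"],
            "count": bucket["doc_count"],
            "collections": [c["key"] for c in bucket["collections"]["buckets"]],
        })
    return sources


# Hypothetical response for a single source that belongs to one collection
sample_response = {
    "aggregations": {
        "sources": {
            "buckets": [
                {
                    "key": "example_source",
                    "doc_count": 42,
                    "collections": {
                        "buckets": [{"key": "example-collection-a", "doc_count": 42}]
                    },
                }
            ]
        }
    }
}

sources = sources_from_response(sample_response)
```

This hits only the combined index and still gives both the `source_id` and the collection for every source.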

Some details about testing Flask apps can be found here: http://flask.pocoo.org/docs/0.10/testing/.

breyten commented 10 years ago

I had no idea source_id was actually in meta :P