source_id for Providers or Collections

openstate / open-cultuur-data

The back- and front-end code that powers the Open Cultuur Data API

http://opencultuurdata.nl/

source_id for Providers or Collections #56

Closed deepakbhatia closed 10 years ago

deepakbhatia commented 10 years ago

Can the source_id be provided in the Data sets section of the documentation? Not all collections have a source_id that is a simple concatenation of the entire name.

breyten commented 10 years ago

That would be a good addition, I guess. In the meantime, see the JSON.

deepakbhatia commented 10 years ago

Thanks. The amsterdammuseum, uukaarten, and erfgoed_leiden_beeldbank ids do not work (screenshot attached). I am using them like this:

curl -i -XPOST 'http://api.opencultuurdata.nl/v0/uukaarten/search' -d '{ "query": "utrecht", "size": 100 }'
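For anyone scripting this rather than using curl, here is a minimal Python sketch of the same request. It only builds the request objects and sends nothing. The base URL and the three source_ids are taken from this thread; the function name `build_search_request` and the `API_ROOT` constant are made up for illustration and are not part of the OCD code base.

```python
import json
import urllib.request

# Base URL as used in the curl command in this thread.
API_ROOT = "http://api.opencultuurdata.nl/v0"


def build_search_request(source_id, query, size=100):
    """Build (but do not send) a POST request for one collection's search endpoint.

    Hypothetical helper: it mirrors the curl call, nothing more.
    """
    url = f"{API_ROOT}/{source_id}/search"
    body = json.dumps({"query": query, "size": size}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


# One request per source_id mentioned in this thread.
requests_by_source = {
    source_id: build_search_request(source_id, "utrecht")
    for source_id in ("amsterdammuseum", "uukaarten", "erfgoed_leiden_beeldbank")
}
```

Passing one of these objects to `urllib.request.urlopen` would actually send it; whether a given source_id returns results depends on whether that collection has been crawled on the server.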

breyten commented 10 years ago

Looks like we didn't do the crawling yet for these on the live server. We will do it as soon as possible :-)

nikkitimmermans commented 10 years ago

Does this - crawling of Amsterdam Museum dataset and crawling of University of Utrecht dataset - need 2 new separate issues?

breyten commented 10 years ago

Nope :-)

breyten commented 10 years ago

Fetching from these museums as we speak. It'll be a while before they are done.

nikkitimmermans commented 10 years ago

Fire it up ; )

breyten commented 10 years ago

Amsterdam Museum (92K items) and UUKaarten (693 items) are loaded now.

We still have an issue with the Erfgoed Leiden.

nikkitimmermans commented 10 years ago

The UUKaarten set doesn't appear in search.opencultuurdata.nl

coret commented 10 years ago

@breyten what's the problem with Beeldbank Leiden?

breyten commented 10 years ago

@coret This is in the logs:

[2014-08-18 11:50:23,436] [ocd_backend] [extractor] [opensearch] [DEBUG] - Getting http://www.archiefleiden.nl/api/opensearch/ (params: {'q': u'"*:*"', 'count': 0})

breyten commented 10 years ago

@nikkitimmermans Hmm, do you have a search query for me? ;) (According to the index, there are 693 items in it.)

breyten commented 10 years ago

@nikkitimmermans Ah, I've found the error. The metadata seems to think it's tiff format, but the image link that we construct makes it point to a jpg. ocd search only looks for jpeg.

nikkitimmermans commented 10 years ago

@breyten: example image for UUKaarten: Holãd

http://objects.library.uu.nl/reader/img.php?obj=1874-20395&mode=1&img=/83/02/20/83022044923267117588354220098573988813.jpg

http://objects.library.uu.nl/reader/index.php?obj=1874-20395&mode=1

Still doesn't appear in search.opencultuurdata.nl

breyten commented 10 years ago

@nikkitimmermans: that's what I'm saying -- we get the wrong field for the image format. If you go here: http://api.opencultuurdata.nl/v0/uukaarten/98a34462fb211ec5a491cb8313049c36c3f679c6/source you will see the source data we get. The dc:format field says it's a tiff file. However, the link constructed above is obviously a jpg file. So we should update the transformer for uukaarten, then delete the old items, and refetch.
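To make the mismatch concrete, here is a rough Python sketch of one way a transformer could prefer the format implied by the constructed image link over the declared dc:format. This is an assumption for illustration only, not the actual uukaarten transformer or the fix tracked elsewhere; `EXTENSION_FORMATS`, `format_from_url` and `resolve_format` are hypothetical names, and the exact dc:format value in the source data may differ from the plain "tiff" used here.

```python
import os
from urllib.parse import parse_qs, urlparse

# Hypothetical mapping from file extension to a normalised format name.
EXTENSION_FORMATS = {
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".tif": "tiff",
    ".tiff": "tiff",
    ".png": "png",
}


def format_from_url(image_url):
    """Guess the image format from the `img` query parameter, else from the URL path."""
    parsed = urlparse(image_url)
    img_values = parse_qs(parsed.query).get("img")
    candidate = img_values[0] if img_values else parsed.path
    extension = os.path.splitext(candidate)[1].lower()
    return EXTENSION_FORMATS.get(extension)  # None when the extension is unknown


def resolve_format(declared_format, image_url):
    """Prefer the format implied by the link; fall back to the declared dc:format."""
    return format_from_url(image_url) or declared_format.lower()


# The example image link posted earlier in this thread.
example_url = (
    "http://objects.library.uu.nl/reader/img.php?obj=1874-20395&mode=1"
    "&img=/83/02/20/83022044923267117588354220098573988813.jpg"
)
resolved = resolve_format("tiff", example_url)
```

With the example link, the extension on the `img` parameter wins over the declared "tiff", so the item would be indexed as jpeg. Items already indexed with the old value would still need to be deleted and refetched.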

breyten commented 10 years ago

See #57 for the uukaarten thing.

nikkitimmermans commented 10 years ago

aha i see, didn't know it had to be refetched, but makes sense

breyten commented 10 years ago

Ah, yes I should've been clearer about that. Sorry! :)

coret commented 10 years ago

@breyten does https://github.com/openstate/open-cultuur-data/commit/d0dfc14cbd5c088ddf4e487767b7be086af7c052 describe your issue? It has a fix :-)