lexibank / lexibank-analysed

Study on lexibank data (presenting the lexibank dataset).
Creative Commons Attribution 4.0 International

Zenodo download #27

Closed johenglisch closed 3 years ago

johenglisch commented 3 years ago

So, I got Zenodo downloads working – in theory. Two problems:

(also, why doesn't GitHub allow me to ping multiple people using the Reviewers system…?)

xrotwang commented 3 years ago

Limit on the number of reviewers is due to the repos being private.

xrotwang commented 3 years ago

Hm. Maybe we should

xrotwang commented 3 years ago

https://zenodo.org/oai2d?verb=ListRecords&set=user-lexibank&metadataPrefix=oai_dc has these elements:

<dc:relation>url:https://github.com/lexibank/papuanvoices/tree/v1.0</dc:relation>

xrotwang commented 3 years ago

And the version tag could be plugged into URLs of this form: https://github.com/lexibank/papuanvoices/archive/refs/tags/v1.0.zip
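A minimal sketch of that mapping, assuming every relevant `dc:relation` value has the same `url:https://github.com/<org>/<repo>/tree/<tag>` shape as the example above. The function name and the regular expression are hypothetical and not taken from the repository:

```python
import re

# Matches dc:relation values shaped like the example in this thread:
#   url:https://github.com/lexibank/papuanvoices/tree/v1.0
# Assumption: other datasets in the community follow the same pattern.
RELATION_PATTERN = re.compile(
    r"^url:https://github\.com/(?P<org>[^/]+)/(?P<repo>[^/]+)/tree/(?P<tag>.+)$"
)


def archive_url_from_relation(relation):
    """Return the tag archive URL for a dc:relation value, or None if it does not match."""
    match = RELATION_PATTERN.match(relation.strip())
    if match is None:
        return None
    # Plug org, repo and version tag into the archive URL form given above.
    return "https://github.com/{org}/{repo}/archive/refs/tags/{tag}.zip".format(
        **match.groupdict()
    )


print(archive_url_from_relation(
    "url:https://github.com/lexibank/papuanvoices/tree/v1.0"
))
```

Returning `None` for non-matching values lets the caller skip relations that point somewhere other than a GitHub tag.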

johenglisch commented 3 years ago

Yeah, that looks useful – I'll try that, thanks

LinguList commented 3 years ago

Thanks to all of you. I'll let you decide this, as these downloads are not my strong point, if you are okay with that.

johenglisch commented 3 years ago

Okay, works in theory. Unfortunately, only in theory again… Apparently, some datasets don't show up in the metadata round-up… (like allenbai; 10.5281/zenodo.5115649 for instance)

johenglisch commented 3 years ago

And I double-checked: allenbai is part of the Lexibank community. The DOI checks out. It just doesn't show up in the XML.

xrotwang commented 3 years ago

Ah, OAI-PMH comes in batches. You have to check the resumption token, and provide it with the next request. See here: https://github.com/dlce-eva/zenodoclient/blob/35d20c68221cb30ef0a699be39d97a08130389b8/src/zenodoclient/oai.py#L96-L121

xrotwang commented 3 years ago

More specifically, here: https://github.com/dlce-eva/zenodoclient/blob/35d20c68221cb30ef0a699be39d97a08130389b8/src/zenodoclient/oai.py#L115-L120
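For illustration, the batching could be handled roughly as follows. This is a standard-library sketch, not the zenodoclient code linked above; `fetch_page` and `iter_records` are hypothetical names, and the set and metadata prefix are copied from the URL earlier in this thread. In OAI-PMH a follow-up request carries only `verb` and `resumptionToken`, and an empty or missing token marks the last batch.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
BASE_URL = "https://zenodo.org/oai2d"


def fetch_page(params):
    """Fetch one OAI-PMH response and return the raw XML (hypothetical helper)."""
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return response.read()


def iter_records(fetch=fetch_page, set_spec="user-lexibank", metadata_prefix="oai_dc"):
    """Yield every <record> element, following resumption tokens across batches."""
    params = {"verb": "ListRecords", "set": set_spec, "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for record in root.iter(OAI_NS + "record"):
            yield record
        token = root.find(".//" + OAI_NS + "resumptionToken")
        # A missing or empty token means this was the final batch.
        if token is None or not (token.text or "").strip():
            break
        # The token replaces set and metadataPrefix in the next request.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

Passing `fetch` as a parameter keeps the paging logic testable without network access, e.g. by supplying a function that returns canned XML.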

johenglisch commented 3 years ago

I think this should be ready for merge now