Open mdehollander opened 7 years ago
As discussed with @pdurbin and @andrewSC on irc: http://irclog.iq.harvard.edu/dataverse/2017-08-23#i_56203
Here's a one-liner to list files by DOI using curl and jq:
curl https://demo.dataverse.org/api/datasets/:persistentId?persistentId=doi:10.5072/FK2/PVH0HO | jq '.data.latestVersion.files[].dataFile.filename' -r
Fritz1.JPG
fritz2.JPG
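For anyone who prefers Python over curl and jq, here is a minimal sketch of the same extraction. It only assumes the JSON layout implied by the jq path above (`data.latestVersion.files[].dataFile.filename`). `SAMPLE_RESPONSE` is a trimmed, hypothetical stand-in for the real response, and `list_filenames` is a name made up for this example, not part of any client.

```python
import json

# Trimmed, hypothetical response; only the keys used by the jq path are kept.
SAMPLE_RESPONSE = """
{"status": "OK",
 "data": {"latestVersion": {"files": [
     {"dataFile": {"filename": "Fritz1.JPG"}},
     {"dataFile": {"filename": "fritz2.JPG"}}
 ]}}}
"""


def list_filenames(response_text):
    """Return the filenames listed in a dataset's latest version."""
    payload = json.loads(response_text)
    files = payload["data"]["latestVersion"]["files"]
    return [entry["dataFile"]["filename"] for entry in files]


if __name__ == "__main__":
    # Prints one filename per line, like the jq one-liner with -r.
    for name in list_filenames(SAMPLE_RESPONSE):
        print(name)
```

To run this against a live server, the text returned by the curl call above could be passed to `list_filenames` in place of the sample.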
It seems that the XML that is returned does not contain any entry tag; it is actually quite empty ;)
>>> dataverse = connection.get_dataverse('testing-journal-dataverses')
>>> dataverse.get_datasets()
https://demo.dataverse.org/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/testing-journal-dataverses
b'<feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Testing-journal-dataverses Dataverse</title><dataverseHasBeenReleased xmlns="http://purl.org/net/sword/terms/state">true</dataverseHasBeenReleased><generator uri="http://www.swordapp.org/" version="2.0"/></feed>'
[]
[]
Above I am printing the variables used in get_datasets:
def get_datasets(self, refresh=False, timeout=None):
    print(self.collection.get('href'))
    collection_info = self.get_collection_info(refresh, timeout=timeout)
    print(collection_info)
    entries = get_elements(collection_info, tag='entry')
    print(entries)
    return [Dataset.from_dataverse(entry, self) for entry in entries]
The code above uses the https://demo.dataverse.org/dvn/api/ URL and returns XML, but a direct call to https://demo.dataverse.org/api/datasets/:persistentId?persistentId=doi:10.5072/FK2/PVH0HO returns JSON. Has there been a major change to the API, so that the dvn part of the URL is no longer valid and JSON has replaced the XML output? @pdurbin, can you say something about this?
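The empty result can be reproduced offline by parsing the feed printed above. This is only a sketch: it uses the standard library's `xml.etree.ElementTree` as a stand-in for the client's own `get_elements` helper, whose internals are not shown in this thread, and `child_tags` and `find_entries` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Namespace of the Atom feed, in ElementTree's {uri}tag notation.
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# The feed exactly as printed by the debugging output in this thread.
FEED = (
    b'<feed xmlns="http://www.w3.org/2005/Atom">'
    b'<title type="text">Testing-journal-dataverses Dataverse</title>'
    b'<dataverseHasBeenReleased xmlns="http://purl.org/net/sword/terms/state">'
    b'true</dataverseHasBeenReleased>'
    b'<generator uri="http://www.swordapp.org/" version="2.0"/>'
    b'</feed>'
)


def child_tags(feed_xml):
    """Return the qualified tag names of the feed's direct children."""
    root = ET.fromstring(feed_xml)
    return [child.tag for child in root]


def find_entries(feed_xml):
    """Return all Atom <entry> elements directly under the feed root."""
    root = ET.fromstring(feed_xml)
    return root.findall(f"{ATOM_NS}entry")


if __name__ == "__main__":
    print(child_tags(FEED))
    print(find_entries(FEED))
```

The feed only carries a title, the release flag and the generator, so whatever the client does with entries afterwards, it has nothing to work with.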
@rliebz, since you did most of the work for this python api client, do you have time to get things working again?
Looking at the docs at http://guides.dataverse.org/en/latest/api/intro.html I realize that the python client is using the SWORD API that uses XML and that the link we were using to get the files is the Native API. What would be the recommended API for retrieving information (not depositing), SWORD or the Native API?
@mdehollander I likely won't have time in the near future to do any debugging/fixing here—it's been a couple years since I've done any Dataverse work.
As for the recommended API, the client was mostly written back when SWORD was the only option, but I would recommend sticking to the native API wherever possible. Ideally, this project would migrate over to the native API completely (I think there's a little bit of native functionality already in here), but the code is pretty tightly coupled to the structure of the XML that the SWORD API uses, so it might be a bit of a challenge.
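To give an idea of what a Native API version of the DOI lookup might look like, here is a rough sketch. It is not code from this client: `dataset_url` and `get_dataset_by_doi` are hypothetical standalone functions, the `fetch` parameter is an invented hook so the example can run without a network connection, and the check on a `"status": "OK"` field is an assumption about the response envelope.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def dataset_url(base_url, doi):
    """Build the Native API URL for a dataset identified by its DOI."""
    # Keep ':' and '/' unescaped so the URL matches the one used in this thread.
    query = urlencode({"persistentId": doi}, safe=":/")
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId?{query}"


def get_dataset_by_doi(base_url, doi, fetch=None):
    """Return the 'data' object of the JSON response, or None if not OK."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET using the standard library.
        def fetch(url):
            with urlopen(url) as response:
                return response.read()

    payload = json.loads(fetch(dataset_url(base_url, doi)))
    # Assumption: a successful response is wrapped as {"status": "OK", "data": ...}.
    if payload.get("status") != "OK":
        return None
    return payload.get("data")
```

Passing a fake `fetch` that returns canned JSON makes it possible to try the function without touching the demo server; leaving it out performs a real request.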
@rliebz, thanks for letting us know and for giving more information on the choice of APIs.
There was already a first attempt 3 years ago to make a Python client using the Native API: https://github.com/astrofrog/pyverse. It seems to work but does not have all the functionality.
I will see if I can get it working for my use case, and it would be great if others in the community would like to contribute to it as well.
Interesting. I see @astrofrog and I talked about pyverse at http://irclog.iq.harvard.edu/dataverse/2015-04-01#i_17806 but I completely forgot about it! @mdehollander if that's a good starting point, I say go for it.
I am experiencing this issue when using the python API:
A Dataverse object is obtained, but the get_dataset_by_doi call returns an empty result. Directly using an API request at https://demo.dataverse.org/api/datasets/:persistentId?persistentId=doi:10.5072/FK2/PVH0HO gives an OK JSON output.