Closed montxo5 closed 10 years ago
You can't install individual plugins, but this extension is basically centred around the DCAT harvester so you want to install the whole lot anyway.
I've added some install instructions:
Thank you. I've installed it and configured a harvester for an RDF/XML source, but gather_consumer.log shows this error: ERROR [ckanext.harvest.queue] No harvester could be found for source type dcat_xml
It seems that the queue can't find the harvester for RDF_XML.
Sorry, my fault. I didn't restart the supervisor... Now the JSON harvester is working, but the XML is always crashing with this error in the fetch_consumer: ValueError: The provided document does not seem to contain a dcat:Dataset element
I've also tried with the example files. Thanks in advance.
Full trace:
File "/usr/lib/ckan/default/bin/paster", line 9, in
@montxo5 this looks like a bug in the XML parsing. I'll try and push the fix in the next couple of days
@montxo5 can you see if the latest changes in d289c5871 fix the issue?
Thank you very much! Now it's working perfectly with your example.
I'm now trying with another DCAT file, from the Open Data Portal of Madrid. The datasets are imported fine, but the resources are ignored, so it only creates empty datasets.
The RDF is here: http://datos.madrid.es/egob/catalogo.rdf Could it be that the RDF they publish is not correct? Thanks.
Hi @montxo5.
In DCAT land, the distributions are defined using the dcat:Distribution class. So for example, if you are using XML/RDF:
<dcat:distribution>
  <dcat:Distribution>
    <dct:title xml:lang="es">Consultas ciudadanas (2004-2013)</dct:title>
    <!-- ... -->
  </dcat:Distribution>
</dcat:distribution>
Note that the Madrid portal is using the dcat:Download class, which AFAICT does not exist:
<dcat:distribution>
  <dcat:Download>
    <dct:title xml:lang="es">Consultas ciudadanas (2004-2013)</dct:title>
    <!-- ... -->
  </dcat:Download>
</dcat:distribution>
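If you want to check a feed for this before harvesting it, here is a rough sketch using only the Python standard library. It is not part of the extension: `check_distributions` and `SAMPLE` are names made up for illustration, and it assumes the feed uses the standard DCAT namespace (`http://www.w3.org/ns/dcat#`) and nests the distribution node directly inside `dcat:distribution`, as in the snippets above. Feeds that link distributions via `rdf:resource` or declare the class with `rdf:type` would need a real RDF parser instead.

```python
import xml.etree.ElementTree as ET

# Standard DCAT namespace (assumed to be the one the feed declares).
DCAT = "http://www.w3.org/ns/dcat#"


def check_distributions(rdf_xml):
    """Return (number of dcat:Dataset elements, list of unexpected tags).

    "Unexpected" means any element nested directly inside a
    dcat:distribution property that is not dcat:Distribution.
    """
    root = ET.fromstring(rdf_xml)
    datasets = list(root.iter("{%s}Dataset" % DCAT))
    unexpected = []
    for prop in root.iter("{%s}distribution" % DCAT):
        for child in prop:
            if child.tag != "{%s}Distribution" % DCAT:
                unexpected.append(child.tag)
    return len(datasets), unexpected


# Hypothetical document mirroring the Madrid snippet above.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#"
         xmlns:dct="http://purl.org/dc/terms/">
  <dcat:Dataset rdf:about="http://example.com/dataset/1">
    <dcat:distribution>
      <dcat:Download>
        <dct:title xml:lang="es">Consultas ciudadanas (2004-2013)</dct:title>
      </dcat:Download>
    </dcat:distribution>
  </dcat:Dataset>
</rdf:RDF>"""

count, unexpected = check_distributions(SAMPLE)
print(count, unexpected)
```

A dataset count of zero would correspond to the "does not seem to contain a dcat:Dataset element" situation reported earlier, and a non-empty second value points at class names like the dcat:Download one above.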
We followed the recommendations of the DCAT Application Profile for Data Portals in Europe as the basis for our support for harvesting DCAT-based documents, in case you want a reference.
Also, check the examples folder of this extension to see the serializations supported.
Hope this helps.
Thank you very much! You're right. I will try to contact Madrid's Open Data portal to explain it. For your information, we're trying to use this extension for a BigOpenPlatform, to be used in a datathon event with Madrid's city hall called MADdata. If you are interested, or if you know someone who might be, please check this page: http://maddata.es/ If we finally use this extension, we will mention it in the presentation. Thanks.
That looks great @montxo5, hope it's a good one in Madrid!
Closing the issue now
I'm very interested in this extension, but especially in the DCAT harvester. Would it be possible to install only this function, and how can I do it? Thanks in advance.