Closed tdipisa closed 6 years ago
Could you please verify the harvesting of Comune di Palermo? It seems they fixed the issue, but we need to verify that. If the issue is still present, please let me know and I will contact Palermo and work with them to resolve it. Thanks a lot!
@giorgialodi description updated
Guys, once we have done the high-priority administrations, we need to proceed with those in the DatiPubblici task force, since we need to import their datasets into the DAF through the national catalogue. As soon as these tests are OK, I will indicate which ones come next.
Università di Bologna: parsing errors. See #163
@etj as for MIUR, it is USTAT, right? The other URL is not working, correct?
if the other URL from MIUR is not working, try with the following one http://dati.istruzione.it/opendata/CatalogoRDF (the organization is still MIUR)
@etj do you have errors in "Comune di Milano" harvesting? I see there are 277 datasets; however they have 293 datasets in their catalogue (https://dati.comune.milano.it/dataset)
@giorgialodi I found these errors for Milano:
{'Name': 'That URL is already in use.'}
{'Temporal coverage': 'Invalid date input: 22015-03-11'}
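The second error points to a malformed year in the source metadata. As a minimal sketch, assuming temporal coverage values are expected as ISO `YYYY-MM-DD` strings, such values could be spotted before harvesting. The helper name `is_valid_iso_date` is hypothetical and not part of CKAN or the harvester:

```python
from datetime import date


def is_valid_iso_date(value: str) -> bool:
    """Return True if value parses as an ISO calendar date (hypothetical helper)."""
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


# The second sample is the value reported in the error above.
samples = ["2015-03-11", "22015-03-11"]
invalid = [s for s in samples if not is_valid_iso_date(s)]
print(invalid)
```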
@etj could you please point out the datasets for which these errors are raised? That way we can communicate them to Milano. Can I extract this information myself without bothering you further? Should I register on your testing instance?
@etj we are analysing the "URL already in use" errors for Comune di Milano. It seems that the error occurs when datasets have the same title. However, there should be a mechanism that adds extra characters to the URL when this happens. Why does it not work here? The catalogue of Comune di Milano handles that case.
@etj @tdipisa I was looking at the sources. There is an error for the MIUR catalogue. Any hints? Their file seems fine to me, at least from a DCAT-AP_IT perspective.
@giorgialodi the mime type returned by the MIUR service is wrong (text/plain), so the parser will not recognize it.
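A possible workaround on the harvester side, sketched here under assumptions, is to fall back to sniffing the payload when the declared content type is not an RDF one. The function `guess_rdf_format` and the mapping below are hypothetical and are not the harvester's actual code. The returned strings are format names that could then be passed explicitly to rdflib's `Graph.parse(format=...)`:

```python
from typing import Optional

# Assumed mapping from RDF content types to parser format names.
KNOWN_TYPES = {
    "application/rdf+xml": "xml",
    "text/turtle": "turtle",
    "application/ld+json": "json-ld",
    "application/n-triples": "nt",
}


def guess_rdf_format(content_type: str, body: str) -> Optional[str]:
    """Pick a parser format from the header, else from the first bytes of the body."""
    mime = content_type.split(";")[0].strip().lower()
    if mime in KNOWN_TYPES:
        return KNOWN_TYPES[mime]
    # Header is unhelpful (e.g. text/plain): look at how the document starts.
    head = body.lstrip()[:200]
    if head.startswith("<?xml") or head.startswith("<rdf:RDF"):
        return "xml"
    if head.startswith("@prefix") or head.startswith("@base"):
        return "turtle"
    if head.startswith("{") or head.startswith("["):
        return "json-ld"
    return None  # Unknown: let the caller report the error.


print(guess_rdf_format("text/plain", '<?xml version="1.0"?><rdf:RDF/>'))
```

Fixing the mime type at the source remains the cleaner solution, since sniffing can misclassify unusual files.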
New organizations and sources have been added to the docker image https://github.com/geosolutions-it/dati-ckan-docker/issues/16
@etj thanks a lot. I will inform MIUR about that. There is still the doubt (a few comments above) about the "URL already in use" errors for Comune di Milano (an error that also occurred for Regione Toscana, didn't it?). Could you please help me understand how to cope with that type of error? We are currently in contact with Comune di Milano, who told us they have chosen to use the same title for datasets that are similar.
@giorgialodi, a new issue (#168) has been created to keep track of this and provide an improvement if you want. In CKAN there cannot be two datasets with the same Id (and the Id is generated from the name of the harvested dataset). If you agree I will add this improvement to the task list and provide an estimate.
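To illustrate the kind of improvement discussed here, the following is a minimal sketch of deriving a unique name from a title by appending a numeric suffix on collision. It is not the harvester's actual code: `slugify`, `unique_name`, and the sample titles are made up for illustration.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of non-alphanumerics with a dash."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def unique_name(title: str, taken: set) -> str:
    """Return a slug not yet in `taken`, adding -2, -3, ... when needed."""
    base = slugify(title)
    candidate = base
    n = 1
    while candidate in taken:
        n += 1
        candidate = f"{base}-{n}"
    taken.add(candidate)
    return candidate


taken = set()
titles = ["Bilancio 2015", "Bilancio 2015", "Bilancio 2015"]  # made-up titles
names = [unique_name(t, taken) for t in titles]
print(names)
```

One caveat of this approach is that the suffix depends on the order in which datasets are processed, so a re-harvest would need a stable ordering (or a lookup by the source identifier) to keep names consistent.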
New Harvest sources must be tested:
[x] Comune di Milano: http://dati.comune.milano.it/catalog.rdf
[x] MIUR: http://dati.ustat.miur.it/catalog.rdf
[ ] MIUR: http://dati.istruzione.it/opendata/opendata/catalog (to try if we can harvest from this URL) (**returns an HTML page**)
[ ] MIUR: http://dati.istruzione.it/opendata/CatalogoRDF (Error: Error parsing the RDF file: No plugin registered for (text/plain, <class 'rdflib.parser.Parser'>))
[x] Università di Bologna: https://dati.unibo.it/catalog.rdf
[x] Comune di Palermo: https://opendata.comune.palermo.it/dcat/dcat.php
[x] We also need to create an organization for each of the above
For the current status and error details, refer to: https://docs.google.com/spreadsheets/d/1RhTOpO1VJTDvn8LdEYCnMvP6Fim21mNANR9D-f7Dkzs/edit#gid=0