DanePubliczneGovPl / ckanext-danepubliczne

Layout and custom fields for DanePubliczne.gov.pl

Openness detection for RDF/N-Triples #284

Open KrzysztofMadejski opened 7 years ago

KrzysztofMadejski commented 7 years ago

Example: https://danepubliczne.gov.pl/dataset/obiekty-wpisane-na-liste-swiatowego-dziedzictwa-unesco/resource/fb4b39e6-aeff-4fa4-a401-04b6afb8b5cf

Add detection to ckan-qa. Ref #275

KrzysztofMadejski commented 7 years ago

With the new ckanext-qa: there is no support for N-Triples.

Processing: https://danepubliczne.gov.pl/dataset/3791090e-24a6-4566-98e2-b06a822af46b/resource/fb4b39e6-aeff-4fa4-a401-04b6afb8b5cf/download/ObiektyUnesco.nt

[2017-03-15 10:57:26,541: INFO/PoolWorker-1] archiver.update_package[rdf-n-triples/95dd]: Starting update_package task: package_id=u'3cbdc2a9-69c2-4fab-83bd-f53c47468a6f' queue=priority
[2017-03-15 10:57:26,638: INFO/PoolWorker-1] archiver.update_package[rdf-n-triples/95dd]: Attempting to download resource: https://danepubliczne.gov.pl/dataset/3791090e-24a6-4566-98e2-b06a822af46b/resource/fb4b39e6-aeff-4fa4-a401-04b6afb8b5cf/download/ObiektyUnesco.nt
[2017-03-15 10:57:26,642: INFO/PoolWorker-1] Starting new HTTPS connection (1): danepubliczne.gov.pl
[2017-03-15 10:57:26,842: INFO/PoolWorker-1] archiver.update_resource[rdf-n-triples/95dd]: GET started successfully. Content headers: {'transfer-encoding': 'chunked', 'accept-ranges': 'bytes', 'server': 'Apache/2.4.7 (Ubuntu)', 'last-modified': 'Tue, 10 Jan 2017 14:04:05 GMT', 'content-range': 'bytes 0-937556/937557', 'etag': '"1484057045.11-937557"', 'pragma': 'no-cache', 'cache-control': 'no-cache', 'date': 'Wed, 15 Mar 2017 09:54:27 GMT', 'content-type': 'application/octet-stream'}

[2017-03-15 10:57:27,750: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Openness scoring package rdf-n-triples (1 resources)
[2017-03-15 10:57:27,755: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Sniffing file format of: /home/ckan/data/archiver/9c/9c84b918-7248-450a-b222-01e8d4afcdc6/ObiektyUnesco.nt
[2017-03-15 10:57:27,764: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Magic detects file as: text/plain
[2017-03-15 10:57:27,764: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Mimetype translates to filetype: TXT
[2017-03-15 10:57:27,766: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Not JSON - 0 matches
[2017-03-15 10:57:27,857: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Is CSV because 12.1 cells per row (217 cells, 18 rows)
[2017-03-15 10:57:27,857: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Score: 3 Reason: Content of file appeared to be format "CSV" which receives openness score: 3.
[2017-03-15 10:57:27,857: INFO/PoolWorker-1] qa.update_package[rdf-n-triples-4b0a]: Openness scoring: 
{'openness_score_reason': 'Content of file appeared to be format "CSV" which receives openness score: 3.', 'openness_score': 3, 'archival_timestamp': '2017-03-15T10:57:27.264783', 'format': 'CSV'}
<Resource id=9c84b918-7248-450a-b222-01e8d4afcdc6 package_id=3cbdc2a9-69c2-4fab-83bd-f53c47468a6f url=https://danepubliczne.gov.pl/dataset/3791090e-24a6-4566-98e2-b06a822af46b/resource/fb4b39e6-aeff-4fa4-a401-04b6afb8b5cf/download/ObiektyUnesco.nt format= description= hash= position=0 name=RDF/n-triples resource_type=ntriples mimetype=None mimetype_inner=None size=None created=2017-03-15 10:57:26.071495 last_modified=None cache_url=None cache_last_updated=None webstore_url=None webstore_last_updated=None url_type=None extras={} state=active revision_id=44728972-a343-407e-ba9c-706fa651b11b>
u'https://danepubliczne.gov.pl/dataset/3791090e-24a6-4566-98e2-b06a822af46b/resource/fb4b39e6-aeff-4fa4-a401-04b6afb8b5cf/download/ObiektyUnesco.nt'

KrzysztofMadejski commented 7 years ago

To be implemented.
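
In the log above the .nt file is sniffed as text/plain and then falls through to the CSV heuristic, so a content check for N-Triples would presumably have to run before that step. Below is a minimal sketch of such a check. It is not part of ckanext-qa: the function name `looks_like_ntriples`, the `max_lines` and `min_ratio` parameters, and the simplified regular expressions are all assumptions made for illustration, and the grammar is deliberately looser than the full N-Triples specification.

```python
import re

# Simplified N-Triples terms (assumption: looser than the W3C grammar).
IRI = r'<[^<>\s]*>'
BNODE = r'_:\S+'
LITERAL = (r'"(?:[^"\\\n\r]|\\.)*"'          # quoted string with escapes
           r'(?:\^\^' + IRI +                # optional datatype IRI
           r'|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?')  # or optional language tag

# subject predicate object, terminated by a dot, optional trailing comment
TRIPLE = re.compile(
    r'^(?:%s|%s)\s+%s\s+(?:%s|%s|%s)\s*\.\s*(?:#.*)?$'
    % (IRI, BNODE, IRI, IRI, BNODE, LITERAL)
)


def looks_like_ntriples(text, max_lines=1000, min_ratio=0.95):
    """Hypothetical helper: guess whether `text` is N-Triples.

    Checks up to `max_lines` non-empty, non-comment lines and returns True
    when at least `min_ratio` of them look like a single triple.
    """
    checked = 0
    matched = 0
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # blank lines and comments carry no signal
        checked += 1
        if TRIPLE.match(stripped):
            matched += 1
        if checked >= max_lines:
            break
    return checked > 0 and matched / checked >= min_ratio
```

A threshold is used instead of requiring every line to match, in the same spirit as the cells-per-row heuristic shown in the log; whether 0.95 is a sensible value would need checking against real files such as ObiektyUnesco.nt.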