atviriduomenys / spinta

Spinta is a framework to describe, extract and publish data (a DEP Framework).
MIT License

Add inspect for CSV files #87

Open sirex opened 3 years ago

sirex commented 3 years ago

Add inspect support for CSV files, including the case where the CSV files are in a ZIP archive:

$ spinta inspect -r csv https://example.com/data.zip -p 'extract("zip")'

This should work as described in the docs.

Test with data from Regitra.
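For illustration, here is a minimal sketch of what the ZIP case involves: opening the archive, picking out the CSV members, and reading each header row. It uses only the Python standard library and is not Spinta's implementation. The helper name `csv_headers_in_zip` and the in-memory sample archive are hypothetical, and UTF-8 encoding is an assumption.

```python
import csv
import io
import zipfile


def csv_headers_in_zip(zip_bytes: bytes) -> dict[str, list[str]]:
    """Map each .csv member of a ZIP archive to its header row.

    Hypothetical helper for illustration, not part of Spinta.
    """
    headers = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Skip anything that is not a CSV file.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                # Assumption: the CSV members are UTF-8 encoded.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                headers[name] = next(csv.reader(text), [])
    return headers


# Build a small archive in memory so the sketch runs without network access.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("miestai.csv", "pavadinimas,plotas\nKaunas,40 h\n")
    archive.writestr("readme.txt", "not a CSV file")

print(csv_headers_in_zip(buffer.getvalue()))
```

Each header row found this way would become the list of candidate property names for one inspected model.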

Inspect also fails when the command is given a plain CSV file directly:

$ spinta inspect -r csv miestai.csv -o miestai.xlsx

This gives the following error:

TabularManifestError: miestai.csv:1: Unknown columns: pavadinimas, plotas, gyventojų skaičius, šalis.

A more detailed traceback and example data can be found in the comments.

sirex commented 3 years ago

marked this issue as related to manifest#27

sirex commented 3 years ago

marked this issue as related to manifest#135

sirex commented 3 years ago

marked this issue as related to manifest#28

sirex commented 3 years ago

removed the relation with manifest#28

sirex commented 3 years ago

marked this issue as related to manifest#28

karina-klinkeviciute commented 3 months ago

  File "spinta/cli/inspect.py", line 58, in inspect
    context, manifest = create_manifest_from_inspect(
  File "spinta/datasets/inspect/helpers.py", line 81, in create_manifest_from_inspect
    _merge(context, manifest, manifest, resource, has_manifest_priority, dataset)
  File "spinta/datasets/inspect/helpers.py", line 103, in _merge
    store = load_manifest(context, full_load=True)
  File "spinta/cli/helpers/store.py", line 120, in load_manifest
    commands.load(
  File "spinta/manifests/yaml/commands/load.py", line 110, in load
    commands.load(
  File "spinta/manifests/tabular/commands/load.py", line 59, in load
    load_manifest_nodes(context, into, schemas, source=manifest)
  File "spinta/manifests/helpers.py", line 133, in load_manifest_nodes
    for eid, schema in schemas:
  File "spinta/manifests/tabular/helpers.py", line 1631, in read_tabular_manifest
    yield from _read_tabular_manifest_rows(
  File "spinta/manifests/tabular/helpers.py", line 1569, in _read_tabular_manifest_rows
    header = _detect_header(path, 1, header)
  File "spinta/manifests/tabular/helpers.py", line 122, in _detect_header
    raise TabularManifestError(
spinta.manifests.tabular.helpers.TabularManifestError: miestai.csv:1: Unknown columns: pavadinimas, plotas, gyventojų skaičius, šalis.

karina-klinkeviciute commented 3 months ago

Contents of the CSV file:

Pavadinimas,plotas,gyventojų skaičius,šalis
Kaunas,40 h,200000,Lietuva
Vilnius,60 h,400000,Lietuva
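The traceback goes through `read_tabular_manifest` and `_detect_header`, which suggests that `miestai.csv` was read as a tabular manifest rather than as a data source. The sketch below shows how such a header check would reject this file. It is an illustration under assumptions, not Spinta's code. The column set `ASSUMED_MANIFEST_COLUMNS` is an approximation of the manifest table columns, and the lowercasing step is inferred from the error message, which reports `pavadinimas` although the file says `Pavadinimas`.

```python
import csv
import io

# Assumption: approximate set of tabular manifest column names.
# The authoritative list lives in Spinta itself and may differ.
ASSUMED_MANIFEST_COLUMNS = {
    "id", "dataset", "resource", "base", "model", "property", "type",
    "ref", "source", "prepare", "level", "access", "uri", "title",
    "description",
}


def unknown_columns(header: list[str]) -> list[str]:
    """Return header names that are not manifest columns (hypothetical check)."""
    # Assumption: names are compared after stripping and lowercasing.
    normalized = [name.strip().lower() for name in header]
    return [name for name in normalized if name not in ASSUMED_MANIFEST_COLUMNS]


# The CSV contents quoted in the comment above.
data = (
    "Pavadinimas,plotas,gyventojų skaičius,šalis\n"
    "Kaunas,40 h,200000,Lietuva\n"
    "Vilnius,60 h,400000,Lietuva\n"
)

header = next(csv.reader(io.StringIO(data)))
print(unknown_columns(header))
# → ['pavadinimas', 'plotas', 'gyventojų skaičius', 'šalis']
```

Under these assumptions every column of the data file is unknown to the manifest reader, which matches the reported error. A fix would therefore need to route `-r csv` resources to a data reader instead of the manifest loader.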