frictionlessdata / datapackage

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.
https://datapackage.org
The Unlicense

bug: Cell value not interpreted as array #957

Open adrienDog opened 3 days ago

adrienDog commented 3 days ago

Hello!

If you can point me to where this would need to be fixed, I would gladly contribute.

Summary

When using describe

I'm getting {'name': 'test_array', 'type': 'string'}

instead of {'name': 'test_array', 'type': 'array', 'arrayItems': {'type': 'string'}}

Description

Setup

For a given test.csv with this content:

id,name,test_array
1,foo,'["a", "b", "c"]'
2,bar,'["d", "e", "f"]'

and the following code:

from pprint import pprint
from frictionless import describe

resource = describe('test.csv')
pprint(resource)

Actual

{'name': 'test',
 'type': 'table',
 'path': 'test.csv',
 'scheme': 'file',
 'format': 'csv',
 'mediatype': 'text/csv',
 'encoding': 'utf-8',
 'dialect': {'csv': {'skipInitialSpace': True}},
 'schema': {'fields': [{'name': 'id', 'type': 'integer'},
                       {'name': 'name', 'type': 'string'},
                       {'name': 'test_array', 'type': 'string'}]}}
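One thing that may be relevant to the output above is how the row is split into cells before any type detection happens. The sketch below uses Python's standard csv module purely as a stand-in (whether describe tokenizes the file the same way is an assumption, and split_row is a hypothetical helper). It compares the default double-quote quote character with a single-quote one on the first data row of test.csv:

```python
import csv
import io

# First data row of test.csv from the report, copied verbatim.
SAMPLE_ROW = "1,foo,'[\"a\", \"b\", \"c\"]'\n"


def split_row(line, quotechar='"'):
    """Hypothetical helper: split one CSV line with the stdlib csv reader."""
    reader = csv.reader(io.StringIO(line), quotechar=quotechar)
    return next(reader)


if __name__ == "__main__":
    # Default quote character: the single quotes are ordinary characters,
    # so the commas inside the brackets act as delimiters.
    print(len(split_row(SAMPLE_ROW)), split_row(SAMPLE_ROW))

    # Single-quote quote character: the array text stays in one cell.
    print(len(split_row(SAMPLE_ROW, quotechar="'")), split_row(SAMPLE_ROW, quotechar="'"))
```

If the same splitting applies here, the third column may never reach type detection as a single array-like value unless the file is read with a matching quote character.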

Expected

{'name': 'test',
 'type': 'table',
 'path': 'test.csv',
 'scheme': 'file',
 'format': 'csv',
 'mediatype': 'text/csv',
 'encoding': 'utf-8',
 'dialect': {'csv': {'skipInitialSpace': True}},
 'schema': {'fields': [{'name': 'id', 'type': 'integer'},
                       {'name': 'name', 'type': 'string'},
                       {'name': 'test_array', 'type': 'array', 'arrayItems': {'type': 'string'}}]}}
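To make the expected behaviour concrete, here is a minimal sketch of the kind of detection I have in mind. It is not frictionless code: infer_field_type is a hypothetical helper, and using json.loads to recognise arrays is my assumption about one possible approach. It only looks at the cell values of a single column, after the CSV has already been split into cells.

```python
import json


def infer_field_type(values):
    """Hypothetical helper: guess a field descriptor from a column's cell strings.

    Returns an array descriptor only if every cell parses as a JSON list;
    otherwise falls back to a plain string field.
    """
    if not values:
        return {"type": "string"}

    all_items_are_strings = True
    for value in values:
        try:
            parsed = json.loads(value)
        except (TypeError, ValueError):
            # Not valid JSON at all, so treat the column as text.
            return {"type": "string"}
        if not isinstance(parsed, list):
            # Valid JSON, but not an array (e.g. a number or an object).
            return {"type": "string"}
        if not all(isinstance(item, str) for item in parsed):
            all_items_are_strings = False

    if all_items_are_strings:
        return {"type": "array", "arrayItems": {"type": "string"}}
    # Mixed or non-string items: report an array without item constraints.
    return {"type": "array"}


if __name__ == "__main__":
    column = ['["a", "b", "c"]', '["d", "e", "f"]']
    field = {"name": "test_array", **infer_field_type(column)}
    print(field)
```

For the two cells in test.csv this gives the field descriptor shown in the expected output, while a column such as name would stay a string.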