predicador37 / pyjstat

pyjstat is a Python library for manipulating JSON-stat data. It reads and writes the JSON-stat format using the DataFrame structures provided by the pandas library.
Apache License 2.0

Cannot parse JSON-stat to DataFrame :: Using a JSON-stat generated by pyjstat :: There and (not) back again #36

Open acmeguy opened 1 year ago

acmeguy commented 1 year ago

Line 394: output = pd.DataFrame([category + [values[i]] for i, category in enumerate(get_df_row(dimensions, naming))])


IndexError: list index out of range

dim_names = ['timestamp', 'OBJECTNO']

acmeguy commented 1 year ago

len([i for i, category in enumerate(get_df_row(dimensions, naming))]), len(values) gives: (11274, 9360)

The enumeration yields far more rows (11274) than there are values (9360), so values[i] eventually indexes past the end of the list.
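
A quick way to check for this kind of mismatch before converting, sketched with plain pandas. It assumes the row generator enumerates every combination of dimension categories, so a frame with missing combinations has fewer values than rows to fill. The helper name `cell_counts` and the toy data are hypothetical, not part of pyjstat.

```python
import math

import pandas as pd


def cell_counts(df, dim_names):
    """Return (expected_cells, actual_rows) for the given dimension columns.

    Hypothetical helper: a dense cube has one cell per combination of
    dimension categories, so expected_cells is the product of the number
    of distinct categories in each dimension column.
    """
    expected = math.prod(df[name].nunique() for name in dim_names)
    return expected, len(df)


# Toy data (made up): 2 timestamps x 2 object numbers = 4 cells,
# but only 3 rows are present.
toy = pd.DataFrame({
    'timestamp': ['t1', 't1', 't2'],
    'OBJECTNO': ['a', 'b', 'a'],
    'value': [1.0, 2.0, 3.0],
})

expected, actual = cell_counts(toy, ['timestamp', 'OBJECTNO'])
print(expected, actual)
```

If the two numbers differ, the frame is either sparse (fewer rows than cells) or contains repeated dimension combinations (more rows than cells).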

acmeguy commented 1 year ago

This round trip fails:

js_dataset = pyjstat.Dataset.read(some_df)
another_df = js_dataset.write('dataframe')

acmeguy commented 1 year ago

Another version of a failing roundtrip:

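Test case:

# Imports assumed from the usual pyjstat usage; the original snippet omitted them.
import pandas as pd
from pyjstat import pyjstat

df_some = pd.DataFrame([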
    {'Date': '2007-01-01', 'Variables': 'Gasolina 95 E5 Premium', 'value': 1.555},
    {'Date': '2007-01-01', 'Variables': 'Gasolina 98 E5 Premium', 'value': 1.681},
    {'Date': '2007-01-03', 'Variables': 'Gasolina 95 E5 Premium', 'value': 1.991},
    {'Date': '2007-01-03', 'Variables': 'Gasolina 98 E5 Premium', 'value': 1.991},
    {'Date': '2007-01-03', 'Variables': 'Gasolina 98 E5 Premium', 'value': 1.991}
])

test_a = pyjstat.Dataset.read(df_some)
test_a_js = test_a.write()
test_b = pyjstat.Dataset.read(test_a_js)
test_b_df = test_b.write('dataframe')  # fails
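
One thing worth noting about this test frame: the last two rows share the same Date and Variables, so the frame has 5 rows for only 4 distinct dimension combinations. Whether that is what breaks the round trip is not confirmed here. The sketch below uses only pandas to surface such rows; `duplicated_cells` is a hypothetical helper, not a pyjstat function.

```python
import pandas as pd


def duplicated_cells(df, dim_names):
    """Return the rows whose dimension combination occurs more than once."""
    mask = df.duplicated(subset=dim_names, keep=False)
    return df[mask]


# Same data as the test case above.
df_some = pd.DataFrame([
    {'Date': '2007-01-01', 'Variables': 'Gasolina 95 E5 Premium', 'value': 1.555},
    {'Date': '2007-01-01', 'Variables': 'Gasolina 98 E5 Premium', 'value': 1.681},
    {'Date': '2007-01-03', 'Variables': 'Gasolina 95 E5 Premium', 'value': 1.991},
    {'Date': '2007-01-03', 'Variables': 'Gasolina 98 E5 Premium', 'value': 1.991},
    {'Date': '2007-01-03', 'Variables': 'Gasolina 98 E5 Premium', 'value': 1.991},
])

dims = ['Date', 'Variables']
dupes = duplicated_cells(df_some, dims)
print(dupes)

# Keep the first row for each dimension combination.
deduped = df_some.drop_duplicates(subset=dims)
print(len(df_some), len(deduped))
```

Retrying the round trip on `deduped` would show whether the repeated combination is involved in the failure.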