datahq / dataflows

DataFlows is a simple, intuitive, lightweight framework for building data processing flows in Python.
https://dataflows.org
MIT License

Error when dumping json files #150

Closed: gperonato closed this issue 3 years ago

gperonato commented 3 years ago

Dumping a resource to JSON fails with this error: TypeError: identity() takes 1 positional argument but 2 were given

gperonato commented 3 years ago

Provisional fix in #151

akariv commented 3 years ago

@gperonato could you please share the full stack trace and what you were doing at the time? Looking at the code, I can't really tell where 2 arguments might have been passed to identity()

gperonato commented 3 years ago

This is the stack trace I get when I run the sample academy_csv.py with dump_to_path('academy_csv', format="json"):

python academy_csv.py
academy:
ERROR:root:Failed to transform row {'Year': '1927/1928', 'Ceremony': 1, 'Award': 'Actor', 'Winner': None, 'Name': 'Richard Barthelmess', 'Film': 'The Noose'}
Traceback (most recent call last):
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 56, in __transform_row
    return dict((k, self.__transform_value(v, self.fields[k]))
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 56, in <genexpr>
    return dict((k, self.__transform_value(v, self.fields[k]))
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 69, in __transform_value
    return field.descriptor['serializer'](value)
TypeError: identity() takes 1 positional argument but 2 were given
Traceback (most recent call last):
  File "dataflows/dataflows/base/datastream_processor.py", line 108, in safe_process
    collections.deque(res, maxlen=0)
  File "dataflows/dataflows/processors/dumpers/dumper_base.py", line 69, in row_counter
    for row in iterator:
  File "dataflows/dataflows/processors/dumpers/file_dumper.py", line 78, in rows_processor
    writer.write_row(row)
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 75, in write_row
    transformed_row = self.__transform_row(row)
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 56, in __transform_row
    return dict((k, self.__transform_value(v, self.fields[k]))
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 56, in <genexpr>
    return dict((k, self.__transform_value(v, self.fields[k]))
  File "dataflows/dataflows/processors/dumpers/file_formats.py", line 69, in __transform_value
    return field.descriptor['serializer'](value)
TypeError: identity() takes 1 positional argument but 2 were given

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "academy_csv.py", line 21, in <module>
    academy_csv()
  File "academy_csv.py", line 17, in academy_csv
    flow.process()
  File "dataflows/dataflows/base/flow.py", line 15, in process
    return self._chain().process()
  File "dataflows/dataflows/base/datastream_processor.py", line 117, in process
    ds, _ = self.safe_process()
  File "dataflows/dataflows/base/datastream_processor.py", line 113, in safe_process
    self.raise_exception(exception)
  File "dataflows/dataflows/base/datastream_processor.py", line 95, in raise_exception
    raise error from cause
dataflows.base.exceptions.ProcessorError: Errored in processor PathDumper in position #4: identity() takes 1 positional argument but 2 were given

akariv commented 3 years ago

Thanks @gperonato

This was fixed via https://github.com/datahq/dataflows/commit/f57d43afbe43387fa200aa29c329b9ecf31ac822
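
For readers puzzled by how a call written as serializer(value) can report that 2 arguments were given: the thread does not state the root cause, and the fix commit is not reproduced here. One general Python mechanism that yields this exact message is a plain function stored as a class attribute and then looked up through an instance, which binds it as a method and passes the instance as a hidden first argument. The sketch below only illustrates that mechanism. The class names HypotheticalFormat and FixedFormat are made up for the example and are not taken from dataflows, and this may or may not be what happened in file_formats.py.

```python
def identity(value):
    """Return the value unchanged (stand-in for a no-op serializer)."""
    return value


class HypotheticalFormat:
    # Assumption for illustration: a plain function kept as a class attribute.
    serializer = identity

    def transform(self, value):
        # Looking the function up through `self` binds it as a method, so
        # this call is effectively identity(self, value): two positional
        # arguments, although only one is written at the call site.
        return self.serializer(value)


class FixedFormat:
    # staticmethod() disables the binding, so only `value` is passed.
    serializer = staticmethod(identity)

    def transform(self, value):
        return self.serializer(value)


try:
    HypotheticalFormat().transform("Actor")
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

fixed_result = FixedFormat().transform("Actor")

print(error_message)
print(fixed_result)
```

The first class raises a TypeError of the same form as the one in the stack trace above, while the second returns the value unchanged. Storing the function in an instance attribute or in a dict (as with field.descriptor['serializer']) also avoids the binding, which is part of why the call site alone does not reveal where the second argument comes from.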